Why ICS Troubleshooting Goes Wrong Under Pressure (And What to Do in the First Two Minutes)
The Troubleshooting Habit That Pressure Kills First
2026-Snapshot-Volume7
This newsletter is for educational purposes only. The frameworks and thinking approaches described here are starting points for developing professional judgment, not prescriptive procedures for any specific facility or application.
Knowing the Framework Isn’t the Hard Part
If you read my last LinkedIn post, you already know the mental chain: start with the process, work through instrumentation, control logic, and networks, and always ask what changed recently. The logic is straightforward. Most experienced practitioners nod immediately when they hear it described. They’ve seen it work. Some version of it is already in their heads.
So why do industrial problems still take days to solve when they should take hours?
Not because people are unaware of the chain. Because pressure makes them abandon it.
The Insight
Troubleshooting in a real facility isn’t a quiet, methodical exercise.
There’s usually someone asking for a status update.
A process that’s been down for two hours.
A manager who wants to know what’s being done.
In that environment, the mental chain competes with a set of very human forces that pull strongly in the other direction.
The first is what I think of as the urgency trap. When people are watching, you need to be visibly doing something.
Thinking in layers, asking questions, ruling things out methodically: none of that looks like action to someone standing in the doorway. What looks like action is hands on equipment.
So that’s where people go, often before they’ve figured out which layer they’re actually in.
The second is the expertise trap. Everyone on a troubleshooting team gravitates toward the layer they know best.
The controls engineer opens the PLC.
The instrument tech reaches for the calibration kit.
The network administrator checks the switch.
Each of them is doing competent, legitimate work. But if the problem is in a different layer than the one they started in, that competent work just wastes time.
The third is the sunk cost trap. Once you’ve spent two hours in a layer, admitting it’s the wrong one feels like failure.
So instead of stepping back and asking whether you’re in the right place, the tendency is to go deeper.
Try one more thing.
Check one more parameter.
The investment of time creates pressure to justify itself, even when the evidence is pointing somewhere else.
None of these are signs of incompetent engineers. They’re signs of humans under pressure.
The mental chain doesn’t fail because people don’t understand it. It fails because the conditions of real troubleshooting work against it.
The Quick Win
Before the next troubleshooting session, try adding a two-minute step at the very beginning, before anyone touches anything.
Write down three things.
First, which layer the symptom is showing up in.
Second, which layer you’re planning to investigate first.
Third, why those two answers are the same or different.
That third question is the one that matters. If you can’t answer it clearly, you’re following instinct rather than the chain. And instinct almost always points toward the familiar layer, not the right one.
This process doesn’t have to be formal.
A whiteboard, a notepad, or a shared screen will do.
The act of writing it down, even briefly, creates a pause between the symptom and the response. That pause is where effective troubleshooting lives. It’s also where the urgency trap, the expertise trap, and the sunk cost trap are easiest to catch before they’ve already cost you hours.
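For teams that already keep troubleshooting notes in a script or shared tool, the three questions could be captured in something like the minimal Python sketch below. This is only an illustration: the names PreCheck, LAYERS, and problems are hypothetical, and the layer list is simply the chain described in this newsletter. Adapt it to your own facility's vocabulary.

```python
from dataclasses import dataclass
from typing import List

# Assumption: layer names taken from the mental chain in this newsletter.
LAYERS = ("process", "instrumentation", "control logic", "network")


@dataclass
class PreCheck:
    """Hypothetical record of the two-minute step, filled in before anyone touches anything."""
    symptom_layer: str  # which layer the symptom is showing up in
    first_layer: str    # which layer the team plans to investigate first
    reason: str         # why those two answers are the same or different

    def problems(self) -> List[str]:
        """Return a list of gaps in the pre-check; an empty list means it is complete."""
        issues = []
        for name, value in (("symptom_layer", self.symptom_layer),
                            ("first_layer", self.first_layer)):
            if value not in LAYERS:
                issues.append(f"{name} should be one of {LAYERS}")
        # The third question is the one that matters: an empty answer
        # suggests the plan is following instinct rather than the chain.
        if not self.reason.strip():
            issues.append("no reason given for the choice of first layer")
        return issues


if __name__ == "__main__":
    check = PreCheck(symptom_layer="instrumentation",
                     first_layer="control logic",
                     reason="")
    for issue in check.problems():
        print(issue)
```

The sketch deliberately does not judge whether the reason is a good one. Its only job is to force the pause: if the third field is blank, that gets flagged before work starts.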
From the Field
I was called in to help with a flow transmitter that had been drifting for weeks. The team had already replaced the sensor twice and was preparing to pull the wiring and start over. They were frustrated. They’d been at it for a long time, and each thing they tried made logical sense. But nothing had worked.
Before anyone reached for another work order, I asked if we could spend twenty minutes walking through the layers together. There was some reluctance. They’d already been through the process. But we slowed down anyway.
The process was stable.
The new sensor was reading accurately against a backup measurement.
The control logic hadn’t changed.
And then, when we got to the question of what else had changed recently, one of the technicians mentioned a VFD replacement nearby three weeks earlier. Almost as an aside. He didn’t think it was relevant.
It was entirely relevant. The old drive had been configured to minimize electrical noise on nearby signal wiring.
The replacement came in with factory defaults. The noise was coupling onto the signal cables and causing the drift, but only during certain load conditions that happened to align with specific shifts.
Nobody had connected those two things because they happened in separate work orders, managed by different crews.
What I remember most about that day isn’t the technical finding. It’s the twenty minutes of hesitation before we slowed down.
The team was experienced, capable, and genuinely trying. But they’d been in the wrong layer for weeks because the pressure to do something visible had won out over the habit of checking the layers first.
That’s the gap the mental chain is really trying to close.
Not a knowledge gap. A pressure gap.
Until next time,
Alana

