How Broken Is Your ICS, Really? A Framework for Honest Assessment
How to actually know where your system stands
2026-Insider-Volume5
⚠️ This resource is for informational and educational purposes only. It is not a substitute for engineering judgment, legal advice, or site-specific analysis. Every control system environment is different. Readers assume responsibility for adapting and applying this material thoughtfully to their contexts.
This deep dive is the companion piece to my latest video:
Why Your Industrial Control System Is Broken: Configuration, Architecture, and Change. The video covers the three root causes.
This Insider takes the next step, giving you a practical framework for assessing where your system actually stands. You can watch it on YouTube or listen to the audio version as a podcast right here on Substack. Either way, I’d suggest starting there before diving in.
Watch on YouTube →
Listen on Substack →
What’s Inside This Issue:
Why recognizing the problem and actually knowing where your system stands are two very different things
A three-part framework for assessing configuration, architecture, and change governance health, with specific observational checks for each dimension
The four most common mistakes teams make when they try to evaluate their own systems, including the one that catches experienced teams off guard
Two downloadable tools to help you work through the assessment with your team and identify where to focus first
The Problem
Most people who work in industrial controls recognize the patterns.
Configuration drift.
Fragmented architecture.
Undocumented changes piling up over years.
You’ve probably nodded along to a description of these problems more than once.
But there’s a gap between recognizing the pattern and knowing where your system actually stands.
The challenge is that these problems don’t come with obvious warning labels. They develop gradually, and that slow pace is exactly why teams stop noticing them.
You’ve adapted to the system as it is. The workarounds feel normal. The sections of code nobody touches feel normal. The alarm noise that everyone has learned to mentally filter feels normal. You’re too close to it to see it clearly.
So how do experienced practitioners actually evaluate the health of a control system? Not in a formal engineering audit sense, but in the honest, working-professional sense that helps you figure out what you’re actually dealing with and where to put your energy first.
That’s what this piece is about.
The Framework
Think of a system health assessment as asking three distinct sets of questions, one for each dimension: configuration, architecture, and change. None of these require special tools or formal methodologies. They require honest observation and the willingness to sit with uncomfortable answers.
Assessing Configuration Health
The core question here is: does your system behave consistently, and does it behave the way your documentation says it should?
A useful starting point is to walk through similar assets and compare them.
Consider ten identical pumps across a facility.
Do they have the same tag naming convention?
The same alarm limits?
The same HMI faceplate behaviour?
Or have they accumulated differences over the years from different integrators, different projects, and different programmers who each did things slightly their own way?
The more variation you find across assets that should be identical, the more configuration drift has occurred.
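If you can export settings for a group of supposedly identical assets, the comparison above can be roughed out in a few lines of Python. This is only a sketch under assumptions: the pump names, setting names, and values are made up for illustration, and it treats the most common value as the reference, which is not necessarily the correct one.

```python
from collections import Counter

# Hypothetical export: one dict of settings per pump (names and values are illustrative only).
pumps = {
    "P-101": {"hi_alarm": 85.0, "lo_alarm": 10.0, "faceplate": "PumpFP_v2"},
    "P-102": {"hi_alarm": 85.0, "lo_alarm": 10.0, "faceplate": "PumpFP_v2"},
    "P-103": {"hi_alarm": 90.0, "lo_alarm": 10.0, "faceplate": "PumpFP_v1"},
}

def find_drift(assets):
    """Return {setting: {asset: value}} for assets that differ from the most common value."""
    drift = {}
    # Collect every setting name that appears on any asset.
    settings = sorted({key for cfg in assets.values() for key in cfg})
    for setting in settings:
        values = {name: cfg.get(setting) for name, cfg in assets.items()}
        # Assumption: the most common value stands in for the intended one.
        most_common, _ = Counter(values.values()).most_common(1)[0]
        outliers = {name: v for name, v in values.items() if v != most_common}
        if outliers:
            drift[setting] = outliers
    return drift

drift_report = find_drift(pumps)
for setting, outliers in drift_report.items():
    print(setting, outliers)
```

The output is a starting list for a conversation, not a verdict. Some differences will turn out to be deliberate, and the point of the exercise is finding out which ones nobody can explain.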
Ask your team what the current alarm philosophy is.
Can anyone produce a document that describes it?
Does that document match how alarms are actually configured today?
This gap, between documented intent and actual configuration, is one of the most revealing things you can discover.
If the philosophy document says alarms should be set to reflect process risk, but the actual limits were entered during commissioning and never reviewed, you’re looking at drift that has accumulated one assumption at a time.
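The gap between documented intent and actual configuration can be made concrete with a simple comparison. The sketch below assumes you have two tables to work from: limits as documented and limits as exported from the running system. The tag names, values, and the compare_limits helper are all hypothetical.

```python
# Hypothetical: limits as stated in the alarm documentation.
documented = {"TI-201.HI": 120.0, "PI-305.HI": 8.5, "LI-410.LO": 15.0}
# Hypothetical: limits exported from the running configuration.
configured = {"TI-201.HI": 120.0, "PI-305.HI": 9.2, "FI-512.HI": 300.0}

def compare_limits(documented, configured, tolerance=1e-6):
    """Split alarm limits into mismatched, undocumented, and not-configured groups."""
    shared = documented.keys() & configured.keys()
    mismatched = {
        tag: (documented[tag], configured[tag])
        for tag in shared
        if abs(documented[tag] - configured[tag]) > tolerance
    }
    return {
        # Documented and configured, but the values disagree.
        "mismatched": mismatched,
        # Configured in the system with no documented basis.
        "undocumented": sorted(configured.keys() - documented.keys()),
        # Documented, but absent from the running configuration.
        "not_configured": sorted(documented.keys() - configured.keys()),
    }

gap_report = compare_limits(documented, configured)
for category, items in gap_report.items():
    print(category, items)
```

Each of the three groups tells a slightly different story: a mismatch suggests a change nobody wrote down, an undocumented alarm suggests a project that skipped the philosophy, and a documented-but-missing alarm suggests intent that never made it into the system.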
Look at user accounts and permissions.
Who has access to what?
When was that last reviewed?
In many systems, permissions accumulate across years of personnel changes without regular cleanup. That’s both a security signal and a governance signal.
It tells you something about whether anyone is actively managing the system’s configuration as a whole or whether it’s just growing in whatever direction individual projects push it.
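A quick way to start the permissions review is to list accounts that have not been used in a long time. This is a minimal sketch, assuming you can export accounts with a last-login date. The account names, roles, dates, and the 180-day threshold are illustrative assumptions, not recommendations.

```python
from datetime import date

# Hypothetical account export (users, roles, and dates are illustrative only).
accounts = [
    {"user": "jsmith", "role": "engineer", "last_login": date(2025, 11, 3)},
    {"user": "contractor7", "role": "admin", "last_login": date(2022, 4, 18)},
    {"user": "operator_b", "role": "operator", "last_login": date(2025, 12, 20)},
]

def stale_accounts(accounts, as_of, max_idle_days=180):
    """Return (user, role, idle_days) for accounts idle longer than max_idle_days."""
    flagged = []
    for acct in accounts:
        idle_days = (as_of - acct["last_login"]).days
        if idle_days > max_idle_days:
            flagged.append((acct["user"], acct["role"], idle_days))
    # Longest-idle accounts first, so the oldest leftovers surface at the top.
    return sorted(flagged, key=lambda row: row[2], reverse=True)

review_list = stale_accounts(accounts, as_of=date(2026, 1, 15))
for user, role, idle_days in review_list:
    print(f"{user} ({role}): idle {idle_days} days")
```

An idle account with elevated permissions is the kind of finding worth raising first, because it speaks to both the security signal and the governance signal at once.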
Also look at HMI consistency.
Do screens from different eras of the facility feel like they belong to the same system?
Or can you tell, just by looking, which parts were built by different contractors in different years?
Inconsistency in the human interface often reflects inconsistency in the logic behind it.
None of these checks require taking anything offline. They’re observational. But together, they tell you a lot about whether your configuration is something you can actually trust as a stable baseline.


