Episode 53 — Conduct Architecture Reviews for OT Risk: Data Flows, Trust Boundaries, and Weak Links

In this episode, we’re going to treat architecture reviews as one of the most practical ways to reduce risk in Operational Technology (O T) without turning the environment into a constant project. Many beginners imagine security as a set of rules you apply to devices one by one, like adding a lock to every door in a city, but architecture is more like deciding where the city has gates, where the roads go, and which neighborhoods should never be directly connected. In O T, architecture is the shape of the environment, including how systems talk to each other, where people and vendors connect in, and how changes move from engineering tools into running processes. An architecture review is not about finding one perfect design; it is about seeing how risk could travel through your design and then adjusting the design so risk has fewer paths and smaller impact. This matters because many O T incidents do not happen because a single device was weak, but because the environment allowed a small problem to spread into a big one. When you learn to review architecture thoughtfully, you learn to see risk as movement, boundaries, and weak links.

Before we continue, a quick note: this audio course is a companion to our two course books. The first book is about the exam and provides detailed information on how best to pass it. The second book is a Kindle-only eBook that contains 1,000 flashcards that can be used on your mobile device or Kindle. Check them both out at Cyber Author dot me, in the Bare Metal Study Guides Series.

A helpful way to define an architecture review is to say it is a structured look at how the system is built and how it behaves, especially under abnormal conditions. You are looking at components, connections, and responsibilities, but you are also looking at assumptions, such as which systems are trusted, which connections are allowed, and which actions are considered normal. In O T, architecture reviews must respect safety and uptime requirements, which means you are not trying to redesign everything at once, and you are not trying to enforce office-style patterns that ignore operational constraints. Instead, you are trying to identify the few design choices that drive most of the risk, like uncontrolled remote access, flat networks, shared engineering tools across zones, or unclear data flow directions. Beginners sometimes expect architecture reviews to be highly technical and full of diagrams, but the most important part is actually the questions you ask about how things connect and what could happen if something goes wrong. Good reviews focus on how a compromise or failure could move from one area to another, because that movement determines both likelihood and consequence. Once you can describe movement, you can decide where to place boundaries and controls that are realistic.

To do this well, you have to understand data flows, which are simply the paths that information takes between systems. In O T, data flows can include control signals between controllers and field devices, status and alarm data to operator stations, configuration changes from engineering workstations to controllers, and reporting data from a historian to business analytics. Data flows can be one-way or two-way, and they can be continuous or occasional, like during maintenance windows. Beginners often think of data as something passive, but in O T, data can cause action, because a command, a setpoint, or a logic update is data that changes physical behavior. That means data flow design is directly tied to risk, because the more pathways exist for commands and changes, the more ways exist for something to go wrong. A strong architecture review asks what data must flow for operations to work, and then asks what data should never flow, like commands flowing backward from business systems into control zones. When you make data flows explicit, you can align them with controls rather than letting them exist by accident.
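For listeners following along in text, here is a minimal sketch of what making data flows explicit could look like. It assumes a hypothetical allowlist of flows between zones; the zone names, flow kinds, and the `Flow` and `review` names are illustrative only and do not come from any standard or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One data flow between two zones (all names here are hypothetical)."""
    source_zone: str
    dest_zone: str
    kind: str  # for example "telemetry", "command", or "config"


# Assumed allowlist: the flows operations actually needs, written down on purpose.
ALLOWED = {
    Flow("control", "supervisory", "telemetry"),
    Flow("supervisory", "control", "command"),
    Flow("engineering", "control", "config"),
    Flow("supervisory", "business", "telemetry"),
}


def review(observed):
    """Return the observed flows that are not on the allowlist."""
    return [flow for flow in observed if flow not in ALLOWED]


# Illustrative observations, not real data.
observed = [
    Flow("control", "supervisory", "telemetry"),
    Flow("business", "control", "command"),  # a command flowing backward
]

unexpected = review(observed)
for flow in unexpected:
    print(f"unexpected flow: {flow.kind} from {flow.source_zone} to {flow.dest_zone}")
```

The point of the sketch is the habit, not the tooling: anything observed that is not on the list is either a missing entry or a pathway that exists by accident.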

Trust boundaries are the next key idea, because trust is the assumption that something will behave as expected, and boundaries are where that assumption should change. In O T, a trust boundary might exist between business networks and control networks, between a vendor access zone and a safety-related zone, or between an operator station area and engineering tools that can modify logic. Beginners sometimes assume internal equals trusted and external equals untrusted, but modern environments make that simplistic view unsafe, because internal systems can be compromised and external partners can be legitimate. A trust boundary is therefore not a moral judgment; it is a control point where you enforce rules, verify identity, and monitor activity because the trust assumption could be wrong. In architecture reviews, trust boundaries matter because they are the places where you can reduce risk most efficiently, by limiting what crosses the boundary and by requiring extra assurance when something must cross. If the boundary is poorly defined, you get unintended pathways, like shared services that bridge zones or remote sessions that land too deep inside sensitive areas. A review that identifies trust boundaries clearly creates a map of where defensive effort should be concentrated.

Once you see data flows and trust boundaries, you can start looking for weak links, which are design choices that allow disproportionate harm if they fail. A weak link might be a single remote access gateway that reaches many critical systems, a shared engineering workstation that touches multiple zones, or a flat network where compromise of one device can lead to access everywhere. Weak links are dangerous in O T because they increase blast radius, meaning the potential spread of an incident, and they also increase complexity, meaning people struggle to respond quickly and safely. Beginners sometimes equate weak links with old hardware, but weak links are more often about connectivity and privilege, not age. A modern system can be a weak link if it has broad access and weak oversight, while an older system can be relatively safe if it is well isolated and its interfaces are tightly controlled. Architecture reviews aim to find weak links and then decide how to strengthen them, often by reducing unnecessary connectivity, reducing privileges, adding monitoring, or adding redundancy. The key is to look for places where the environment depends on one fragile component or one fragile assumption. When you find those, you have found the spots where architecture changes can deliver major risk reduction.
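One rough way to put a number on blast radius is to treat the environment as a directed graph of who can reach whom and count what is reachable from each node. The sketch below assumes a made-up reachability map; the asset names and the `blast_radius` function are hypothetical, and real reach would depend on firewall rules, credentials, and protocols that this simple model ignores.

```python
from collections import deque

# Assumed reachability: each key can initiate connections to the listed assets.
REACH = {
    "remote_gateway": ["eng_workstation", "historian"],
    "eng_workstation": ["plc_a", "plc_b", "safety_controller"],
    "historian": ["hmi"],
    "hmi": ["plc_a"],
    "plc_a": [],
    "plc_b": [],
    "safety_controller": [],
}


def blast_radius(graph, start):
    """Return every asset reachable from start, directly or through other assets."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)  # the starting asset is not part of its own radius
    return seen


# Rank assets by how much they can reach; the top entries are weak-link candidates.
ranked = sorted(REACH, key=lambda name: len(blast_radius(REACH, name)), reverse=True)
for name in ranked:
    print(name, len(blast_radius(REACH, name)))
```

In this toy map the remote gateway ranks first because it reaches everything else, which matches the idea that weak links are about connectivity and privilege rather than age.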

A practical architecture review also considers zones of similar purpose and risk, even if you do not use formal terminology. In O T, it is common to separate areas like the business interface area, the supervisory control area, the control area, and the safety-related area, because each area has different risk tolerance and different needs. The point is not to force every site into the same diagram, but to group assets based on function and consequence so controls can be applied consistently. If critical controllers and non-critical workstations share the same zone and talk freely, the architecture makes it too easy for issues to move into critical paths. If a vendor support pathway reaches both critical and non-critical systems without distinction, the architecture assumes that vendor access is always safe, which is not a defensible assumption. Beginners often think segmentation is a single product or a single rule set, but segmentation is an architectural choice about where boundaries exist and what is allowed across them. In a review, you ask whether the zones reflect reality, whether conduits between zones are controlled, and whether the rules match the minimum data flow needed. When zones are clear and consistent, both security and operations become more predictable.

Remote access is one of the most common areas where architecture reviews find risk, because remote access often grows over time without a clean design plan. O T environments may have remote access for vendors, for centralized engineering, for monitoring, or for management, and each pathway can land in different places and carry different privileges. The architecture question is not only whether remote access exists, but where it terminates and what it can reach from that termination point. If remote access lands on an engineering workstation inside a sensitive zone, the pathway inherits the workstation’s reach, which can be very broad. If remote access lands in a controlled boundary zone and then requires additional approvals or session controls to reach deeper systems, the architecture reduces exposure while still supporting operations. Beginners sometimes focus on authentication alone, but architecture reviews look beyond authentication to lateral movement, meaning what else becomes reachable once someone is inside. You also consider how monitoring works, because a remote session that is not observable creates an investigation gap and encourages unsafe assumptions during response. A strong design makes remote access visible, limited, and time-bound, not sprawling and permanent.

Engineering tools and management planes deserve special attention because they often have the power to change how the process behaves. An engineering workstation, a configuration management server, or a central management console might be used only occasionally, but when it is used, it can push changes to many devices. That makes these tools high impact, and in architecture terms, it makes them critical nodes with broad reach. A review should ask where these tools live, which zones they can reach, how credentials are managed, and how changes are documented and verified. Beginners often imagine that the most dangerous systems are the controllers, but controllers are frequently manipulated through engineering tools, which means protecting the tool can protect many controllers at once. You also consider separation of duties, meaning whether the same access path is used for everyday monitoring and for privileged configuration, because mixing those functions can turn minor access into major control. Monitoring for configuration changes is part of architecture too, because it determines whether the environment can detect integrity problems quickly. When the management plane is designed with clear boundaries, the environment becomes less fragile and more explainable.

Data historians and reporting paths are another common source of confusion, because they sit between O T and business needs. A historian often collects process data from control systems and then shares it with reporting and analytics systems, and that sharing can be valuable for performance and planning. The risk arises when the architecture unintentionally creates a reverse pathway, where business systems can influence control systems, or when the historian becomes a bridge that carries more than data, such as credentials, management access, or shared services. Architecture reviews ask about directionality, meaning which direction data is intended to flow, and they ask about enforceability, meaning whether the design actually prevents unintended reverse flow. Beginners sometimes assume that if a connection exists, it is automatically bidirectional, but architectures can be designed to enforce one-way behavior where appropriate. You also consider what happens if the historian is compromised, because it often has broad visibility and can be a launching point if not isolated. A good review identifies which data flows are essential, which are optional, and which should be redesigned to reduce exposure. When reporting paths are cleanly bounded, the organization can gain business insight without expanding the threat surface unnecessarily.
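The directionality question can be sketched as a small check: declare which links are meant to be one-way, then look for observed traffic going the other way. Everything here is an assumption for illustration, including the system names, the `ONE_WAY` set, and the `reverse_flows` function. A check like this only detects reverse flow after the fact; it does not enforce anything, which is the separate enforceability question.

```python
# Assumed intent: data moves from the first system to the second, never back.
ONE_WAY = {
    ("control", "historian"),
    ("historian", "business_analytics"),
}


def reverse_flows(observed):
    """Return observed (source, dest) pairs that run against a declared one-way link."""
    return [
        (src, dst)
        for (src, dst) in observed
        if (dst, src) in ONE_WAY and (src, dst) not in ONE_WAY
    ]


# Illustrative observations, not real traffic.
observed_pairs = [
    ("control", "historian"),
    ("historian", "business_analytics"),
    ("business_analytics", "historian"),  # runs against the declared direction
]

violations = reverse_flows(observed_pairs)
print(violations)
```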

Identity and authentication can feel like a separate topic, but they are deeply architectural in O T because identity systems often span many zones. If a single authentication service supports both business and O T access, compromise of that service can become a widespread access problem. If shared accounts exist because identity integration is hard, those shared accounts become architectural weak links because they remove accountability and enable broad misuse. An architecture review looks at where identity lives, how it is connected, and what happens if identity services are unavailable during an incident. Beginners often think stronger authentication always reduces risk, but if authentication design is fragile, it can increase consequence by blocking legitimate response during emergencies. That is why architecture reviews consider both security and resilience, such as whether critical access paths have safe fallback methods that are controlled and auditable. You also consider how third parties authenticate, because vendor identity practices can become part of your architecture when they connect into your environment. The goal is to design identity so it supports least privilege, accountability, and reliable operations, not just login prompts. When identity is designed as part of architecture, access control stops being a patchwork of local decisions.

Another important part of architecture reviews is understanding the difference between normal operation flows and maintenance flows. Many O T environments have reasonably controlled day-to-day operations, but during maintenance windows, upgrades, troubleshooting, and commissioning, extra connections appear, temporary access is granted, and unusual tools are introduced. Those maintenance flows can become the real threat surface if they are not designed and controlled. A review should therefore ask how maintenance is performed, where laptops connect, how files move, how vendor sessions are initiated, and how temporary changes are removed afterward. Beginners sometimes assume risk is constant, but in many environments risk spikes during change activity, because the environment is more open and people are under time pressure. Architecture can reduce these spikes by creating dedicated maintenance zones, dedicated transfer methods, and clear entry points that remain controlled even when work is urgent. You also consider how evidence is captured during maintenance, because knowing what changed is critical for safe recovery. When maintenance flows are built into architecture, the environment stays predictable even during abnormal work. That predictability reduces both cyber likelihood and operational chaos.

Architecture reviews should also include resilience questions, because a design that is secure but unrecoverable is not safe in O T. You want to know where backups exist, what configurations are critical to restore, and what dependencies must be available to bring systems back safely. You also want to identify single points of failure in network paths, management tools, and shared services, because those single points can turn a small incident into a prolonged outage. Beginners sometimes treat backups as an operational detail, but recovery capability is a risk control because it reduces consequence. In architecture terms, recovery depends on where backups are stored, how restores are performed, and how you verify that restored systems are correct, not just running. A review considers whether recovery assets are protected from the same failures that could affect production, because storing recovery tools in the same fragile zone can be a design weakness. It also considers whether incident response requires coordination across boundaries, and whether those boundaries allow safe isolation without breaking critical safety functions. When resilience is designed into architecture, the organization gains options during incidents, and options are what prevent panic-driven decisions.
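Single points of failure in network paths can be explored with a simple what-if: remove one component at a time and see whether two endpoints can still reach each other. The sketch below assumes a tiny, invented topology with undirected links; the device names and the `connected` and `single_points_of_failure` functions are hypothetical, and a real review would also cover shared services and management tools, not only network paths.

```python
from collections import deque

# Assumed topology: each device lists the neighbors it is directly linked to.
LINKS = {
    "hmi": {"switch_a", "switch_b"},
    "switch_a": {"hmi", "firewall"},
    "switch_b": {"hmi", "firewall"},
    "firewall": {"switch_a", "switch_b", "plc"},
    "plc": {"firewall"},
}


def connected(links, src, dst, removed=None):
    """Return True if dst is reachable from src when the removed device is ignored."""
    if removed in (src, dst):
        return False
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in links[node]:
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_points_of_failure(links, src, dst):
    """Return devices whose loss alone disconnects src from dst."""
    return sorted(
        name
        for name in links
        if name not in (src, dst) and not connected(links, src, dst, removed=name)
    )


spofs = single_points_of_failure(LINKS, "hmi", "plc")
print(spofs)
```

In this invented layout the two switches back each other up, so only the firewall shows up as a single point of failure between the operator station and the controller.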

Finally, a strong architecture review ends with a small number of clear, defensible improvements that reduce risk by changing how the environment is shaped. You might reduce unnecessary connectivity, tighten a boundary where trust shifts, redesign a remote access landing point, separate management functions from monitoring functions, or clarify data flow directionality so business needs do not create control risk. The most important idea for beginners is that architecture changes often produce the biggest risk reduction because they change many scenarios at once, rather than fixing one vulnerability on one device. Architecture reviews also improve communication, because they give security, engineering, and operations a shared picture of how systems interact and where risk can move. When that picture is clear, controls become less about arguing and more about aligning the design with safety, reliability, and realistic constraints. Over time, repeated architecture reviews help the environment stay coherent as it evolves, preventing slow drift into uncontrolled pathways and hidden weak links. Conducting architecture reviews for O T risk is therefore not a one-time project, but a disciplined habit of keeping data flows intentional, trust boundaries enforceable, and weak links strengthened before they become failure points.
