Episode 24 — Place OT Workloads in Cloud and Edge: Public, Private, Hybrid, and Vendor Services

In this episode, we’re going to unpack what it really means when people say they want to put O T workloads in the cloud or at the edge, and why those choices can be both helpful and risky at the same time. For a brand-new learner, it can feel like the cloud is just a faraway computer where things run, and the edge is just a nearby computer where things run, so why all the fuss? The fuss exists because O T systems are connected to real processes that demand reliable timing, stable operation, and strong safety expectations, and moving workloads changes the way those expectations are met. You might move a workload to reduce maintenance effort, improve visibility across sites, or use vendor services that promise faster updates and better analytics. You might also move a workload because hardware is aging and replacement feels easier if it becomes a managed service. But every move changes trust boundaries, failure modes, and recovery options, and it can also change who has access to what data and what control paths exist. By the end, you should be able to describe public, private, and hybrid cloud choices, explain what edge computing is doing in O T, and evaluate common tradeoffs without needing to memorize specific providers.

Before we continue, a quick note: this audio course is a companion to our two course books. The first book is about the exam and explains in detail how best to pass it. The second book is a Kindle-only eBook that contains 1,000 flashcards that can be used on your mobile device or Kindle. Check them both out at Cyber Author dot me, in the Bare Metal Study Guides Series.

A good way to begin is to define what we mean by an O T workload, because we are not usually talking about pushing the control loop itself into a distant data center. In many environments, the most time-sensitive control functions stay close to the equipment, because even small communication delays can affect stability. The workloads that tend to move are supporting functions, like historian storage, reporting dashboards, condition monitoring analytics, fleet-wide visibility, patch and asset management systems, remote access gateways, and data pipelines that collect and normalize information from many sites. Some of these are close to operations, but not part of the immediate safety-critical loop. That distinction is important because it shapes what risks are acceptable. If you move something that supports decision-making, you might tolerate brief outages or delayed data as long as operations can continue safely. If you move something that is part of control, delays and outages can turn into process disruptions quickly. So when you hear “move O T to cloud,” a beginner-safe translation is “move certain supporting functions and data services, while keeping the most time-critical control as local as necessary.”

Now let’s define the major cloud models in plain language. A public cloud is a shared infrastructure environment operated by a provider, where many customers use the same physical resources but are separated logically. A private cloud is a cloud-like environment dedicated to a single organization, which might be hosted in that organization’s data center or by a provider, but it is not shared with unrelated customers in the same way. A hybrid cloud combines both, often keeping certain workloads on private infrastructure while using public cloud services for other workloads, and connecting them through secure networking. In O T, hybrid is common because organizations want some of the flexibility of public cloud analytics or managed services, but they also want local control for sensitive systems and predictable operations. These are not just business decisions, because they affect how identity, logging, security controls, and incident response work. A beginner should focus on the idea that each model places workloads in a different trust and dependency environment, and the environment influences both security and reliability.

Edge computing is the other key piece, and it is best understood as computing that happens near where data is produced or where actions are taken. In O T, the edge might be a small industrial computer near equipment, a server in a plant data center, or a site-level platform that collects data locally and then forwards summaries or selected data to a central location. The reason edge computing exists is that many industrial networks have limited bandwidth, strict latency requirements, and intermittent connectivity to the outside world. Edge systems can preprocess data, filter noise, perform local analytics, and keep operations running even when the link to a central system is unavailable. For example, an edge node might collect sensor readings, calculate health indicators, and store local history so that operators can still see trends even if the wide-area connection is down. From a security perspective, edge computing can reduce exposure by keeping sensitive data local and minimizing what leaves the site. But it also creates more distributed assets that must be managed and protected, which can be challenging when sites are remote and staffed differently.
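To make the edge node example concrete, here is a minimal sketch of summarizing readings locally and holding the summaries until the wide-area link returns. The class name `EdgeBuffer`, the `send` callback, and the `link_up` flag are all hypothetical names chosen for illustration, not part of any particular product.

```python
from collections import deque
from statistics import mean


class EdgeBuffer:
    """Hypothetical edge node: summarizes readings and queues them while the uplink is down."""

    def __init__(self, max_pending=1000):
        # Bounded queue: when full, the oldest summary is dropped rather than filling storage.
        self.pending = deque(maxlen=max_pending)

    def summarize(self, readings):
        # Reduce a batch of raw readings to a small health summary.
        return {
            "count": len(readings),
            "min": min(readings),
            "max": max(readings),
            "mean": mean(readings),
        }

    def record(self, readings):
        # Always store locally first, so local history survives a link outage.
        self.pending.append(self.summarize(readings))

    def flush(self, send, link_up):
        # Forward queued summaries only when the link is available.
        sent = 0
        while link_up and self.pending:
            send(self.pending.popleft())
            sent += 1
        return sent
```

The bounded queue is one possible design choice: it trades the loss of the oldest summaries for protection against the storage exhaustion that long outages could otherwise cause.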

One of the most important tradeoffs when placing workloads in cloud or edge is latency, which is the time it takes for data to travel and for decisions to return. In security discussions, people often focus on confidentiality and access, but in O T, latency is part of safety and reliability because time matters. If a monitoring system depends on cloud processing and the connection slows down, alarms might arrive late, and late alarms can lead to late reactions. That does not mean cloud is wrong, but it means the architecture must respect which functions can tolerate delay and which cannot. A second related concept is bandwidth, because raw industrial data can be extremely large, and sending everything to the cloud is often unrealistic or expensive. That is why edge filtering and summarization are popular: the edge can decide what is worth sending, which also limits data exposure. Beginners should remember that cloud and edge choices are not only about cost or convenience; they are also about physics, because networks have real limits and industrial processes have real timing needs.
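One simple way an edge system can decide what is worth sending is a deadband filter, which forwards a reading only when it differs enough from the last value that was forwarded. The sketch below is an illustration under that assumption; the function name and the threshold value are hypothetical, and real deployments would tune the threshold per signal.

```python
def deadband_filter(readings, threshold):
    """Return only the readings that moved at least `threshold` from the last forwarded value."""
    forwarded = []
    last = None
    for value in readings:
        # Always forward the first reading; afterwards forward only significant changes.
        if last is None or abs(value - last) >= threshold:
            forwarded.append(value)
            last = value
    return forwarded


# Six raw temperature readings reduce to three forwarded values.
raw = [20.0, 20.1, 20.2, 21.5, 21.6, 19.0]
print(deadband_filter(raw, threshold=1.0))  # [20.0, 21.5, 19.0]
```

Small fluctuations stay on site, while meaningful changes still reach the central system, which reduces both bandwidth use and the amount of data that leaves the plant.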

Another major tradeoff is availability and dependency, meaning what happens when connectivity is lost or when a provider service has an outage. In a traditional local environment, if an external connection fails, the plant may continue running because it is self-contained. When you move workloads into cloud services, you add a dependency on external connectivity and on the provider’s service health. If a cloud dashboard is unavailable, operations might still be fine, but visibility is reduced, and reduced visibility can increase risk during abnormal conditions. If a cloud-based identity system is part of authentication to critical services, an outage could prevent legitimate users from accessing tools they need during an incident. That is why architectures often include local fallback modes, cached credentials, or site-level capabilities that keep core functions running. In O T, the safest designs treat connectivity loss as a normal possibility, not a rare disaster, and they ensure operations remain safe and stable even when cloud services are temporarily unreachable.
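The local fallback idea can be sketched as follows: try the cloud identity service first, and if it is unreachable, check a locally cached, time-limited verifier instead. Everything here is an assumption made for illustration, including the `CloudUnreachable` exception, the `cloud_check` callback, the cache lifetime, and the iteration count; it is not a description of how any specific identity product behaves.

```python
import hashlib
import hmac
import os


class CloudUnreachable(Exception):
    """Hypothetical error raised when the cloud identity service cannot be contacted."""


def derive(password, salt):
    # Store a salted, slow hash rather than the password itself.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


class LocalAuthCache:
    def __init__(self, max_age_s):
        self.max_age_s = max_age_s
        self.entries = {}  # user -> (salt, verifier, cached_at)

    def remember(self, user, password, now):
        salt = os.urandom(16)
        self.entries[user] = (salt, derive(password, salt), now)

    def check(self, user, password, now):
        entry = self.entries.get(user)
        if entry is None:
            return False
        salt, verifier, cached_at = entry
        if now - cached_at > self.max_age_s:
            return False  # Cached credentials expire so stale access does not linger.
        return hmac.compare_digest(verifier, derive(password, salt))


def authenticate(user, password, cloud_check, cache, now):
    """Return (allowed, path), where path records which mechanism made the decision."""
    try:
        ok = cloud_check(user, password)
    except CloudUnreachable:
        return cache.check(user, password, now), "local-fallback"
    if ok:
        cache.remember(user, password, now)  # Refresh the cache on each successful cloud login.
    return ok, "cloud"
```

Returning the decision path alongside the result is a deliberate choice in this sketch, because fallback logins are worth logging and reviewing once connectivity returns.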

We also need to talk about identity and access control because moving workloads changes who can reach them and how access is granted. When a workload is local and isolated, access might require being on a specific network segment or physically present in a control room. When a workload is cloud-hosted, access is often remote by design, which can be a benefit for support and centralized management, but it increases the importance of strong authentication and tightly scoped permissions. This is where Multi-Factor Authentication (M F A) becomes a meaningful concept, because it reduces the risk that a single stolen password grants broad access. But it is not just about adding another step, because in O T the access process must still work during stressful incidents and in environments with constrained connectivity. So identity systems must balance security strength with operational usability, including clear role definitions and emergency procedures that do not undermine control. Beginners should keep in mind that cloud moves often turn access into a centralized policy problem, and centralized policy can be powerful, but mistakes can scale quickly.

Vendor services are a special case that deserves attention because many O T cloud and edge projects are delivered as vendor-managed platforms. A vendor might provide an edge appliance that collects data and then sends it to the vendor’s cloud for analytics and dashboards. The vendor might provide remote monitoring, managed detection, or predictive maintenance as a service. These services can be attractive because they reduce the burden on local teams and provide specialized expertise. The security tradeoff is that you are extending trust beyond your organization and creating a dependency on the vendor’s controls, processes, and incident response. You also have to think about data ownership, data retention, and who can access the data inside the vendor environment. Even if a vendor is trustworthy, mistakes can happen, and an attacker might target the vendor because compromising one vendor platform could provide access to many customers. For beginners, the key is to treat vendor services as part of your environment’s risk surface, not as something “outside” security because it is managed by someone else.

Hybrid architectures often try to get the best of both worlds, but they also combine the risks of both worlds unless designed carefully. A common hybrid pattern is to keep time-sensitive functions local, use edge platforms to collect and preprocess data, and send selected data to a cloud platform for fleet-wide analytics, reporting, and long-term storage. This can reduce bandwidth demands and keep critical operations independent from external connectivity while still enabling centralized insights. The security challenge is that hybrid systems have more connections, more identities, and more data flows that must be understood and monitored. Data crossing boundaries can be transformed, cached, and duplicated, which complicates integrity and confidentiality decisions. It also complicates incident response because the investigation may need logs from local systems, edge devices, cloud services, and vendor platforms, and timing differences between those log sources can make correlation difficult. Beginners should not see hybrid as inherently safer or inherently riskier, but as more complex, and complexity requires disciplined design and clear ownership.
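The log correlation problem can be illustrated with a small sketch that shifts each source onto a common reference clock before merging. It assumes the clock offset of each source is already known, which in practice is often the hard part; the function name, source labels, and offsets are hypothetical.

```python
def correlate(events_by_source, offsets_s):
    """Merge (timestamp, message) events from several sources onto one timeline.

    offsets_s maps a source name to how many seconds its clock runs ahead of
    the reference clock. Sources without an entry are assumed to be in sync.
    """
    merged = []
    for source, events in events_by_source.items():
        offset = offsets_s.get(source, 0.0)
        for timestamp, message in events:
            merged.append((timestamp - offset, source, message))
    return sorted(merged)


events = {
    "edge": [(105.0, "operator login")],
    "cloud": [(101.0, "dashboard API call")],
}
# The edge clock is assumed to run 5 seconds fast, so its event actually came first.
timeline = correlate(events, {"edge": 5.0})
print(timeline)
# [(100.0, 'edge', 'operator login'), (101.0, 'cloud', 'dashboard API call')]
```

Without the correction, the raw timestamps would suggest the cloud call happened before the login, which is the kind of misleading ordering that complicates an investigation.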

Edge devices introduce their own failure modes and security concerns because they often live in harsh environments and may not be maintained as frequently as central systems. An edge node might run continuously for long periods, and if it fills its storage or experiences hardware degradation, it can start dropping data or behaving unpredictably. If an edge device is compromised, it might become a stepping stone into the O T network, or it might manipulate the data it forwards, causing misleading analytics and poor decision-making. Physical access risks are also higher at the edge, because devices may be in cabinets, remote facilities, or shared spaces where control is weaker than in a secure data center. On the other hand, edge devices can be designed with strong isolation, limited network access, and clear update processes that reduce their risk, and they can reduce the need to open broader network pathways into critical segments. The important beginner lesson is that edge is not automatically safer because it is local, and cloud is not automatically unsafe because it is remote; security depends on design choices and operational discipline.

A beginner-friendly way to evaluate cloud and edge placement is to focus on data classification and control impact. If the workload handles highly sensitive information about the environment, such as detailed configurations, credentials, or process recipes, you need stronger safeguards and tighter access controls, and you may decide to keep certain data local or anonymize it before sending it out. If the workload can influence operations, such as issuing commands or changing configurations, you need strict boundaries, approvals, and strong authentication, and you may decide to keep control functions local while allowing cloud visibility functions. Many cloud O T deployments deliberately separate “monitoring outward” from “control inward,” meaning data can flow out to provide visibility, but control commands do not flow back in except through carefully governed channels. This separation reduces the chance that a cloud compromise turns into immediate process disruption. Beginners can think of it as a one-way window: it is safer to look out than to allow remote hands in. When remote control is necessary, it should be treated as a high-risk pathway that requires strong governance.
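The one-way window can be expressed as a small default-deny policy check. This is a sketch under stated assumptions: the `Flow` fields and the rule that inbound commands need both an approval and verified M F A are illustrative choices, not a standard or a vendor feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    direction: str          # "outbound" (site to cloud) or "inbound" (cloud to site)
    kind: str               # "telemetry" or "command"
    approved: bool = False  # hypothetical change-approval flag
    mfa_verified: bool = False


def allow(flow):
    """Default deny: only outbound telemetry and governed inbound commands pass."""
    if flow.direction == "outbound" and flow.kind == "telemetry":
        return True  # Monitoring outward is the low-risk path.
    if flow.direction == "inbound" and flow.kind == "command":
        # Control inward is the high-risk path and needs every safeguard present.
        return flow.approved and flow.mfa_verified
    return False


print(allow(Flow("outbound", "telemetry")))                                  # True
print(allow(Flow("inbound", "command")))                                     # False
print(allow(Flow("inbound", "command", approved=True, mfa_verified=True)))  # True
```

Anything the policy does not explicitly recognize is refused, so a new or unexpected flow type fails closed rather than open.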

Recovery and resilience are the final major tradeoffs to consider, because placing workloads in cloud or edge changes how you recover from incidents. In cloud environments, you may have strong capabilities for backups, redundancy, and rapid redeployment, which can improve recovery speed for certain services. But you also need plans for provider outages, misconfigurations, and account-level incidents, because losing access to a cloud account can be as disruptive as losing a physical server room. In edge environments, recovery may depend on physical replacement, spare units, and site access, which can be slower, but edge also allows local continuity when external connectivity fails. Hybrid environments need clear plans for what happens when any one layer fails, including how long the site can operate safely without cloud visibility and how to restore data flow without causing instability. For beginners, it helps to remember that resilience is not only “can we rebuild it,” but “can we keep operating safely while we rebuild it,” and cloud and edge choices should support that goal.

As we wrap up, placing O T workloads in cloud and edge environments is best understood as an architectural decision that changes trust, timing, dependencies, and recovery paths. Public cloud offers scalability and managed services but adds external dependency and requires strong identity controls and careful data governance. Private cloud can offer cloud-like management with more direct organizational control but still concentrates risk into shared platforms and management planes. Hybrid models can balance local reliability with centralized insight, but they increase complexity and require clear ownership and strong visibility across boundaries. Edge computing can preserve local stability and reduce bandwidth and exposure, but it spreads assets across sites and raises physical and lifecycle management risks. Vendor services can reduce burden and add expertise, but they extend trust beyond the organization and can create supply chain and multi-tenant exposure. If you can explain these tradeoffs in terms of latency, availability, identity, data sensitivity, control impact, and failure modes, you will be able to evaluate cloud and edge placements in O T security in a way that protects operations first and avoids both fear and hype.
