Episode 67 — Turn Telemetry Into Intelligence: Logs, Sessions, and Anomalies That Matter
When people first hear the word telemetry, they often picture a firehose of data pouring out of machines, and they assume more data automatically means more security. In Operational Technology (O T), telemetry is valuable, but only if it turns into understanding that supports safe decisions. The reality is that O T environments are full of signals, yet many of them are not collected, not retained, or not interpreted in a security-relevant way because uptime and stability have historically been the priority. Turning telemetry into intelligence means learning which signals matter, what they mean in context, and how to connect them into a story about risk. That story could be about an attacker, but it could also be about a misconfiguration, a failing device, or an operational change that creates a new exposure. For brand-new learners, the key shift is moving from raw events to meaningful patterns, because a single log line is rarely decisive on its own. What matters is the relationship between logs, sessions, timing, and deviations from normal behavior. In this lesson, we will focus on three categories that are especially practical for O T: logs that describe events, sessions that describe access and control, and anomalies that describe what is different from baseline.
Before we continue, a quick note: this audio course is a companion to our two books. The first book covers the exam and explains in detail how best to pass it. The second book is a Kindle-only eBook that contains 1,000 flashcards you can use on your mobile device or Kindle. Check them both out at Cyber Author dot me, in the Bare Metal Study Guides Series.
To build a foundation, it helps to define telemetry in plain terms: telemetry is any recorded observation about how systems and networks behave over time. In O T, telemetry might come from a firewall at the boundary, a remote access gateway, a Windows event log on an engineering workstation, a historian server, a switch mirror port feeding a monitoring sensor, or even operational records like maintenance tickets and change approvals. Beginners sometimes think telemetry must be “security logs,” but in O T, some of the most valuable clues come from operational sources, because the question is often whether something changed that should not have changed. For example, a record that a controller logic change was approved on Tuesday becomes a security clue on Wednesday if the controller changed again without any approval. Telemetry also includes what did not happen, such as the absence of expected heartbeat traffic or the absence of scheduled remote support sessions. The purpose is not to collect everything; the purpose is to capture enough of the right signals to know what is happening and to detect meaningful deviation. In O T, meaningful deviation is often about access, configuration, and process visibility, because those are the pathways to direct or indirect impact.
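To make the Tuesday and Wednesday example concrete, here is a minimal sketch of pairing operational records with observed changes. Everything in it is an assumption for illustration: the asset name, the record fields, and the dates are hypothetical, and real approvals would come from your own change management system.

```python
from datetime import datetime

# Hypothetical approval record: one change window on Tuesday, March 5, 2024.
approvals = [
    {"asset": "plc-7",
     "start": datetime(2024, 3, 5, 8, 0),
     "end": datetime(2024, 3, 5, 17, 0)},
]

# Hypothetical observed controller changes (for example, from an engineering log).
observed_changes = [
    {"asset": "plc-7", "time": datetime(2024, 3, 5, 10, 30)},  # Tuesday, inside the window
    {"asset": "plc-7", "time": datetime(2024, 3, 6, 2, 15)},   # Wednesday, no approval
]

def unapproved_changes(changes, approvals):
    """Return changes that fall outside every approval window for the same asset."""
    flagged = []
    for change in changes:
        covered = any(
            a["asset"] == change["asset"] and a["start"] <= change["time"] <= a["end"]
            for a in approvals
        )
        if not covered:
            flagged.append(change)
    return flagged

for change in unapproved_changes(observed_changes, approvals):
    print("No approval found for", change["asset"], "at", change["time"])
```

A flagged change here is a prompt to go and look for a missing record, not proof of tampering.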
Logs are the most familiar form of telemetry, but to use them well you need to understand what a log can and cannot tell you. A log is typically an event record produced by a system, describing something that happened, often with a timestamp, a source, and some metadata. In O T, high-value logs often include authentication events, privilege changes, software installation events, service start and stop events, and configuration changes on systems that control or mediate access. Network logs, such as firewall accept and deny records, can show who talked to whom, when, and how often, which is essential when you want to detect unexpected paths between I T and O T zones. Device logs, when available, can reveal restarts, firmware updates, or unexpected errors that might reflect tampering or instability. The beginner trap is to treat logs as perfect truth, when logs are actually partial truth: they show what the system chose to record, and attackers may try to avoid logging, disable logging, or blend into normal log patterns. Even without an attacker, logs may be incomplete because of retention limits or configuration gaps. So turning logs into intelligence means you combine them with context and you look for clusters of related events that form a plausible narrative.
A practical way to start turning logs into intelligence is to focus on a small set of questions that matter almost everywhere. Who accessed what, from where, and when? What changed, who initiated the change, and was it authorized? What new software or services appeared, and are they expected? What connections occurred between zones that are normally separate? These questions sound simple, but they map to real operational risk. A remote login to an engineering workstation at 2 a.m. might be normal if there is a planned maintenance window, but suspicious if there was no approved work. A firewall rule change might be normal if it was part of a documented project, but suspicious if it coincides with new outbound connections from a system that should not communicate externally. The intelligence is not the event; it is the interpretation in context, and the context often lives outside the log itself. Beginners become much more effective when they learn to pair logs with non-technical context like maintenance schedules, vendor support tickets, and change management records. That pairing transforms noise into meaning, and it also reduces false alarms that could waste time or disrupt operations.
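One of those questions, which connections occurred between zones that are normally separate, can be sketched in a few lines. This is only an illustration under stated assumptions: the subnets, the zone names, and the list of allowed zone pairs are hypothetical, and the log records are simplified to a source and destination address.

```python
import ipaddress

# Hypothetical zone layout; real subnets come from your own network documentation.
ZONES = {
    "it": ipaddress.ip_network("10.10.0.0/16"),
    "dmz": ipaddress.ip_network("10.20.0.0/24"),
    "ot": ipaddress.ip_network("10.30.0.0/16"),
}

# Zone pairs that are expected to communicate (source zone, destination zone).
ALLOWED_PAIRS = {("it", "dmz"), ("dmz", "ot")}

def zone_of(ip):
    """Map an IP address string to a zone name, or 'unknown' if no subnet matches."""
    addr = ipaddress.ip_address(ip)
    for name, net in ZONES.items():
        if addr in net:
            return name
    return "unknown"

def unexpected_crossings(records):
    """Return connections that cross zones along a path not in ALLOWED_PAIRS."""
    flagged = []
    for rec in records:
        src, dst = zone_of(rec["src"]), zone_of(rec["dst"])
        if src != dst and (src, dst) not in ALLOWED_PAIRS:
            flagged.append((rec["src"], rec["dst"], src, dst))
    return flagged

firewall_log = [
    {"src": "10.10.4.12", "dst": "10.20.0.5"},  # it to dmz, expected
    {"src": "10.10.4.12", "dst": "10.30.1.8"},  # it to ot, skips the dmz
    {"src": "10.30.1.8", "dst": "10.30.2.9"},   # ot to ot, same zone
]

print(unexpected_crossings(firewall_log))
```

The design choice is an allow-list of expected paths rather than a block-list of bad ones, which fits environments where the set of legitimate paths is small and stable.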
Sessions are a special kind of telemetry that deserves separate attention because sessions represent active use of access, not just single events. A session could be a remote desktop connection, a virtual private network connection, a vendor portal connection, or any authenticated interactive connection where someone can make changes. In O T, sessions matter because the most damaging actions often require interactive access: exploring the network, transferring engineering files, changing setpoints, or uploading new logic. Session telemetry can include who authenticated, the source location, the duration, the systems accessed, and sometimes the actions taken during the session. Beginners should learn that session data helps you answer a deeper question than simple logs: it helps you understand the scope of access and make an informed guess about intent. For example, a successful login event tells you someone got in, but a session record tells you they stayed connected for two hours and touched three different systems, which is a much stronger signal. Session patterns also help distinguish normal operational behavior from suspicious behavior, because many O T environments have predictable session rhythms, like vendor connections during business hours or scheduled maintenance windows. When sessions deviate from those rhythms, it is often worth attention.
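As a rough sketch of the difference between a login event and a session record, the following groups individual events by a session identifier and reports duration and systems touched. The field names, the session identifier, and the host names are all hypothetical; many real gateways already produce a summary like this, and some do not expose a session identifier at all.

```python
from datetime import datetime

# Hypothetical per-event records that share a session identifier.
events = [
    {"session": "s1", "user": "vendor_a", "time": datetime(2024, 3, 5, 9, 0), "host": "jump-1"},
    {"session": "s1", "user": "vendor_a", "time": datetime(2024, 3, 5, 9, 40), "host": "eng-ws-2"},
    {"session": "s1", "user": "vendor_a", "time": datetime(2024, 3, 5, 11, 0), "host": "historian"},
]

def summarize_sessions(events):
    """Collapse events into one summary per session: user, time span, hosts touched."""
    summary = {}
    for e in events:
        s = summary.setdefault(
            e["session"],
            {"user": e["user"], "start": e["time"], "end": e["time"], "hosts": set()},
        )
        s["start"] = min(s["start"], e["time"])
        s["end"] = max(s["end"], e["time"])
        s["hosts"].add(e["host"])
    for s in summary.values():
        # Duration is measured from the first to the last event seen in the session.
        s["duration_minutes"] = (s["end"] - s["start"]).total_seconds() / 60
    return summary

for session_id, s in summarize_sessions(events).items():
    print(session_id, s["user"], s["duration_minutes"], sorted(s["hosts"]))
```

The duration here is only as accurate as the first and last events recorded, so a session with sparse logging will look shorter than it really was.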
Turning session telemetry into intelligence requires understanding what “normal” looks like for your remote access ecosystem and being careful about exceptions. Normal might include specific vendors connecting through specific pathways, specific internal teams connecting from specific locations, and specific time windows when changes are allowed. If you see a session from an unusual geography, an unusual device, or an unusual time, you should ask whether it matches an approved reason. If you see multiple failed logins followed by a successful one, that could indicate credential guessing or a user struggling with a password, and the difference matters. If you see a session that jumps from a remote access gateway to a sensitive workstation that is rarely accessed, that suggests lateral movement or an unusual maintenance need. In O T, session telemetry is also a safety tool because it provides accountability; if you can trace changes to specific sessions, you can more confidently distinguish authorized activity from suspicious activity. The goal is not to assume every unusual session is malicious, but to treat session deviations as prompts for verification. Verification keeps the environment safe without creating unnecessary disruption.
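The failed-logins-then-success pattern is simple enough to sketch. The threshold of three failures and the ten-minute window are assumptions chosen for illustration, not recommended values, and the event fields are hypothetical.

```python
from datetime import datetime, timedelta

def failed_then_success(events, threshold=3, window=timedelta(minutes=10)):
    """Flag successful logins preceded by at least `threshold` failures
    for the same user within `window`."""
    recent_failures = {}
    flagged = []
    for e in sorted(events, key=lambda e: e["time"]):
        failures = recent_failures.setdefault(e["user"], [])
        # Forget failures that are older than the window.
        failures[:] = [t for t in failures if e["time"] - t <= window]
        if e["outcome"] == "failure":
            failures.append(e["time"])
        elif e["outcome"] == "success":
            if len(failures) >= threshold:
                flagged.append((e["user"], e["time"], len(failures)))
            failures.clear()
    return flagged

base = datetime(2024, 3, 5, 2, 0)
auth_log = [
    {"user": "alice", "time": base + timedelta(minutes=m), "outcome": "failure"}
    for m in range(4)
] + [
    {"user": "alice", "time": base + timedelta(minutes=4), "outcome": "success"},
    {"user": "bob", "time": base, "outcome": "failure"},
    {"user": "bob", "time": base + timedelta(minutes=1), "outcome": "success"},
]

print(failed_then_success(auth_log))
```

The output cannot tell credential guessing apart from a user struggling with a password. It only identifies which sessions deserve a verification call.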
Anomalies are where telemetry becomes truly powerful, but anomalies are also where beginners can get overwhelmed because anomaly detection can sound like advanced mathematics or artificial intelligence. At a basic level, an anomaly is simply something that differs from a baseline expectation. In O T, baselines can be strong because many networks are stable, but you still have to define what stable means. Baselines might include normal communication paths between systems, normal frequencies of certain protocol traffic, normal workstation behavior during engineering work, and normal times when systems are updated or rebooted. An anomaly might be a new communication path between zones, a sudden increase in traffic to a controller network, or a device that starts speaking a protocol it never used before. It could also be a quiet anomaly, like a workstation that stops sending its usual logs or a controller that stops reporting expected status values. The key beginner lesson is that anomalies are not automatically threats; they are questions. Your job is to figure out whether the anomaly is explained by legitimate operations, by a benign fault, or by malicious activity.
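A baseline does not have to be mathematical. As a minimal sketch, assuming you can summarize traffic as source, destination, and protocol, a baseline can be a set of known paths, and an anomaly is anything new or anything that has gone quiet. The host and protocol names below are hypothetical.

```python
def learn_baseline(flows):
    """Build a baseline as the set of (source, destination, protocol) paths seen."""
    return {(f["src"], f["dst"], f["proto"]) for f in flows}

def compare_to_baseline(baseline, current_flows):
    """Return paths that are new, and baseline paths that were not seen this period."""
    current = {(f["src"], f["dst"], f["proto"]) for f in current_flows}
    return {
        "new": sorted(current - baseline),
        "missing": sorted(baseline - current),
    }

# Hypothetical history used to learn what normal looks like.
history = [
    {"src": "hmi-1", "dst": "plc-7", "proto": "modbus"},
    {"src": "historian", "dst": "plc-7", "proto": "modbus"},
]

# Hypothetical traffic observed today.
today = [
    {"src": "hmi-1", "dst": "plc-7", "proto": "modbus"},
    {"src": "eng-ws-2", "dst": "plc-7", "proto": "ssh"},
]

baseline = learn_baseline(history)
print(compare_to_baseline(baseline, today))
```

Both halves of the result are questions. The new path might be approved engineering work, and the missing one might be a historian that is down for maintenance.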
To make anomalies useful, you need to connect them to risk, which means prioritizing anomalies that could plausibly lead to operational harm. In O T, anomalies related to access are often high priority because access is the gateway to control. Anomalies related to configuration changes are high priority because configuration changes can alter process behavior or safety margins. Anomalies related to segmentation boundaries are high priority because they may indicate a pivot from I T into O T. Anomalies related to process control integrity can be the highest priority, such as unexpected program downloads or unexpected changes in control parameters, because these can directly affect physical outcomes. On the other hand, anomalies like a one-time network spike might be less concerning if it aligns with a known backup job or a planned update. Beginners should learn to treat anomaly triage as a context exercise: you ask, what is the asset, what is its role, what is the expected behavior, and what could happen if this anomaly represents malicious activity. When you do that, you avoid drowning in alerts and focus on the anomalies that matter most to safety and continuity.
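The triage questions can be expressed as a simple ordering. This is a sketch only: the category weights, the criticality scale, and the decision to push explained anomalies to the bottom are illustrative assumptions, not a standard scoring scheme.

```python
# Hypothetical weights reflecting the priorities described above.
CATEGORY_WEIGHT = {
    "process_integrity": 4,
    "access": 3,
    "configuration": 3,
    "segmentation": 3,
    "traffic_volume": 1,
}

def triage(anomalies):
    """Sort anomalies so the ones most likely to matter come first."""
    def score(a):
        # An anomaly with a verified explanation drops to the bottom but stays listed.
        if a.get("explained_by"):
            return 0
        return CATEGORY_WEIGHT.get(a["category"], 1) * a["asset_criticality"]
    return sorted(anomalies, key=score, reverse=True)

anomalies = [
    {"id": "a1", "category": "traffic_volume", "asset_criticality": 2,
     "explained_by": "nightly backup job"},
    {"id": "a2", "category": "process_integrity", "asset_criticality": 3},
    {"id": "a3", "category": "access", "asset_criticality": 2},
]

print([a["id"] for a in triage(anomalies)])
```

Explained anomalies stay in the list on purpose, so that the explanation itself can still be checked against a schedule or approval.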
One of the most important ways to turn telemetry into intelligence is correlation, which means connecting multiple weak signals into a stronger story. A single failed login might be nothing, but a burst of failed logins followed by a successful session from an unusual location is more meaningful. A single firewall allow might be normal, but a new allow rule followed by outbound connections from a system that usually does not talk externally is more meaningful. A single workstation reboot might be routine, but a reboot combined with new services starting and new scheduled tasks appearing is more meaningful. Correlation is especially valuable in O T because you may not have dense telemetry everywhere, so you have to make the most of what you do have. This is also where the idea of “kill chain thinking” becomes practical, even if you never say the phrase out loud: attackers often have to do multiple things in sequence, and the sequence creates patterns. Telemetry becomes intelligence when it helps you detect sequences, not just isolated events. For beginners, learning to look for sequences is a major step forward.
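Looking for sequences can be sketched as an ordered pattern match per host. The event type names, the host names, and the thirty-minute window are hypothetical, and real event sources use their own vocabularies.

```python
from datetime import datetime, timedelta

def find_sequence(events, pattern, window):
    """Return hosts where the event types in `pattern` occur in order,
    all within `window` of the first matching event."""
    by_host = {}
    for e in sorted(events, key=lambda e: e["time"]):
        by_host.setdefault(e["host"], []).append(e)

    hits = []
    for host, evs in by_host.items():
        for i, first in enumerate(evs):
            if first["type"] != pattern[0]:
                continue
            matched = 1  # number of pattern steps matched so far
            for later in evs[i + 1:]:
                if later["time"] - first["time"] > window:
                    break
                if matched < len(pattern) and later["type"] == pattern[matched]:
                    matched += 1
            if matched == len(pattern):
                hits.append(host)
                break
    return hits

events = [
    {"host": "eng-ws-2", "type": "reboot", "time": datetime(2024, 3, 5, 2, 0)},
    {"host": "eng-ws-2", "type": "service_installed", "time": datetime(2024, 3, 5, 2, 5)},
    {"host": "eng-ws-2", "type": "scheduled_task_created", "time": datetime(2024, 3, 5, 2, 7)},
    {"host": "hmi-1", "type": "reboot", "time": datetime(2024, 3, 5, 3, 0)},
]

pattern = ["reboot", "service_installed", "scheduled_task_created"]
print(find_sequence(events, pattern, timedelta(minutes=30)))
```

The lone reboot on the second host is ignored, which is the point: each event is weak on its own, and only the ordered combination is reported.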
It is also helpful to understand that the absence of telemetry can itself be a signal, especially in environments where you expect regular reporting. If a log source goes silent, it could be a benign network issue, but it could also be a sign that something was disabled or disrupted. If a remote access gateway stops producing session logs, that is a serious problem because it removes accountability and visibility. If a monitoring sensor stops seeing traffic that should always be present, that could indicate a communication disruption that affects operational safety. Beginners sometimes focus only on what is present, but security maturity often comes from noticing when visibility disappears. In O T, loss of visibility can force conservative decisions like shutting down or switching to manual control, because you cannot safely operate what you cannot observe. So part of turning telemetry into intelligence is ensuring telemetry continuity: knowing what sources should be present and having alerts when they vanish. This is not glamorous, but it is foundational to safety-driven security.
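Telemetry continuity can be checked with very little logic. As a sketch, assuming you keep a list of sources that should be reporting and how often, the following flags any source that has gone quiet. The source names and intervals are hypothetical.

```python
from datetime import datetime, timedelta

def silent_sources(last_seen, expected_interval, now):
    """Return expected sources whose latest record is older than their interval,
    or that have never been seen at all."""
    silent = []
    for source, interval in expected_interval.items():
        seen = last_seen.get(source)
        if seen is None or now - seen > interval:
            silent.append(source)
    return sorted(silent)

now = datetime(2024, 3, 5, 12, 0)

# Hypothetical inventory: which sources should report, and how often.
expected_interval = {
    "remote-gw": timedelta(minutes=5),
    "boundary-fw": timedelta(minutes=5),
    "eng-ws-2": timedelta(hours=1),
}

# Hypothetical timestamps of the most recent record from each source.
last_seen = {
    "remote-gw": datetime(2024, 3, 5, 11, 30),
    "boundary-fw": datetime(2024, 3, 5, 11, 58),
}

print(silent_sources(last_seen, expected_interval, now))
```

The inventory of expected sources is the important part. Without it, a source that never reported in the first place would never be missed.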
Another O T-specific point is that telemetry interpretation must respect operational reality, because industrial systems can behave in ways that look suspicious to someone trained only on enterprise I T. Some protocols are chatty, some devices broadcast regularly, and some engineering operations legitimately involve bursts of unusual activity. That means you need collaboration between security and operations to interpret anomalies correctly. However, collaboration does not mean accepting every explanation at face value; it means validating explanations against schedules, approvals, and known workflows. If someone says, “That was vendor work,” you should be able to find the record of the vendor work and the approved window. If someone says, “That reboot was normal,” you should be able to connect it to a maintenance action or a known fault. This discipline turns tribal knowledge into verifiable knowledge, which is critical when incidents are stressful and memories are unreliable. For beginners, the lesson is that good security questions are respectful and precise, not accusatory, and that verification protects both people and systems. Telemetry becomes intelligence when it supports that verification process.
When you step back, turning telemetry into intelligence is a practice of selecting signals, building baselines, watching for meaningful deviation, and then interpreting what you see in a way that supports safe action. Logs give you event detail, sessions give you access context, and anomalies point you toward what needs explanation. The intelligence is the story you can tell with evidence: who accessed what, what changed, what was expected, and what risk remains. In O T, that story must be tied to physical consequences, because the purpose is not simply to “catch bad guys” but to keep operations safe and reliable. You do not need perfect telemetry to begin; you need a disciplined approach to the telemetry you have and a plan to improve visibility over time. As a new learner, the most valuable habit you can develop is to ask, for every signal, what decision it could support and what additional context would make it clearer. When you do that consistently, you transform raw data into practical intelligence, and that is one of the most empowering skills in O T security.