The Role of Data in Threat Modeling

When it comes to stopping malicious actors from widespread damage, we have to catch the bad guy, and we need to do it fast. The longer a threat actor has to operate, the more damage will be done. The problem, of course, is that we have only so much telemetry from only so many systems. How do we separate the signal from the noise and reduce the time it takes to detect, investigate, respond, and recover from a cyber threat? How can we minimize the impact of an incident while also minimizing the time and manpower it takes to do it?

The modern digital landscape expands the threat surface to unprecedented levels of complexity. Organizations today must manage a plethora of endpoints, from traditional servers and workstations to mobile and IoT devices, each potentially serving as an entry, escalation, or lateral movement point for cyber threats. Migrating business operations and sensitive data to cloud environments introduces new vectors of attack, and the increasing adoption of interconnected systems, often through APIs and third-party integrations, creates additional entry points and potential attack surfaces.

Threat modeling is a proactive and systematic approach used in cybersecurity to identify potential vulnerabilities and security risks within a system, application, or network. It involves analyzing the architecture, design, and components of the target system to understand how attackers might exploit weaknesses. CISA recommends manufacturers “use a tailored threat model during the product development stage to address all potential threats to a system and account for each system’s deployment process.” By effectively predicting potential threats and understanding their impact, threat modeling enables organizations to prioritize security measures, implement appropriate defenses, and build robust systems that are better equipped to withstand and deter cyber threats.

But what if we thought of threat modeling in a slightly different way? By analyzing patterns, behaviors, and relationships extracted from metadata generated by various components and activities within a system, data-focused threat modeling can help defenders identify emerging threats in new ways, and potentially reduce the time it takes to detect advanced threats.

Know Your Metadata Streams

Key metadata streams play an important role in data-focused threat modeling. These include (but are not limited to):

Content analysis. Threat actors target valuable data, so understanding that value is paramount. If you don’t know where all your sensitive information is located, whether it’s regulated data or digital secrets, it’s hard to monitor the target of an attack.
Data location and hierarchy. Mapping key data stores like file systems, object stores, databases, and data lakes is critical to understand the target of an attack.
Data access logging. It’s hard to model an attack without monitoring the target, which is data.
Authentication behavior in directories that contain the entitlements to various systems and data a threat actor will target.
API activity, which is often used in more advanced threat scenarios.

How Metadata Can Improve Threat Modeling

Metadata-driven insights provide valuable information on potential attack vectors, unusual user behaviors, or suspicious network activities, enabling security teams to take proactive measures, enhance incident response, and strengthen overall cyber defenses. This approach empowers organizations to stay one step ahead of cyber adversaries and foster a proactive security posture in the dynamic and ever-evolving landscape of cyber threats.

Threat modeling continues to evolve as more sophisticated techniques are integrated into cybersecurity practice. Traditionally, threat modeling relied on manual analysis and static models. One example is triggering on a single event, such as a specific escalation of privilege like a user being added to an admin group. Similarly, static thresholds were valuable in detecting unsophisticated attacks, like brute-force authentication or simple ransomware, in which a high number of events in a short period of time could indicate a threat in progress.
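To make the two traditional approaches concrete, here is a minimal Python sketch of a single-event trigger and a static threshold. Everything in it is an assumption made for illustration: the Event record, the "Domain Admins" group name, and the limit of five failures per minute are hypothetical and would need to be mapped to your own telemetry and tuned.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """Hypothetical normalized event record (not a real product schema)."""
    timestamp: datetime
    account: str
    action: str        # e.g. "group_add" or "auth_failure"
    target: str = ""   # e.g. the group a user was added to


# Assumed values for illustration only.
ADMIN_GROUPS = {"Domain Admins"}
MAX_FAILURES = 5
WINDOW = timedelta(minutes=1)


def detect(events):
    """Return (rule, account, timestamp) alerts from two static rules."""
    alerts = []
    recent_failures = defaultdict(deque)  # account -> failure timestamps
    for ev in sorted(events, key=lambda e: e.timestamp):
        # Rule 1: single-event trigger on a privilege escalation.
        if ev.action == "group_add" and ev.target in ADMIN_GROUPS:
            alerts.append(("admin_group_add", ev.account, ev.timestamp))
        # Rule 2: static threshold on failures inside a sliding window.
        elif ev.action == "auth_failure":
            window = recent_failures[ev.account]
            window.append(ev.timestamp)
            # Drop failures that have aged out of the window.
            while window and ev.timestamp - window[0] > WINDOW:
                window.popleft()
            if len(window) >= MAX_FAILURES:
                alerts.append(("auth_failure_burst", ev.account, ev.timestamp))
                window.clear()  # avoid re-alerting on the same burst
    return alerts
```

Calling detect() on a list of Event objects returns one alert per matching rule. The weakness of this style is visible in the code: an attacker who stays just under MAX_FAILURES per WINDOW is never flagged.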

More advanced techniques involved building behavioral baselines for activity and identifying relevant deviations. For instance, if you measure user activity across a large set of data, you can identify when that account starts accessing data in new locations or of new types. If an account starts accessing data untouched by that account in, say, 180 days, especially when combined with device or location analysis, it could be an indicator that the account has been taken over by an attacker or that an insider is breaking bad, potentially gathering everything they can before walking across the street to a competitor.
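The 180-day example above can be sketched as a simple baseline check. This is a rough illustration, not a production detector: the (timestamp, account, location) tuple layout, the find_deviations function, and the baseline_end parameter are all hypothetical names introduced here, and a real system would weigh this signal together with device and location analysis.

```python
from datetime import datetime, timedelta

# Assumed lookback, taken from the 180-day example in the text.
LOOKBACK = timedelta(days=180)


def find_deviations(access_log, baseline_end, lookback=LOOKBACK):
    """Flag accesses to locations an account has not touched recently.

    access_log: iterable of (timestamp, account, location) tuples
        (a hypothetical, simplified access record).
    baseline_end: events before this time only build the baseline;
        events at or after it are evaluated against the baseline.
    """
    last_seen = {}   # (account, location) -> most recent access time
    deviations = []
    for ts, account, location in sorted(access_log):
        key = (account, location)
        previous = last_seen.get(key)
        if ts >= baseline_end:
            # New location, or one untouched for longer than the lookback.
            if previous is None or ts - previous > lookback:
                deviations.append((ts, account, location))
        last_seen[key] = ts
    return deviations
```

Each flagged tuple is an access that falls outside the account's recent pattern. It is a lead for an analyst rather than proof of compromise, since legitimate role changes produce the same signal.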

With the emergence of new technologies, modern data-focused threat modeling has transformed into a dynamic and automated process; it incorporates machine learning, artificial intelligence, and other techniques to analyze vast amounts of information, identify complex threat patterns, and predict emerging cyber risks in real time. If you have the right metadata, data-focused threat modeling can be a key technique in reducing the time to detection of more advanced insider threats and outside attacks.

Contributors

Brian Vecci