Metrics are the benchmark against which organizations measure success—or what they look to when determining whether a product, service or strategy is working. But leaning too heavily on metrics could ultimately prove dangerous. Not all metrics are created equal. Some can be misinterpreted, while others are misused or manipulated. And some analytics aren’t metrics at all.
At RSAC, attendees learned how to separate the good metrics from the bad, as well as what to look for in measuring the performance of security operations teams. Here’s a look at what industry experts recommended for understanding, gathering and applying metrics.
Reporting vs. metrics
In their 40 years of combined experience, Kerry Matre, Senior Director for Mandiant Services at FireEye, and John Caimano, Global Practice Lead for Security Operations at Palo Alto Networks, noticed that in security operations, there are good metrics and bad metrics. What people often think of as metrics are really reporting features—data that tells you about activity but doesn’t result in meaningful action.
Reporting typically describes the number or amount of something: the number of incidents handled, the number of analysts working, the total number of hours worked that week. Metrics, on the other hand, demonstrate whether your SOC is successful and give the organization confidence in its services. Metrics can also show what isn’t working in the SOC and which actions to take next. Good metrics serve as a catalyst for change to better protect the organization.
“It’s supremely important to collect the right metrics; otherwise, you’ll end up with a mess inside your network and operations,” said Caimano in “Your Metrics Suck! 5 SecOps Metrics That Are Better Than MTTR” at RSAC. “If you mess up in security operations, you’re talking about a breach—some kind of ransoming, destruction or exfiltration of data.”
Metrics are hard to gather and sometimes lead to more questions than answers. At times, people are afraid of the story their metrics will uncover. But organizations need to know the truth to be successful. With the right metrics, you can tell the most accurate story about your SOC and either give your organization confidence in its trajectory or drive the change needed to bring the business back on track.
Why MTTR is the wrong metric
MTTR (Mean Time To Respond) is probably one of the most famous metrics in cybersecurity. It’s in every Verizon DBIR and is featured in the dashboards and reports of nearly every security product. MTTR measures the average time that elapses between the moment a security incident takes place and the point at which it has been discovered, investigated and contained. A low MTTR has become synonymous with the efficacy of security teams and their tools.
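To make the definition concrete, here is a minimal sketch of how an MTTR figure could be computed from incident timestamps. The record layout, the sample data and the function name `mean_time_to_respond_minutes` are assumptions made for illustration; they do not come from the talk or from any particular product.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (time the incident occurred, time it was contained).
incidents = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 0)),    # 60 minutes
    (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 14, 30)),  # 30 minutes
    (datetime(2024, 1, 3, 8, 0), datetime(2024, 1, 3, 11, 0)),    # 180 minutes
]


def mean_time_to_respond_minutes(records):
    """Average minutes between an incident occurring and being contained."""
    durations = [(end - start).total_seconds() / 60 for start, end in records]
    return mean(durations)


mttr = mean_time_to_respond_minutes(incidents)
print(f"MTTR: {mttr:.0f} minutes")
```

Because the result is a single average, one slow but thorough investigation (the 180-minute incident here) raises the figure as much as several rushed ones lower it, and the number alone says nothing about the quality of the analysis behind each ticket.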
While MTTR is useful for reporting and assessing whether automation is working, it’s not a good metric by which to measure SOC analysts or engineers. It drives the wrong behavior, pushing analysts to skip through events, make knee-jerk decisions and rapidly resolve issues without completing a full analysis or bringing what they learned back into the controls.
“What MTTR doesn’t tell you is anything about security operations,” said Caimano. “Resolving tickets takes time, especially when you want those high fidelity, true positives and true false positives. You need to allow your analysts to build a story properly: define the who, what, when, where, how and why. If anything is missing, you wind up with lower confidence in your results, which can lead to a compromise clinging onto a foothold, and a foothold becoming a breach in short order.”
Caimano and Matre told RSAC attendees that analysts need to investigate tickets without a ticking clock. When MTTR is tracked inside security operations centers, analysts often cherry-pick tickets to stay on top of the leaderboard, gravitating toward tasks they already know how to complete. Chasing a low MTTR ultimately does more harm than good, rushing analysts into sloppy work and stifling robust learning in the SOC.
The right stuff: find your metrics
If MTTR is the wrong metric, then which is the right one? Considering what a business wants from its SOC, good metrics should center on two ideas: configuration confidence and operational confidence.
Configuration confidence is knowing that the right tools are in place and configured to best practice, so that they can prevent a breach, detect an ongoing attack or give an analyst enough intelligence to mitigate any breach that occurs. Essentially, it means having the right tools and automation in place.
Operational confidence is having the right people and processes in place to use those tools in the event of a breach or when analysis and investigation are necessary. Operational confidence happens when the right staff with the right training and the right procedures are deployed to the right places.
To figure out which metrics are right for your security teams, Caimano and Matre recommend zeroing in on five categories: analyst activity, hygiene, realized value, process deviation and analyst work distribution.
For measuring analyst activity with better accuracy than MTTR, there are two key metrics: Events Per Analyst Hour (EPAH) and handling time per alert per stage per analyst. EPAH can help organizations determine staffing levels: 100 events per analyst hour suggests analysts are overstrained, whereas 10 events per analyst hour suggests the SOC is well staffed. Meanwhile, monitoring handling time per alert per stage per analyst shows analysts’ strengths and weaknesses, as well as where process and visibility are well established and where they need improvement.
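As a rough illustration of the two analyst-activity metrics, the sketch below computes EPAH as events handled divided by total analyst hours, and averages handling time per analyst and stage. That EPAH formula is an assumed reading of the metric's name, and the stage names, analyst names and sample numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def events_per_analyst_hour(event_count, analyst_hours):
    """EPAH: events handled divided by total analyst hours worked (assumed formula)."""
    if analyst_hours <= 0:
        raise ValueError("analyst_hours must be positive")
    return event_count / analyst_hours


# Hypothetical shift: 4 analysts working 8 hours each, handling 320 events.
epah = events_per_analyst_hour(320, 4 * 8)

# Hypothetical handling log: (analyst, stage, minutes spent on one alert).
handling_log = [
    ("alice", "triage", 4),
    ("alice", "triage", 6),
    ("alice", "investigation", 30),
    ("bob", "triage", 12),
    ("bob", "investigation", 20),
    ("bob", "investigation", 40),
]


def mean_handling_time(log):
    """Average minutes per alert, grouped by (analyst, stage)."""
    buckets = defaultdict(list)
    for analyst, stage, minutes in log:
        buckets[(analyst, stage)].append(minutes)
    return {key: mean(values) for key, values in buckets.items()}


averages = mean_handling_time(handling_log)
print(f"EPAH: {epah:.1f}")
for (analyst, stage), minutes in sorted(averages.items()):
    print(f"{analyst:<6} {stage:<14} {minutes:.1f} min")
```

Grouping by stage as well as by analyst is what separates this from a single response-time average: a long triage time for one analyst points to a training or visibility gap at that stage rather than to slow work overall.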
Other important metrics for SOC teams involve hygiene, or efficiency. These include tracking unused rules and duplicate rules, which can help lower administrative overhead and guide managers on fine-tuning tools during a tech refresh.
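A hygiene check of this kind can be sketched in a few lines. The rule export format below (a name, a normalized logic string and a hit count for the review window) is hypothetical; real tools expose this data in their own formats, and deciding that two rules are duplicates usually takes more than a string comparison.

```python
from collections import defaultdict

# Hypothetical rule export: name, normalized detection logic, hits in the review window.
rules = [
    {"name": "ssh-bruteforce", "logic": "proto=ssh AND failures>10", "hits": 42},
    {"name": "ssh-bf-legacy", "logic": "proto=ssh AND failures>10", "hits": 0},
    {"name": "dns-tunnel", "logic": "proto=dns AND qlen>200", "hits": 0},
    {"name": "rdp-external", "logic": "proto=rdp AND src=external", "hits": 7},
]


def unused_rules(rule_list):
    """Names of rules that never fired during the review window."""
    return [rule["name"] for rule in rule_list if rule["hits"] == 0]


def duplicate_rules(rule_list):
    """Groups of rule names that share identical detection logic."""
    by_logic = defaultdict(list)
    for rule in rule_list:
        by_logic[rule["logic"]].append(rule["name"])
    return [names for names in by_logic.values() if len(names) > 1]


print("Unused:", unused_rules(rules))
print("Duplicates:", duplicate_rules(rules))
```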
Additional important metrics report on consistency in alert handling, such as whether steps in any phases are missed or if automation fails (process deviation). They also show how many tech features are being used across the enterprise, whether there’s software that isn’t getting deployed and if threat intelligence is turning into actionable knowledge (realized value). Of course, metrics related to analyst work distribution help to evenly distribute work across the SOC.
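Process deviation, for instance, could be measured by comparing the steps recorded on each ticket against the steps the process expects. The sketch below assumes a hypothetical four-step workflow and made-up ticket IDs; the step names are not taken from the talk.

```python
# Hypothetical expected workflow for handling an alert.
EXPECTED_STEPS = ["triage", "investigation", "containment", "closure"]

# Hypothetical tickets mapped to the steps actually recorded for them.
tickets = {
    "T-1": ["triage", "investigation", "containment", "closure"],
    "T-2": ["triage", "closure"],
    "T-3": ["triage", "investigation", "closure"],
}


def missed_steps(recorded, expected=EXPECTED_STEPS):
    """Expected steps that do not appear in a ticket's recorded steps."""
    return [step for step in expected if step not in recorded]


# Keep only the tickets that skipped at least one expected step.
deviations = {}
for ticket_id, steps in tickets.items():
    missing = missed_steps(steps)
    if missing:
        deviations[ticket_id] = missing

deviation_rate = len(deviations) / len(tickets)
print(deviations)
print(f"Deviation rate: {deviation_rate:.0%}")
```

Reporting which steps were skipped, rather than only the rate, indicates whether the gap sits with a specific phase, a specific automation or the process definition itself.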
Metrics as validation
In “MDR: Making Sense of the Veg-O-Matic Buzzword Blender” at RSAC, Lisa Lee, Chief Security Advisor/Global SCI Industry Lead at Microsoft, said, “A lot of people like to talk about ROI, but maybe we need to think about the value of the service and the return on the service.” When viewed in this light, it’s easier to see why the right metrics are necessary to validate the service the SOC—or any IT or security team—provides to the business.
Businesses can rely on metrics for just about any course of action: determining KPIs and goals, optimizing operations workflows, and improving SEO, website traffic and revenue. However, if stakeholders ignore metrics in favor of personal preference or focus on the wrong metrics for their teams, they miss a valuable opportunity to evaluate the business. By opting for metrics that matter, CISOs, CIOs, CEOs and any of the organization’s other leaders can have greater confidence in the direction of their business.