Rethinking Mean Time to Remediation (MTTR) in vulnerability management

Originally published April 13, 2023.

Which of the below quotations (anonymized but real) from players in the vulnerability management space is correct?

  • “Mean Time to Remediate (MTTR) is important.”
  • MTTR is one of the “two of the most important metrics to use when judging your security program’s true maturity.”
  • MTTR is an “incomplete metric” and you should use MOVA (Mean Open Vulnerability Age).

The answer?

None of them.

Why you shouldn’t worry about MTTR

I was recently on the show “Security Beyond the Checkbox” with Jason Firch of PurpleSec, where we discussed metrics that are not important from a security perspective. I mentioned that MTTR was one of them and gave a brief explanation why. After thinking about it a little more, I decided that a more detailed breakdown of my logic would be appropriate.

Since security teams often obsess about metrics, choosing the correct ones is extremely important. Mistaking a glimmer in the night for a lighthouse can easily crash the figurative vessel of a vulnerability management program on the rocks of despair.

The key problem with the MTTR metric is the first letter in the acronym: mean.

The metric necessarily implies that all vulnerabilities are created equal from a security perspective, when we know this is most certainly not true. A handful of highly dangerous vulnerabilities lead to billions of dollars in losses, while the vast majority are never exploited in the wild.

Thus, prioritization is the single most important task in vulnerability management. An organization that blindly burns down a backlog of vulnerabilities in pursuit of hitting a given MTTR will not be able to prioritize effectively because each vulnerability is weighted equivalently by the metric. This does not reflect what we know to be the security reality.

While advocates of MTTR, like adherents of CVSS, are no doubt well-intentioned, I believe them to be quite mistaken. In the example below, I’ll show exactly why.

Before diving in, please note my skepticism about MTTR is specifically focused on vulnerability management, not security operations or any other discipline.

The key role of vulnerability prioritization

Assume you have two organizations that are identical except in one respect: Organization A has an MTTR for all vulnerabilities of 30 days while Organization B has an MTTR of 60 days. Since they are otherwise the same, Organization A will be more secure.

But if we slightly modify the situation (and make it more realistic), it is quite possible that Organization B would have a far lower risk of malicious exploitation of known vulnerabilities in its network.

In this second case, assume the only additional difference between organizations is in their remediation policies, which they follow exactly.

  • Organization A has a policy of “fix all high and critical vulnerabilities (i.e. CVSS 7+) 15 days after detection and everything else 50 days afterwards.”
  • Organization B has a policy of “fix all vulnerabilities with an Exploit Prediction Scoring System (EPSS) score or equivalent (for non-CVEs) over 0.0176 five days after detection, and everything else 70 days afterwards.”

Organization A: the CVSS approach

Based on their CVSS 7+ strategy, the first organization will remediate ~82% (see this paper for the source of the figure) of vulnerabilities ever known to have been exploited in 15 days and the remaining ~18% in 50 days. CVSS 7+ issues account for 58.1% of all CVEs (see the aforementioned post for details) and I’ll assume that this 58% figure is comparable for all published flaws (in addition to CVEs), which seems reasonable since most publicly-known issues are CVEs.

Organization A’s MTTR would thus be:

58.1% (CVSS ≥ 7 vulnerabilities) x 15 days + 41.9% (CVSS < 7 vulnerabilities) x 50 days = ~30 days
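
The arithmetic above is a weighted mean of remediation deadlines. As a minimal sketch, assuming every vulnerability is fixed exactly on its policy deadline (as the example stipulates), it can be reproduced as follows. The function name `mean_time_to_remediate` and the `(share, days)` bucket layout are illustrative choices, not part of any tool.

```python
def mean_time_to_remediate(buckets):
    """Return the weighted mean of remediation times.

    buckets: list of (share_of_vulnerabilities, days_to_fix) pairs,
    where the shares are fractions that must sum to 1.
    """
    total_share = sum(share for share, _ in buckets)
    if abs(total_share - 1.0) > 1e-9:
        # Guard against shares that do not cover all vulnerabilities.
        raise ValueError(f"shares sum to {total_share}, expected 1.0")
    return sum(share * days for share, days in buckets)


# Organization A: CVSS 7+ (58.1% of vulnerabilities) fixed in 15 days,
# everything else (41.9%) fixed in 50 days.
org_a = [(0.581, 15), (0.419, 50)]
print(f"{mean_time_to_remediate(org_a):.1f} days")  # 29.7 days
```

The check on the shares is a cheap safeguard: if the percentages in a policy breakdown do not add up to 100%, the resulting mean is not meaningful.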

Organization B: leveraging the EPSS

To remediate the same percentage of known exploited vulnerabilities as when using CVSS (~82%), Organization B needs to use an EPSS threshold of 0.088 (again, please see the EPSS paper for details), which accounts for 7.3% of all CVEs (and all known vulnerabilities, using our assumptions).

Organization B, however, cuts the 0.088 threshold by a factor of five just to be safe and thus resolves substantially more exploitable issues than Organization A, and in less time. As of April 10, 2023, a 0.0176 EPSS threshold would require remediating 13.9% of all vulnerabilities in five days. Since at most 10% of CVEs (and thus known vulnerabilities, per our assumptions) are exploitable in any given configuration, the 13.9% figure implies that there is a very low likelihood of any remaining vulnerabilities being exploited.

Organization B’s MTTR would thus be:

13.9% (EPSS > 0.0176) x 5 days + 86.1% (EPSS ≤ 0.0176) x 70 days = ~61 days

Despite having more than double the MTTR, Organization B is resolving many more exploitable vulnerabilities much faster than Organization A. Since their networks are otherwise identical, we can assume the distribution of known vulnerabilities among assets is equivalent. Thus, Organization B’s cybersecurity risk from known vulnerabilities is substantially smaller.
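
To make the contrast concrete, here is a rough sketch that computes the overall MTTR next to the mean remediation time for known-exploited vulnerabilities only. It rests on two assumptions that go beyond the figures above. First, the share of vulnerabilities below Organization B's threshold is taken as the remainder after the 13.9% share. Second, the exact share of exploited vulnerabilities covered at a 0.0176 threshold is not given here, so the sketch uses the ~82% coverage of the 0.088 threshold as a floor (lowering the threshold can only add vulnerabilities), which makes Organization B's result an upper bound. All variable names are illustrative.

```python
def weighted_days(buckets):
    """Weighted mean of remediation times over (share, days) pairs."""
    return sum(share * days for share, days in buckets)


# Overall MTTR across all vulnerabilities.
mttr_a = weighted_days([(0.581, 15), (0.419, 50)])
share_b_fast = 0.139
share_b_slow = 1 - share_b_fast  # remainder below the EPSS threshold
mttr_b = weighted_days([(share_b_fast, 5), (share_b_slow, 70)])

# Mean remediation time for known-exploited vulnerabilities only.
# Organization A: ~82% are CVSS 7+ (15 days), ~18% are not (50 days).
exploited_a = weighted_days([(0.82, 15), (0.18, 50)])
# Organization B: ASSUMPTION - at least ~82% fall above the 0.0176
# threshold (5 days), so this figure is an upper bound.
exploited_b_upper = weighted_days([(0.82, 5), (0.18, 70)])

print(f"Overall MTTR:   A = {mttr_a:.1f} days, B = {mttr_b:.1f} days")
print(f"Exploited only: A = {exploited_a:.1f} days, B <= {exploited_b_upper:.1f} days")
# Overall MTTR:   A = 29.7 days, B = 61.0 days
# Exploited only: A = 21.3 days, B <= 16.7 days
```

Under these assumptions, Organization B's overall MTTR is roughly double Organization A's, yet the vulnerabilities that have actually been exploited are closed sooner on average.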

Conclusion

The above example shows why choosing the right metrics rather than blindly following the pack is so critical in vulnerability management. It is extremely easy to think you are succeeding when you are not, if you pick the wrong key performance indicators. With that said, I think it’s important that I reinforce my lack of animus toward those who use MTTR as a metric, as I consider them merely misguided rather than malicious. An important step toward fixing this issue, though, is highlighting it.

So if MTTR isn’t the right metric, what is?

Since cybersecurity is, at its core, another business risk management function, maintaining the organization’s desired risk surface is what matters.

This topic is worthy of an entirely separate post, but suffice it to say that hitting organizational risk appetite and tolerance targets is what security professionals, executives, and board members should all focus on. Describing these things in financial terms ensures a common language across all business functions.
