My Top 10 Principles for Cyber Risk Management

To truly understand a topic and build a plan to accomplish a goal, I’ve learned it’s best to start from first principles. This concept goes back to Aristotle and Euclid. In modern times people like Richard Feynman, Charlie Munger, Jeff Bezos, Reed Hastings, and Elon Musk have used it.

By way of background, I first got involved with Cyber Risk Management in the Spring of 2018 to help my clients justify upgrading their endpoint agents from legacy anti-virus to CrowdStrike or SentinelOne. While there was a clear improvement in malware detection capabilities, there was also an increase in cost.

The issue my clients faced was more than simply cost-justifying the price of a next-gen endpoint protection solution. They also needed a credible method of showing that this control delivered the best combination of cyber risk reduction and return on investment compared to alternative control investments.

I could not find a solution that systematically compares the efficacy, coverage, and governance of individual controls and then aggregates their overall value in the context of the cyber loss events of concern to business leaders.

At that time, I met Jim Lipkis, who was giving a FAIR™ training class. While FAIR was promising in theory, it did not include the essential cyber control modeling functionality to make it useful to the security teams I worked with.

For the next three years, Jim provided risk analysis services to my clients. In 2021, I joined Jim's company Monaco Risk.

I’m not a linguist, a philosopher, or a statistician. But I do know cybersecurity. So based on my 25 years of cybersecurity experience and the last six years of cyber risk management experience, research, and interaction with clients and colleagues, I’m ready to discuss my top ten principles of cyber risk management. Some are obvious, some not so much. I welcome constructive criticism. Please let me know if you have some first principles I overlooked.

My top ten principles are based on my view of the value cyber risk management must deliver to make it worth doing. It must bridge the gap between cybersecurity metrics and business risk to help CISOs and their security teams:

  • Justify control investments to the business leaders who set cybersecurity budgets.
  • Report reductions and increases in cyber risk to business leaders in dollars.
  • Obtain cooperation from IT, network, and software development teams who often implement exposure remediations requested by the cybersecurity team.
  • Prioritize control investments. No organization can afford to implement all industry recommendations. Given a limited budget, which control investments are going to provide the biggest risk reduction? Which control investments provide the best Return on Investment?
  • Resolve the tension between compliance and security. Minimize spending on controls needed for compliance requirements that provide little risk reduction.

1. The goal of cybersecurity is to reduce the probability of material impact due to cyber loss events

I start with the Ultimate First Principle. This is a term Rick Howard used in his book, “Cybersecurity First Principles.” He comes to this conclusion after 30 pages of why so many of our ideas about cybersecurity are not first principles.

His Ultimate First Principle (page 39) sets the goal of any cybersecurity program: “Reduce the probability of material impact due to a cyber event over the next three years.”

Rick says “A material issue can have a major impact on the financial, economic, reputational, and legal aspects” of an organization. This is a bit different from the SEC’s definition of materiality, but the two are closely related.

Once you are talking about the probability of an event happening in the future, you are in the realm of risk. Business leaders understand that cyber risk is business risk. Since business leaders set the cybersecurity budget explicitly and the risk appetite (too often implicitly), it’s up to CISOs to communicate with them in their language – dollars.

Therefore, to collaborate with business leaders, express the value of cyber control investments as reductions in the probability of material impact due to cyber loss events.

Due to the SEC’s July 2023 cybersecurity rule, the term materiality and how to determine whether a cybersecurity loss event is material have been heavily discussed. Business leaders should make that determination, and CISOs need to provide the risk analysis to advise them.

Here is my slightly amended Ultimate First Principle to make it more operational.

Reduce the probability of material financial impact due to cyber events to an acceptable level set by business leadership, for an agreed-upon time interval.

First, I added the word financial to emphasize the importance of expressing business impact in dollars.

Second, multiple types of cyber events, with different frequencies and severities, can occur within any given time interval. Therefore, they must be analyzed together.

Third, as Rick acknowledges, three years is not always the right time frame.

2. There is an inverse relationship between the likelihood and financial impact of a cyber loss event

The research on the financial impact of cybersecurity loss events over decades shows that as the dollar value of a loss increases, its probability decreases. All insurance companies, including cyber insurance companies, use this insight to price insurance coverage.

This is why the output of the risk assessment process must be Loss Exceedance Curves. LECs visualize the probability of financial losses exceeding a range of dollar values over a stated period of time. We can apply them to a single loss event scenario or a collection of loss event scenarios.

We compare the baseline (status quo) LEC to LECs representing alternative combinations of control investments to help align the cybersecurity budget with the level of cyber risk business leaders are comfortable with.
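As a rough illustration of what sits behind an LEC, here is a minimal Monte Carlo sketch. Everything in it is an assumption chosen for illustration: events arrive as a Poisson process, loss sizes are lognormal, and the hypothetical control investment simply halves event frequency. The function names and all numbers are made up for the example and are not Monaco Risk's model or data.

```python
import math
import random

def simulate_annual_losses(events_per_year, median_loss, sigma, years=20000, seed=7):
    """Simulate total loss per year (assumed Poisson arrivals, lognormal severity)."""
    rng = random.Random(seed)
    mu = math.log(median_loss)  # the median of a lognormal is exp(mu)
    totals = []
    for _ in range(years):
        total = 0.0
        # Walk through one year using exponential inter-arrival times.
        t = rng.expovariate(events_per_year)
        while t < 1.0:
            total += rng.lognormvariate(mu, sigma)
            t += rng.expovariate(events_per_year)
        totals.append(total)
    return totals

def loss_exceedance_curve(totals, thresholds):
    """Return (threshold, fraction of simulated years with loss above it) pairs."""
    n = len(totals)
    return [(x, sum(1 for loss in totals if loss > x) / n) for x in thresholds]

THRESHOLDS = [100_000, 1_000_000, 10_000_000]

# Illustrative inputs only: 0.30 events/year at baseline, 0.15 with the control.
baseline = loss_exceedance_curve(simulate_annual_losses(0.30, 500_000, 1.2), THRESHOLDS)
with_control = loss_exceedance_curve(simulate_annual_losses(0.15, 500_000, 1.2), THRESHOLDS)

for (x, p_base), (_, p_ctrl) in zip(baseline, with_control):
    print(f"P(annual loss > ${x:,}): baseline {p_base:.3f}, with control {p_ctrl:.3f}")
```

Plotting the two lists against the thresholds gives the baseline and alternative curves. The vertical gap between them at the dollar value leadership considers material is the risk reduction the investment buys.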

This is an important concept that risk matrices (heatmaps) get wrong and why they can be misleading. See my LinkedIn article for more information: https://www.dhirubhai.net/pulse/modeling-cybersecurity-bill-frank-fwwjf/

3. Different organizations will prioritize Loss Event Types differently

At Monaco Risk, we have documented 16 types of cyber loss events in our Loss Event Taxonomy. Deciding which loss event types should be prioritized depends on a variety of factors including the organization’s industry, goals, regulations, products and services, and customers.

For any given organization, only a few Loss Event Types require rigorous quantitative analysis. I am not saying to ignore the others. But we can adequately evaluate them using qualitative methods or “back-of-the-envelope” quantitative analysis.

Interestingly, the types of loss events associated with Generative AI have not required us to add any new Loss Event Types.

We’ve made our Loss Event Taxonomy available to anyone under a Creative Commons License CC BY-ND 4.0. Let me know in the Comments section if you would like a copy.

4. Cyber Loss Event Types provide scope for risk analysis

It’s well understood that any type of modeling is only as good as the underlying assumptions. In the case of cybersecurity risk analysis, the key underlying assumptions are the loss events for which the analysis is performed. I’ve seen too many suboptimal decisions made about control investments without the context of the loss events.

Furthermore, as Rick Howard says, if you are spending resources that don’t have a direct impact on reducing the probability of material impact due to cyber loss events, you’re wasting resources.

5. All cyber risk management decisions turn on controls

Risk is a function of the probability of a loss event occurring and its probable financial impact. This definition of risk has been used in cyber risk management for almost two decades and for hundreds of years in other domains.

Once the Loss Event Types of concern to leadership are identified, all risk management decisions (strategic and tactical) turn on controls. Controls are any people, processes, or technologies that the organization has control over that affect risk. Put another way, control effectiveness directly affects the probability of a cyber loss event occurring and its financial impact.

There has been a problem haunting cyber risk analysis and limiting its usefulness and therefore its acceptance. The problem has been the inability to model control effectiveness in a systematic way.

This is the problem Monaco Risk set out to solve – how best to model control effectiveness so that loss event likelihood and impact can be calculated with enough accuracy to be useful to security teams and credible to business leaders who set cybersecurity budgets and cyber risk appetites.

Note there are two probabilities to consider in the definition of risk. One is the probability of a loss event occurring. This is binary. Either the event occurs or it does not.

The other is the probability that the financial impact exceeds a given amount. This is not binary. The loss has a size, which depends on when the event is discovered, contained, and recovered from. Therefore, this is the probability of exceedance. Hence the value of using Loss Exceedance Curves to show the full range of losses at different probabilities.
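One way to write the two probabilities together, as a sketch for a single Loss Event Type under the simplifying assumption of at most one event in the time interval. The symbols are my notation for this illustration: p is the probability that the event occurs, S is the loss size given that it occurs, and L is the loss over the interval.

```latex
\[
P(L > x) \;=\; p \cdot P(S > x \mid \text{event occurs}), \qquad x \ge 0
\]
```

Plotting P(L > x) against x gives the Loss Exceedance Curve. A control that prevents events lowers p and scales the whole curve down. A control that speeds up detection, containment, or recovery changes the distribution of S and pulls the tail of the curve in.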

6. The Kill Graph replaces the kill chain and the defense matrix

Attackers must execute a series of actions to achieve their goals. Therefore, defenders have multiple opportunities to prevent a loss event by deploying controls that mitigate the attackers' actions. As mentioned above, controls can be any combination of people, processes, and technologies.

This is almost the opposite of the often-repeated cliche that defenders must be right 100% of the time and attackers need be right only once.

In 2010, Lockheed Martin coined the term Cyber Kill Chain®. In 2017, Sounil Yu and I independently developed a cybersecurity matrix approach leveraging the NIST Cybersecurity Framework.

In 2018, MITRE ATT&CK® v1.0 was released. It was a breakthrough for the cybersecurity industry because it provided standardized language to discuss the tactics and techniques used by adversaries to accomplish their nefarious objectives, and the mitigations available to defenders.

At Monaco Risk we realized that the complexity of cybersecurity outstripped the modeling power of kill chains, event trees, and matrices. Analyzing individual kill chains independently of one another leads to suboptimal decisions. Organizations have thousands to tens of thousands of overlapping and interleaved paths (chains) into and through their IT/OT estates. All of them need to be analyzed together using graph modeling techniques to optimize control investment decision-making.

Nor do matrices have the expressive power to adequately represent the relationship between the flow of threats and the controls that are designed to block them, or at least alert on suspicious behavior.

Event Tree Analysis does not work for cyber risk either. While I understand that event tree analysis is effective in other domains, cybersecurity’s combination of dozens of possible controls, hundreds of threat types, and thousands of attack paths is too complex. A graph-based approach is needed where controls can be mapped to the overlapping and interleaved attack paths through an organization’s IT/OT estate.

Thus the need for the Kill Graph that leverages MITRE ATT&CK. Monaco Risk instantiated this principle with its patented software, the Cyber Defense Graph™. I have written about it elsewhere on LinkedIn.

While controversial today, I believe the kill graph will become universally accepted as a first principle of cyber risk management because it makes cyber risk management useful to CISOs and their teams.

To make the kill graph approach operational, we found that the following factors are needed in the model – threats, vulnerabilities, attack surfaces, attack paths, and control effectiveness. Furthermore, since risk reduction turns on control selection decision-making, control effectiveness can be further decomposed to model capability, coverage, and governance.

Modeling these factors increases the accuracy of the cyber risk models and improves their usefulness to security teams and credibility to business leaders.
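To make the graph idea concrete, here is a toy sketch, not the Cyber Defense Graph itself. It assumes a five-step attack graph, one control per step with a made-up probability of stopping the attacker, and independence between controls. The node names, edges, and probabilities are all hypothetical. Because the two entry paths share the lateral movement step, the graph has to be evaluated as a whole rather than as two separate kill chains.

```python
from itertools import product

# Hypothetical attack graph: attacker step -> possible next steps.
EDGES = {
    "phishing_email": ["endpoint_exec"],
    "exposed_vpn": ["lateral_movement"],
    "endpoint_exec": ["lateral_movement"],
    "lateral_movement": ["sensitive_data"],
    "sensitive_data": [],
}
ENTRY_NODES = ["phishing_email", "exposed_vpn"]
TARGET = "sensitive_data"

# Assumed probability that the control at each step stops the attacker.
BLOCK_PROB = {
    "phishing_email": 0.8,    # e.g. email security
    "exposed_vpn": 0.9,       # e.g. MFA
    "endpoint_exec": 0.7,     # e.g. endpoint agent
    "lateral_movement": 0.5,  # e.g. segmentation
    "sensitive_data": 0.0,    # no control modeled at the target
}

def target_reachable(open_nodes):
    """Depth-first search restricted to steps whose controls failed."""
    stack = [n for n in ENTRY_NODES if n in open_nodes]
    seen = set(stack)
    while stack:
        node = stack.pop()
        if node == TARGET:
            return True
        for nxt in EDGES[node]:
            if nxt in open_nodes and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def reach_probability(block_prob):
    """Exact probability the target is reached, by enumerating control outcomes."""
    nodes = list(block_prob)
    total = 0.0
    for outcome in product([True, False], repeat=len(nodes)):
        weight = 1.0
        for node, is_open in zip(nodes, outcome):
            weight *= (1.0 - block_prob[node]) if is_open else block_prob[node]
        open_nodes = {n for n, is_open in zip(nodes, outcome) if is_open}
        if weight > 0 and target_reachable(open_nodes):
            total += weight
    return total

baseline_reach = reach_probability(BLOCK_PROB)
improved_reach = reach_probability(dict(BLOCK_PROB, exposed_vpn=0.99))
print(f"baseline: {baseline_reach:.4f}, stronger VPN control: {improved_reach:.4f}")
```

Exhaustive enumeration only works for a handful of nodes. With the thousands of overlapping paths found in a real IT/OT estate, simulation or dedicated graph algorithms would be needed, which is the scale problem this principle describes.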

Remember, this is not for all cyber risks. We only go to this level of detail for loss events with a probability of material impact on the organization’s goals.

7. Analyze Loss Event Types collectively

Loss Event Types should not be analyzed independently of one another when there are common controls involved. Only by analyzing Loss Event Types collectively can you determine the true risk reduction value of common controls.

For example, business disruption due to ransomware and exfiltration of sensitive information are two different Loss Event Types. But when modeling them using our Cyber Defense Graph, you see that they share many controls, such as email security, MFA, and endpoint security. There are also controls specifically focused on one or the other Loss Event Type such as encryption detection for the former and DLP for the latter.
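A small sketch of why the collective view matters, using the two Loss Event Types above. The annual probabilities and control effects are invented for illustration, and the model assumes that control effects multiply and that the two event types are independent. Under those assumptions, a shared control such as email security is worth more than an analysis of ransomware alone would show.

```python
# Hypothetical annual probabilities before any of these controls are credited.
INHERENT_PROB = {"ransomware_disruption": 0.30, "data_exfiltration": 0.20}

# Hypothetical fraction of each Loss Event Type that a control prevents.
CONTROL_EFFECT = {
    "email_security": {"ransomware_disruption": 0.40, "data_exfiltration": 0.30},
    "encryption_detection": {"ransomware_disruption": 0.35},
    "dlp": {"data_exfiltration": 0.45},
}

def residual_prob(event_type, deployed):
    """Probability of one Loss Event Type after the deployed controls act."""
    p = INHERENT_PROB[event_type]
    for control in deployed:
        p *= 1.0 - CONTROL_EFFECT[control].get(event_type, 0.0)
    return p

def prob_any_event(deployed):
    """Probability that at least one Loss Event Type occurs in the year."""
    p_none = 1.0
    for event_type in INHERENT_PROB:
        p_none *= 1.0 - residual_prob(event_type, deployed)
    return 1.0 - p_none

# Value of email security judged against ransomware alone...
single_view = INHERENT_PROB["ransomware_disruption"] - residual_prob(
    "ransomware_disruption", ["email_security"]
)
# ...versus judged against both Loss Event Types together.
collective_view = prob_any_event([]) - prob_any_event(["email_security"])
dlp_view = prob_any_event([]) - prob_any_event(["dlp"])

print(f"email security, ransomware only: {single_view:.4f}")
print(f"email security, both event types: {collective_view:.4f}")
print(f"DLP, both event types: {dlp_view:.4f}")
```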

8. There are two types of controls – Direct and Indirect

Direct Controls block threats or at least alert on suspicious activity. Examples include Endpoint Agents, Firewalls, Multifactor Authentication, and Patching. Direct Controls relate to MITRE ATT&CK® Mitigations.

Indirect Controls support, monitor, or improve the performance of Direct Controls. Examples include asset discovery, threat modeling, posture management, vulnerability scanning, and penetration testing. I would also include audit as an indirect control.

There are several types of Indirect Controls. But for the purpose of cyber risk management, I want to focus on Performance Controls, which monitor and measure the performance, i.e., the effectiveness, of Direct Controls.

The findings and metrics generated by Indirect Performance Controls increase the accuracy of the control input parameters of cyber risk analysis software. This increases the model’s credibility.

Given the number of direct controls organizations deploy and the complexity of infrastructure and applications, the use of automated indirect controls is increasing.

If we think of cyber risk analysis as a distinct type of control rather than just another indirect control, then it would be the third layer on top of the Indirect Controls layer which sits atop the Direct Controls layer. In a previous article I called these three layers taken together the Cybersecurity 3-Layer CAKE (Control Analytics, Knowledge, and Evaluation).

9. Control Effectiveness is not the same as risk reduction

A Direct Control’s technical effectiveness when evaluated in isolation may not translate to a commensurate risk reduction when added to an organization's portfolio of deployed controls.

This has to do with the distribution of threats across attack paths and the strength of the other deployed controls. A strong control will not reduce risk significantly if it does not see many threats or is on a path with other strong controls.
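A worked example of this point, with made-up numbers. It assumes each threat takes exactly one of two attack paths, that controls on a path act independently, and that the function name and figures are purely illustrative. A 95% effective control added to a quiet, already well-defended path reduces risk far less than a 60% effective control added to the busy path.

```python
def breach_probability(paths):
    """paths: list of (share_of_threats, [probability each control blocks])."""
    total = 0.0
    for share, controls in paths:
        p_through = 1.0
        for block in controls:
            p_through *= 1.0 - block
        total += share * p_through
    return total

# Hypothetical estate: 90% of threats take path A (one 50% control),
# 10% take path B (one 90% control).
baseline = breach_probability([(0.9, [0.5]), (0.1, [0.9])])

# Option 1: a very strong (95%) control on the quiet, well-defended path B.
strong_on_quiet_path = breach_probability([(0.9, [0.5]), (0.1, [0.9, 0.95])])

# Option 2: a moderate (60%) control on the busy path A.
moderate_on_busy_path = breach_probability([(0.9, [0.5, 0.6]), (0.1, [0.9])])

print(f"baseline: {baseline:.4f}")
print(f"95% control on quiet path: {strong_on_quiet_path:.4f}")
print(f"60% control on busy path: {moderate_on_busy_path:.4f}")
```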

10. Move from compliance-based risk to risk-based compliance

Most cybersecurity teams use one or more compliance frameworks to drive their programs. To a greater or lesser degree, risk management is typically one of a long list of itemized requirements.

Compliance-based cyber risk management is our term for a risk management process that’s adequate for ISO 27001 certification and SOC 2 accreditation but does not help security teams prioritize control investments or collaborate with business leaders who set cybersecurity budgets.

Cyber risk-based compliance management is our term for using risk management to drive the cybersecurity program while also meeting compliance requirements. The result is that the budget is allocated across compliance requirements to optimize risk reduction.

The lists of compliance framework requirements are the “what we need to do” activities. Risk management addresses the “how to implement” trade-offs that must be made due to limited budgets and resources.

Risk management centered on the probability of material impact due to loss events aligns security teams with business leaders who focus on protecting revenue-generating business processes, critical assets, capital, and cash flow.

Finally, we find that some controls needed for compliance don’t significantly reduce risk. For these controls, select based on price. In other words, when a control is needed for compliance but contributes little to risk reduction, choose a lower-cost option. This conserves budget for controls whose contributions to risk reduction are high.

Bill Frank

I help CISOs justify proposed control investments by translating their cyber posture improvements to reductions in the probability of material financial impact due to loss events.

1 month ago

Based on additional feedback, I have revised this article again today. I removed the terms "Quantification" and "CRQ." In fact, I am no longer going to use these terms for two reasons. First, they are not meaningful. They do not actually characterize a systematic process that connects security control enhancements or additions to a reduction in the probability of material financial impact due to cyber loss events. I've seen too many offerings use "quantification" to describe ordinal scales like A, B, or C, or improvements in security posture by some percentage. Second, and more importantly, cyber risk analysis is not worth doing unless it improves control investment decision-making to reduce the probability of material financial impact due to cyber loss events. Cyber risk analysis need not be a long, expensive process. An advisory service, like Monaco Risk, can show results in weeks with a minimal number of hours of your time. As always, feedback and/or questions are welcome.

Bill Frank

2 months ago

Thanks to the feedback I've received during the last two weeks, I've revised and republished this post and article. I'm sure this won't be the last revision. Please continue to provide feedback.
