Every day there is news about companies and public organizations that have suffered a data breach due to an external attack, human error, or negligent actions by employees or former employees. Around this news we often see figures stating that the breached organization is exposed to losses of hundreds of thousands of dollars. But how is a data breach loss estimate actually obtained?
Table of contents:
- Introduction – The cost of a data breach
- Different strategies to quantify the cost
- Quantification based on the cost of the activity
- The FAIR methodology to quantify the cost
- A case study with the FAIR methodology
- Analysis using the Open FAIR tool
- The ROI of applying data-centric security
The cost of a data breach
Each year, IBM publishes its Cost of a Data Breach Report, where, based on analyzed data from companies and organizations in different sectors, it estimates the cost of a data breach per record. It also analyzes data breach trends and the factors that mitigate or increase the cost of a data breach. Another interesting analysis on data breaches published every year is Verizon’s “DBIR-Data Breach Investigations Report“, where the origin and main actors in a data breach are analyzed for different sectors, among other points. The following interesting conclusions, among others, can be drawn from the data in these reports:
- The cost of a data breach had the largest increase in 2021 from $3.86M in 2020 to $4.24M in 2021.
- The cost per record increased 10.3%, from $146 in 2020 to $161 in 2021, a 14.2% increase since 2017.
- The Top 5 countries/regions with the highest cost per breach are: USA, Middle East, Canada, Germany and Japan.
- By sector, Healthcare has been the most affected for the last 11 years, with costs increasing 29.5% from 2020 to 2021. It is followed in the Top 5 by Finance, Pharmaceuticals, Technology and Energy.
- The most sought-after data types are credentials, followed by personal data; customer personal data in particular is found in 44% of data breaches, ahead of intellectual property and employee personal data.
- The most expensive type of data in a breach is personal customer data which has increased by 20% since 2020.
- The most frequent types of attacks to extract data are phishing, credential theft through hacking, ransomware, closely followed by configuration and human errors.
- Covid has increased phishing, ransomware, and credential theft attacks.
- Ten percent of all data breaches, a share that keeps growing, involve ransomware that exfiltrates unencrypted data and holds it for ransom.
- The main actor in a data breach is organized and financially motivated crime.
- The main attack patterns are social engineering (phishing, pretexting, scam), used to introduce ransomware, for example, followed by configuration errors, publication errors, human error, etc.
- Among the factors that most increase the cost of a data breach are, in this order, third party breaches (e.g., supply chain), compliance failures, or extensive migration to the cloud.
- Among the factors that most mitigate the cost of a data breach are, in this order, having an incident response team, extended use of encryption and performing incident response preparedness testing.
- Regarding remote work, the average cost of a breach is $1M higher where remote work was a factor that caused the breach, derived from the difficulty of detection and remediation costs.
- The greater the digital transformation of the organization, derived from Covid-19, the lower the cost of data breaches.
- The cost of a data breach is lower in organizations at more mature stages of Zero Trust adoption.
Strategies for quantifying the cost of a data breach
Quantifying the potential cost of a data breach can help a CISO justify the necessary investment in cybersecurity products and services. Given knowledge of the organization and its potential exposure to loss, we can estimate not only how much a data breach would impact the organization, but also the savings derived from certain prevention or mitigation measures we could implement. Two methods that can be used to quantify the cost of a data breach are:
- Activity Based Costing (ABC): This method identifies the activities in an organization and assigns the cost of each activity to all products and services according to the actual consumption of each. It is not a system designed specifically to quantify the cost of a security breach, but the model can be applied for that purpose.
- FAIR Methodology (Factor Analysis of Information Risk): A methodology to quantify and manage risk in any organization. In fact, it is the only quantitative model that is also an international standard for quantifying cybersecurity risk.
The following is a summary of both strategies for quantifying the cost of a security breach in an organization.
Activity Based Costing (ABC) Quantification
As discussed above, this method identifies the activities in an organization and assigns the cost of each activity to products and services according to actual consumption. Regarding the cost of a data breach, four cost centers or processes directly related to managing a breach can be identified. These cost centers involve activities related to the:
- Detection and escalation of an information breach.
- Notification to affected parties.
- Response of the organization after the breach.
- Lost business derived from the breach.
Each cost center has associated activities required of the company, from detection through breach resolution, communication, and so on.
The cost of a data breach is derived from the sum of the costs of the different activities summarized above. For this, it will be necessary to estimate the hourly cost of the people involved and the hours invested in the different activities. Additionally, there are costs derived from fines, the possible hiring of legal advisors, etc. By estimating these costs, we can derive a possible cost scenario for a data breach in our organization.
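As a sketch, the ABC arithmetic amounts to summing hours times hourly rate across cost centers. Every activity, hour figure, and rate below is a hypothetical placeholder to be replaced with the organization's own estimates:

```python
# Illustrative Activity Based Costing (ABC) estimate for a data breach.
# All activities, hours, and hourly rates ($) are hypothetical examples.
cost_centers = {
    "Detection and escalation": [("forensic investigation", 300, 120),
                                 ("crisis management", 80, 150)],
    "Notification": [("notifying affected parties", 120, 90),
                     ("regulator communications", 60, 110)],
    "Post-breach response": [("help desk and identity protection", 400, 70),
                             ("legal advisory", 100, 250)],
    "Lost business": [("customer churn and downtime", 500, 100)],
}

def breach_cost(centers):
    """Sum hours x rate over every activity in every cost center."""
    return sum(hours * rate
               for activities in centers.values()
               for _, hours, rate in activities)

print(f"Estimated breach cost: ${breach_cost(cost_centers):,}")
```

Fines and external legal fees would be added on top as fixed (non hour-based) cost items.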
Cost quantification based on FAIR methodology
As mentioned above, FAIR (Factor Analysis of Information Risk) is the only international standard quantitative model for quantifying cybersecurity risks in an organization. It provides a model for understanding, analyzing and quantifying cyber risk in financial terms. The Open FAIR standard is maintained by The Open Group, a global consortium that enables the achievement of business objectives through IT standards.
FAIR complements other methodologies such as ITIL, ISO 27002:2005, COBIT, OCTAVE, etc. FAIR is also a risk management model developed by Jack Jones and driven by the FAIR Institute, a non-profit organization whose mission is to establish and promote risk management best practices that prepare risk professionals to collaborate with their business partners and strike the right balance between protecting the organization and running the business. The Open Group publishes and maintains, among others, two relevant standards related to cybersecurity risk management and cost analysis:
- Standard for Risk Analysis (O-RA; The Open Group Standard for Risk Analysis): Provides a set of standards for different aspects of information security risk analysis.
- Risk Taxonomy Standard (O-RT; The Open Group Standard for Risk Taxonomy): Defines a taxonomy for the factors involved in information security risks.
A well-defined taxonomy allows for better measurement and/or estimation of information loss risk factor variables, and this is critical for the organization’s management to have the information necessary to make better informed and consistent data-driven decisions.
Risk taxonomy is divided into two branches:
- Loss Event Frequency (LEF): The probable frequency, within a given timeframe, with which a threat will inflict damage on a resource, generating, for example, an exfiltration incident. Calculating it requires taking into account:
  - Threat Event Frequency (TEF): The probable frequency with which a threat acts, successfully or not, on the resource. This in turn is affected by:
    - Contact Frequency (CF): The probability of a threat actually contacting the resource.
    - Probability of Action (PoA): The probability that the threat will act once it has contacted the resource.
  - Vulnerability (Vuln): The probability of a threat event materializing into a loss of information. Vulnerability depends on:
    - Threat Capability (TCap): The level of force the threat can apply to the resource. For example, some types of malware or ransomware are more destructive than others.
    - Resistance Strength (RS): How strongly a resource can resist a threat. For example, a strong password is not the same as the typical "1234".
- Loss Magnitude (LM): The probable magnitude of loss resulting from a loss event. A distinction is made between two types of loss:
  - Primary Loss: The loss that occurs directly as a result of the threat acting on the resource. For example, in a denial-of-service attack on the company's website, the loss would be the website's downtime, and the victim would be its owner, the company.
  - Secondary Loss: The loss that occurs to secondary stakeholders. If it is the website of a company that provides a CRM service, the affected parties may be, for example, the CRM customers. This is broken down into:
    - Secondary Loss Event Frequency (SLEF): The percentage of time a loss scenario may have secondary effects. For example, the length of time customers have no service.
    - Secondary Loss Magnitude (SLM): The losses derived from dealing with the reactions of the affected parties. For example, loss of customers, fines, lawsuits, etc.
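The factor tree above can be sketched quantitatively. The multiplicative structure (TEF = CF × PoA, LEF = TEF × Vuln, risk = LEF × loss magnitude) follows the taxonomy; the simple TCap-vs-RS model for vulnerability and all numeric inputs are illustrative assumptions, not part of the standard:

```python
# Minimal quantitative sketch of the FAIR factor tree.
# Numeric inputs are hypothetical; FAIR practitioners usually work with
# calibrated ranges rather than point values.

def threat_event_frequency(contact_freq, prob_of_action):
    """TEF: expected threat events per year = CF x PoA."""
    return contact_freq * prob_of_action

def vulnerability(tcap, rs):
    """Vuln: probability a threat event becomes a loss event.
    Crudely modeled here from TCap vs RS scores in [0, 1]."""
    return max(0.0, min(1.0, tcap - rs + 0.5))

def loss_event_frequency(tef, vuln):
    """LEF: loss events per year = TEF x Vuln."""
    return tef * vuln

def annualized_risk(lef, primary_loss, slef, secondary_loss):
    """Risk: LEF x (primary LM + probability-weighted secondary LM)."""
    return lef * (primary_loss + slef * secondary_loss)

tef = threat_event_frequency(contact_freq=12, prob_of_action=0.25)  # 3 events/yr
vuln = vulnerability(tcap=0.8, rs=0.7)
lef = loss_event_frequency(tef, vuln)
risk = annualized_risk(lef, primary_loss=100_000, slef=0.9,
                       secondary_loss=5_000_000)
print(f"Annualized loss exposure: ${risk:,.0f}")
```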
Taking this taxonomy into account, FAIR risk analysis is based on four steps, which are described below with a practical example. In the Standard for Risk Analysis (O-RA; The Open Group Standard for Risk Analysis), data loss scenarios are decomposed based on the taxonomy (Frequency of Loss Events and Magnitude of Risk) along with prevention and mitigation controls, and the different functions of the NIST Cybersecurity Framework (CSF): Identify, Protect, Detect, Respond and Recover.
A case study of quantification following the FAIR methodology
For the sake of clarity, let's take as an example the case of a global bank impacted by a ransomware attack in which documents containing personal information (PII, Personally Identifiable Information) and financial data (subject to PCI regulation) are exfiltrated.
STAGE 1: Identify the components of the loss scenario
- Identification of the asset at risk: What is at risk. In the example, sensitive personal information and financial data stored on a bank’s file server.
- Identify the threat community: In the example, it could be an external actor or organized criminal group trying to access the bank's systems through social engineering, deploy ransomware, and exfiltrate the documents before encrypting them in order to extort the bank and demand a ransom.
- Define the loss event: Here we could talk about the loss of confidentiality of files with personal data and PCI information.
STAGE 2: Evaluation of Loss Event Frequency (LEF)
- Estimation of the Threat Event Frequency (TEF): In this case we would estimate how many times an attacker may have contact via phishing with an employee to carry out the attack. It is something that can be complicated to estimate, but we can try to do it according to the following table and analyzing possible historical data from reports such as the one mentioned above from DBIR.
- Threat Capability Assessment (TCap): This involves assessing the attacker's level of knowledge, experience, and resources (time and materials) in relation to the scenario. In this case, being a specialized criminal group, we could assume at least a High Threat Capability (H).
- Estimated Resistance Strength (RS): This relates to the ability to resist the threat. In this case we could rate it as High, since the bank has anti-phishing measures to try to block these attacks, as well as EDR tools. However, the bank does not encrypt data on the file server, so once inside, a successful attacker could take the files in the clear.
- Define Vulnerability (Vuln): Once the TCap and RS have been defined, we can extract the Vulnerability with the matrix below, which in this case is Moderate (M).
- Define the Loss Event Frequency (LEF): In the same way as with Vulnerability, we can derive it from the TEF (M) and Vulnerability (M). In our scenario it is Moderate (M).
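The matrix lookups used in this stage can be mimicked with a simple qualitative-level combiner. The combination rule and bias below are hypothetical stand-ins for the actual O-RA matrices, which define each cell explicitly:

```python
# Hypothetical qualitative matrices in the style of the O-RA standard.
# The midpoint combination rule is illustrative, not the official tables.
LEVELS = ["VL", "L", "M", "H", "VH"]

def combine(level_a, level_b, bias=0):
    """Combine two qualitative levels by taking their (biased) midpoint."""
    idx = (LEVELS.index(level_a) + LEVELS.index(level_b)) // 2 + bias
    return LEVELS[max(0, min(len(LEVELS) - 1, idx))]

# Case study: TCap High vs RS High -> Vulnerability Moderate
vuln = combine("H", "H", bias=-1)
# TEF Moderate combined with Vuln Moderate -> LEF Moderate
lef = combine("M", vuln)
print(vuln, lef)
```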
STAGE 3: Loss Magnitude (LM) Evaluation
- Primary Loss Magnitude Estimation: In this scenario we could be talking about three potential threat actions (see all possible ones in O-RT): Misuse, Disclosure or Denial of Access (Destruction).
Focusing on the most likely one for an attacker who wants to exfiltrate data for financial gain and leaving aside the encryption part in order to deny access, we would be talking about Disclosure. We must quantify its impact on the different forms of loss for the primary actor of the loss (the bank itself).
For this quantification we can use the following table:
To assess the probable loss in this scenario, where we consider the exfiltration rather than ransomware's encryption or denial of access, the impact on the organization's productivity would be small: operations could continue, apart from the disruption caused to the security and IT teams.
The main costs would be in the Response area, since the hourly cost of the people involved in the investigation, incident management, internal communications, etc. must be quantified. In a conservative scenario, no less than 1,000 hours would be invested at an average rate of $100 per hour, considering internal and external personnel, i.e., at least $100,000.
- Secondary Loss Magnitude Estimation: The first thing to do is to identify those involved or affected. In this case customer data would be involved. It is also something that would affect regulators because it is a loss of PCI-related data with a high impact on a bank. Once those involved have been identified we can calculate the following for them:
- Estimation of the Secondary Loss Event Frequency (SLEF): Considering the group of people affected, in this scenario and following the table below we would say that the number of people involved that would have to be managed and informed is Very High (VH).
To derive the Secondary Loss Event Frequency, we can use the following matrix, relating the estimate above to the primary Loss Event Frequency (LEF) calculated earlier (Moderate, M). In this case it would give a Very High SLEF (VH).
- Estimation of Secondary Loss Magnitude (SLM): To estimate the magnitude of secondary loss, we can establish a scenario where 100,000 customer records have been affected, for example, a large number, since the attacker has made efforts to get there and will try to obtain as much data as possible.
As forms of secondary loss, we can establish those related to the Response (costs of notifications, meetings, legal expenses, etc.). The magnitude can be obtained from the following table estimating the low and high range of probable cost (cost-hour of executives, legal expenses, etc.). In this case we could determine it as High (H).
Fines and lawsuits from regulators and customers, as well as reputational cost, can also be considered forms of loss. Competitive loss does not seem to apply, and in this case we have decided not to focus on the Productivity area.
STAGE 4: Deriving and Articulating the Risk
- Derive the Primary Risk: We have derived LEF and LM, which in our case and based on previous analysis would be Moderate.
- Derive Secondary Risk: We have derived SLEF and Secondary LM, which in our case and based on previous analysis would be Very High (VH).
- Derive the Global Risk: We can combine the Primary and Secondary Risk to derive it. In this case we are talking about a Very High global risk.
With the Loss Event Frequency (LEF: Moderate in our case) and the Overall Loss Magnitude (LM: Very High in our case) we can estimate the Overall Risk from the following table. Once the Global Risk has been estimated, we can quantify the cost of the breach: in this case we could be talking about a severe cost to the business that could exceed $10M.
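One possible way to encode such a risk-to-cost table, with purely illustrative thresholds, is:

```python
# Hypothetical mapping from overall risk magnitude to a probable cost band.
# Thresholds are illustrative; the O-RA standard leaves them to the analyst.
COST_BANDS = {
    "VL": (0, 10_000),
    "L": (10_000, 100_000),
    "M": (100_000, 1_000_000),
    "H": (1_000_000, 10_000_000),
    "VH": (10_000_000, None),  # severe: may exceed $10M
}

def cost_band(risk_level):
    """Return the probable cost range for a qualitative risk level."""
    low, high = COST_BANDS[risk_level]
    return f"${low:,}+" if high is None else f"${low:,} - ${high:,}"

print(cost_band("VH"))  # $10,000,000+
```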
Analysis using Open FAIR’s Risk Analysis Tool
The Open Group offers a tool for quantifying the risk of data loss. It allows us to simulate minimum, most likely, and maximum valuations for a given scenario, and also to set up a proposed improvement scenario and compare it with the current one to see the cost savings, i.e., how the cost of loss is mitigated. The following document shows an example of a risk analysis following the FAIR methodology, using the previous tables and comparing the results with the tool. If we fill in the following values in the Open FAIR tool, in line with what was estimated above:
It would give us that there is a 50% probability that such a problem would exceed $5M in losses.
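The tool's simulation can be approximated with a small Monte Carlo sketch: each trial samples a loss event frequency and a loss magnitude from min / most-likely / max inputs, then we count how often the annual loss exceeds a threshold. The triangular distributions and all input values below are hypothetical stand-ins for the tool's actual sampling and our scenario's figures:

```python
# Monte Carlo sketch in the spirit of the Open FAIR tool's simulation.
# Triangular distributions stand in for the tool's sampling; values are
# hypothetical.
import random

random.seed(42)  # reproducible run

def simulate_losses(n=100_000):
    losses = []
    for _ in range(n):
        # Loss events per year: triangular(min, max, most likely)
        lef = random.triangular(0.1, 2.0, 0.5)
        # Loss magnitude per event in dollars: triangular(min, max, most likely)
        lm = random.triangular(1_000_000, 30_000_000, 8_000_000)
        losses.append(lef * lm)
    return losses

losses = simulate_losses()
prob_over_5m = sum(l > 5_000_000 for l in losses) / len(losses)
print(f"P(annual loss > $5M) = {prob_over_5m:.0%}")
```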
Return on Investment (ROI) of implementing a data-centric security solution
In this scenario, we could put forward an improvement proposal: implementing an information protection and control solution with encryption capabilities, such as SealPath, so that the exfiltrated files remain protected. We could estimate that, with a good deployment, a high percentage of the files will be protected, barring configuration errors, so the level of protection will be very high. Simply considering that the Resistance Strength against this type of threat increases notably, since the attacker can exfiltrate the files but not decrypt them, the probable cost of a breach is minimized. Taking the savings into account, the Return on Investment of this type of solution is amply justified. Do you want to learn more about how SealPath can help you in this and other cases to minimize the cost of a possible data breach? Contact us and with a simple demo we will show you how.
With this type of analysis, we can justify the Return on Investment in certain security tools. In this case, Information Protection and Control tools that make data exfiltrated by ransomware inaccessible.
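The ROI argument reduces to a one-line formula comparing loss avoided against the cost of the solution. The dollar figures below are hypothetical placeholders, not vendor pricing or guaranteed outcomes:

```python
# Rough ROI sketch for a data-centric protection solution.
# All figures are hypothetical placeholders.

def roi(annual_loss_before, annual_loss_after, annual_solution_cost):
    """ROI = (loss avoided - solution cost) / solution cost."""
    savings = annual_loss_before - annual_loss_after
    return (savings - annual_solution_cost) / annual_solution_cost

# e.g. expected annual loss drops from $5M to $500k with a $200k/yr solution
print(f"ROI: {roi(5_000_000, 500_000, 200_000):.0%}")
```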