DLP Truth & Consequences

HOW VISIBILITY WILL CHANGE YOUR DATA PROTECTION STRATEGY

EXECUTIVE SUMMARY

While the data loss prevention (DLP) vendor landscape has changed over the last decade, the technology approach and architecture largely remain the same. Aside from some minimal feature innovation, by and large, leading DLP vendors still hold to the three-pronged coverage approach of network, discovery and endpoint. The old adage, “If it ain’t broke, don’t fix it,” is the mantra of most DLP vendors and the marketplace doesn’t seem to be challenging this mindset. But, if you consider the growing list of recent – and very major – data breaches, it’s hard to argue that the traditional DLP approach is working as effectively as expected.

Of course, there will always be some amount of data loss, but with the level of massive data breaches we’ve seen in recent years, you would expect the marketplace to demand new technologies that can protect data more effectively. DLP vendors and their technologies have reached a level of maturity that leaves little room for the innovation required to increase data protection effectiveness.

Over the course of 12 years focused exclusively on DLP and other data protection technologies, we have found the key to improving data protection effectiveness is data activity visibility, as explained further in this paper. Data visibility technologies and services, combined with existing data protection technologies, can be used to drastically increase data protection effectiveness.

FOUR TRUTHS ABOUT DLP

Like most things in life, DLP technologies are not perfect. For the purposes of this paper, we’ll review four key truths about DLP that have significant and lasting impact on the technology’s capacity to protect data. While these truths are not disputed by DLP vendors, they are certainly areas less-visited by vendor sales teams. Please note that there are some vendors with slightly different approaches to DLP that allow them to avoid some of these challenges.

1. Content-Focused Data Detection

Traditional DLP solutions focus almost exclusively on inspection of textual content. For example, DLP solutions inspect text to detect regular expression pattern matches, such as 555-55-5555 (US social security number) or 4444 4444 4444 4444 (credit card number), or to check whether the text includes identifying keywords like “social security” or “credit card.”

Unfortunately, this can be a disadvantage in a number of scenarios. If the file in question includes no textual content, as might be the case with some intellectual property in images or CAD drawings, DLP often has a hard time determining data sensitivity. Some DLP solutions have OCR capabilities, but these work only in select situations and can be error-prone. Perhaps more importantly, users can simply change content to hide sensitive data. Defeating content-aware sensitive data detection methods can be as easy as adding some X’s and O’s:

  • Changing a last name from “Johnson” to “JohnXson”
  • Changing a social security number from “555-55-5555” to “555-5O5-5555”

This minor content manipulation easily bypasses detection methods of leading DLP technologies. Widely-used pattern-matching and exact data matching methods are both broken by the addition of these simple characters, rendering DLP totally ineffective and unable to prevent loss of critical data.
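To make the evasion concrete, here is a minimal sketch in Python of how a single inserted letter defeats regex-based detection. The patterns are simplified illustrations of common DLP-style rules, not any vendor's actual detection logic:

```python
import re

# Simplified DLP-style patterns: US SSN and a 16-digit card number.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")

def contains_sensitive(text: str) -> bool:
    """Naive content-only detection, as a traditional DLP engine might do."""
    return bool(SSN_RE.search(text) or CARD_RE.search(text))

print(contains_sensitive("SSN: 555-55-5555"))   # True: pattern matches
print(contains_sensitive("SSN: 555-5O5-5555"))  # False: one letter 'O' breaks the match
```

Because the regular expression requires unbroken runs of digits, the substituted letter "O" is enough to slip past inspection entirely.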

Because DLP focuses so heavily on content, contextual elements are used far less often and are limited to things like file type, sender/recipient, source/destination, etc. When used properly, context-based detection can provide great insight, allowing detection based on things such as the network share the file came from, file iterations, etc.

DLP vendors have worked hard to develop innovative sensitive data detection methods to improve detection accuracy while also minimizing false negatives. Today, enterprise DLP solutions employ the most varied and effective methods for detecting sensitive data. In spite of their best efforts, however, no content-focused detection methods are a match for a malicious actor bent on stealing sensitive data.

2. Inspects Only On Data Egress Events

Traditional DLP solutions are designed specifically to prevent data loss. As such, the only time DLP solutions even try to inspect content for sensitive information is at the point of attempted data egress. These egress events are limited to such things as data movement out of the network (via email, web, etc.), movement to removable storage devices, or copy/paste or print actions. Most other data-related activity – the vast majority, mind you – not only goes unlogged by DLP; no attempt is even made to inspect it in the first place. While at first blush this may seem surprising, it’s actually not entirely unexpected considering the primary role of DLP is to prevent data from leaving.

That being said, inspecting data only at the point of egress has major negative implications. Non-egress data activity events eclipse egress events by a huge factor – likely more than 1,000 non-egress events for every egress event. In other words, for every egress event DLP inspects, roughly 1,000 other data events may go uninspected and untracked. From this perspective, it’s easy to see that DLP visibility covers only a sliver of total data activity.

DLP vendors may argue that their focus is preventing data loss and not data activity tracking. This is a valid point, but the question remains: is there value in tracking this data activity? We believe there is not only value in inspecting and tracking non-egress data activity, but the practice will prove critical to improving data security. This added visibility provides valuable insight and helps identify data activity that may eventually result in sensitive data egress.

3. Requires Known Policies

To make DLP solutions effective, policies must first be created to tell the solution what sensitive data to look for as it leaves (and only at egress, as explained above) and under what conditions. Policy makers must know ahead of time – based on clear and obvious need, past experience or industry best practices – what data to protect. Then they must consider all the ways that data could be leaked and create policies to match every possible scenario. While many required policies are quite obvious, many others are not so clear cut. Some may even be completely unanticipated.

Well-designed policies require a combination of both foreknowledge and guesswork or creative thinking to provide comprehensive coverage. Still, in the beginning, all policy creation is based on supposition, not factual evidence that data is leaving the organization in one way or another.

Essentially, DLP solutions provide protection against data loss threats administrators are already aware of. Hence, DLP proves what you already know. What DLP cannot do is show you what you don’t know. This will prove to be a major factor in effective – versus ineffective – data protection.

4. Logs Only Policy Violations

DLP solutions are designed to stop and examine everything leaving the protected network or endpoint. Assuming proper deployment, solutions are very effective at holding requests until inspection occurs. Nothing gets by the DLP solution without being stopped and examined at the network gateway or on the endpoint. If sensitive data is detected contrary to policy, that data is prevented from leaving and DLP logs an event in the form of an incident or policy violation.

But what happens to non-policy violations – those times when no sensitive data is detected? Many are surprised to learn that when no sensitive data is detected, not only is the data allowed to leave, but event data is not retained in any form. All information surrounding non-policy violation activity is discarded by the DLP solution, regardless of whether it’s an email, web request or other protocol. While event data may be logged by some other technology, from the perspective of the DLP tool, it’s as if the activity never happened.

The quick solution would be for DLP solutions to simply start logging all activity. But since non-policy violations – the purported “acceptable” data activity – make up the vast majority of all data activity, the logging of all data could result in millions of events in a very short period of time. Logging that level of activity simply could not be supported by traditional DLP solutions for two reasons:

  • They were not designed to support that many events. This has been proven in many DLP deployments with poorly-written policies that produce millions of events, causing the DLP solution to crash.
  • Incident response teams cannot effectively handle that many incidents. Even if they could, it would largely be time wasted since the majority of incidents are non-policy violations and unlikely to provide much value. Plus, DLP solutions themselves do not support the data analytics to draw meaningful conclusions from the information.

YOU DON’T KNOW WHAT YOU DON’T KNOW – AND IT WILL HURT YOU

In cases where the absence of sensitive data is accurately confirmed (a true negative), discarding that activity data may not have a huge negative impact, at least not in the short term. In the long run, however, we have seen that activity information has significant value in showing normal or baseline data use. Establishing normal or baseline activity allows for comparisons that identify anomalous activity and uncover threats to sensitive data. Think in terms of the benefits of User & Entity Behavior Analytics (UEBA) solutions.
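As a rough illustration of the baseline idea, the sketch below flags activity that deviates sharply from a user's historical norm. The user names, counts and the simple standard-deviation threshold are illustrative assumptions, not any particular UEBA product's method:

```python
from statistics import mean, stdev

# Hypothetical per-user daily file-egress counts, as an activity-visibility
# tool might collect them over a week (illustrative numbers only).
baseline = {
    "alice": [3, 4, 2, 5, 3, 4, 3],
    "bob": [1, 0, 2, 1, 1, 0, 1],
}

def is_anomalous(user: str, todays_count: int, threshold: float = 3.0) -> bool:
    """Flag activity more than `threshold` standard deviations above the user's mean."""
    history = baseline[user]
    mu, sigma = mean(history), stdev(history)
    return todays_count > mu + threshold * sigma

print(is_anomalous("alice", 4))    # a typical day for alice
print(is_anomalous("alice", 40))   # a sudden spike worth investigating
```

None of this is possible if the underlying activity records are discarded the moment no policy violation is found.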

In the event sensitive data is actually present and the DLP solution simply failed to detect it, dumping that activity data is a terrible mistake. Discarding these non-policy violations erases all forensic hope of identifying that data loss and – this is key – the possibility of correcting it by putting new, more accurate and effective policies in place.

Non-policy violations will one day prove to be the most important data of all. Data that has been stopped, inspected and determined to be non-sensitive has the greatest potential negative impact on an organization. How is that? Hidden among millions of non-policy violations are also the false negatives – the proverbial needles that were not detected in the haystack. By discarding non-policy violations, all false negatives are also thrown out, like the baby with the bathwater.

THE FALSE NEGATIVES ARE THE DATA BREACHES

DLP false negatives represent activity where sensitive data is actually present but goes undetected by the DLP solution – and leaks. Each DLP false negative is actually an incident of data loss. Simply put: the false negatives are the data breaches. In cases of data breaches that occur while comprehensive DLP technologies are in place, it may be that the DLP solution did in fact inspect that activity, but simply did not detect any sensitive data.

The failure to detect sensitive data may be the result of poor policy configuration or it may be the result of data manipulation and obfuscation by a malicious actor. But it doesn’t really matter. The fact is, the event was a false negative and resulted in a data breach.

EXAMPLE OF DATA EGRESS EVENT

Let’s summarize the real impact of the DLP challenges by walking through a very simple scenario, along with each of the four DLP truths.

  1. Content-Focused Data Detection – User manipulates sensitive data to hide it. Since DLP is focused on content inspection, user avoids DLP detection.
  2. Inspects Only On Data Egress Events – DLP is designed only to inspect for sensitive data on data egress, so user’s pre-egress activity – and sensitive data manipulation – is not detected.
  3. Requires Known Policies – No DLP policy could have detected this activity because DLP simply has no visibility into activity prior to data egress.
  4. Logs Only Policy Violations – At the point of attempted egress of this data, since nothing sensitive was detected, all relevant event information is discarded by the DLP solution, leaving no forensic information that might lead to future detection – and policy correction.

VISIBILITY IS REQUIRED FOR DATA PROTECTION EFFECTIVENESS

We propose the use of data activity visibility tools as complements to traditional data security technologies, as a means for data protection strategy improvement. Data activity tracking tools can provide instant visibility into all enterprise data use and activity, such as:

  • All data flowing through an organization, tracking data as it’s created or enters the enterprise (regardless of data format)
  • Identifying both creators and consumers
  • Monitoring both internal and external data movement
  • Following all meaningful data changes and iterations
  • Detecting and logging the presence of sensitive data (PII, PCI, HIPAA, IP, etc.)
  • Tracking all data egress (regardless of the presence of sensitive data) and movement to and from SaaS and cloud applications
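As an illustration, the record a visibility tool might keep for every event – egress or not – could look something like the following sketch. The field names, structure and sample values are our own assumptions, not any specific product's schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class DataActivityEvent:
    """One tracked data event; retained even when no policy is violated."""
    timestamp: datetime
    user: str
    action: str                # e.g. "create", "modify", "copy", "upload"
    source: str                # originating path, share, or application
    destination: Optional[str] # set only for movement or egress events
    sensitive_labels: List[str] = field(default_factory=list)  # e.g. ["PII"]

# Hypothetical example: a user copying an HR file to a removable drive.
event = DataActivityEvent(
    timestamp=datetime.now(timezone.utc),
    user="jdoe",
    action="copy",
    source=r"\\fileserver\hr\salaries.xlsx",
    destination=r"E:\salaries.xlsx",
    sensitive_labels=["PII"],
)
print(event.action, event.sensitive_labels)
```

Because every event is kept, records like this can later be filtered by user, source, destination or label to reconstruct the full lifecycle of a file.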

Compiling all this data allows for instant, real-time reports based on any data element: data type, format, content, origination, destination, user, cloud app, email activity, web activity, location and the complete data lifecycle.

HOW DATA ACTIVITY VISIBILITY WILL CHANGE DLP STRATEGY

This data activity visibility provides critical insight for organizations with mature data protection programs as well as those considering new or evolving programs.

Mature, Existing Data Protection Programs

Even though they may not know it, organizations with long-standing, mature data protection programs likely suffer from the inability to validate the effectiveness of their program beyond the occasional blocked incident. But the ability to show data egress attempts being blocked really provides no validation that the program is working effectively. Just because the solution blocked 20 data egress events today doesn’t mean it didn’t miss 100 others (non-policy violations that were also false negatives).

Trying to validate the effectiveness of a DLP solution using the output of that same DLP solution is akin to performing a ballot recount using the original automated process. The outcome is unlikely to be any different – or of any value. Using a different counting method – like a hand count – is the only sure way of validating the original effort.

In the case of DLP, it’s even worse because the DLP solution is working with a data set that represents a tiny fraction of the total data activity – like a hand count with only 1% of precincts reporting. So, out of 1000 total data activity events, DLP only shows the small percentage that happened to be data egress attempts (say 1% of the 1000 = 10) and then the even smaller percentage of events in which sensitive data was detected and resulted in a policy violation (say 10% of the 10 = 1 of the original 1000, or .1%). That’s very poor overall visibility for a solution designed to protect data.
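The arithmetic behind that visibility estimate can be spelled out directly. The 1% and 10% figures are the illustrative assumptions from the scenario above, not measured values:

```python
# Illustrative assumptions: 1% of all data activity is egress,
# and 10% of egress events trip a DLP policy.
total_events = 1000
egress_events = total_events // 100       # the 10 events DLP even inspects
policy_violations = egress_events // 10   # the 1 event DLP actually logs

visibility = policy_violations / total_events
print(egress_events, policy_violations, f"{visibility:.1%}")  # 10 1 0.1%
```

Under these assumptions, the DLP console surfaces one event in a thousand – 0.1% of total data activity.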

To make matters worse, this DLP data set represents the data you already knew to look for (and then turned into DLP policies). You’re only seeing what you already knew and don’t learn anything useful to actually improve data security.

By utilizing data visibility tools (DVT) with DLP, an organization can absolutely validate the effectiveness of a data protection program in very objective terms. The service reports back all data egress events along with all other data activity – not only those in which DLP detected sensitive data and which were blocked. Comparing the output of DLP to data visibility tools, the value of enhanced data visibility becomes quickly apparent, as shown in the table below.

[Table: comparison of DLP output versus data visibility tool output]

With DVT, all events (including non-egress) can be further broken down into dozens of reports showing things such as:

  • All data movement by source or destination
  • All data movement by file, type, content
  • All files downloaded from sensitive data repositories
  • All files containing sensitive data or keywords
  • All files originating in network shares moved to external websites/cloud
  • All data flows to cloud storage, personal email or social media
  • All files containing regulated data (PII, PHI) subject to PCI, GDPR, CCPA, PIPEDA or NIST requirements
  • All files containing intellectual property, based on content and, more importantly, context
  • All cross-domain file sharing (e.g. Customer A data shared with Customer B)
  • Specific user file movement

Using this visibility, organizations with mature data protection programs can identify holes in their strategy, even down to very specific data risks, like a user sending files containing PII to another user, or a group of users downloading files containing sensitive data from a particular network share. With this detailed information, new required technologies can be identified and data protection policies can be created to prevent this type of activity in the future.

In the final analysis, new or improved policy requirements identified through the findings of data visibility tools are grounded in fact, so they are especially accurate and represent the unique needs of the organization.

This is precisely how you use visibility to improve your data protection posture.

New or Evolving Data Protection Programs

For organizations considering the creation of new data protection programs, data visibility tools provide factual evidence upon which to 1) base proposed data security strategies, 2) get executive buy-in and 3) consider the right technologies to address those data risk facts. For lack of real data indicating a problem – like an uncovered data breach – many organizations simply choose not to act. If you can’t prove the need for data protection technologies, why spend the money?

Consider this: If an organization knew – factually – that critical intellectual property was going to be stolen and sold to a competitor, what would they be willing to spend to prevent it? If the immediate financial impact of this IP loss would be $10 million, would it make sense to spend $250k to prevent it?

Conversely, if the organization knew – factually – that their critical intellectual property was well-protected and not threatened, could they put off a $250k technology spend? Maybe. Maybe not. But isn’t it better to have the choice?

The fact is most companies are blind to the state of their sensitive data, or any data for that matter. They don’t know whether data is under active attack or sitting snugly in the protected network. Without data visibility tools or services specifically for this purpose, it remains a guessing game.

Data visibility tools specifically designed to uncover all data activity provide visibility for new data protection programs that will prove factually to executive management:

  • specific needs, in order to get support for data security strategies
  • the true data protection posture, so plans can be put in place for the best response, with meaningful policies, protections and controls
  • the technologies required to address threats to data (note that results may show many areas of need, or just as easily show that existing technologies provide adequate coverage)
  • what data is actually at risk, to make a suitable determination on how to appropriately respond to that risk

DATA VISIBILITY TOOLS AND HOW TO LEVERAGE THEM

DLP Experts provides data visibility tools and services – including no-cost trials – to qualifying organizations in order to prove the concept across a portion of the enterprise. After the technology has been proven, the tools can be rolled out to a set number of users in the organization for a term of between 1 and 12 months. Service costs are based on user counts over a monthly term and are designed to have a lower impact on organizational budgets than a capital purchase. After completion of the term, the service can be continued as desired.

Data visibility services include:

  • Project Design and Scoping
  • Technology Deployment
  • Data Activity Collection
  • Data Analysis and Correlation
  • Regular Reports – Weekly, and Immediate for Critical Events
  • Comprehensive Findings Report

Data visibility tools come in a number of forms, from lightweight-agent SaaS-based solutions to on-premises network-based solutions. All provide unique visibility and advantages, and when paired with existing data protection technologies, like DLP, provide increased visibility that has the power to change data protection strategies.

THE FINAL ANALYSIS

Whether your organization has a mature data protection program in place or you are just beginning to research and build out your data protection strategies, additional visibility will help you effectively protect your data and improve your data security posture. Visibility can come from a number of different sources, but it needs to be collected, reviewed and correlated in order to garner meaningful benefit from it.

Contact us for more information and a service demonstration at [email protected].
