
Enterprises Prioritize Data Classification in 2023

Andy Singer

CMO

April 14, 2023

The Enterprise Strategy Group (ESG) 2023 Cloud Data Security Survey polled over 350 IT, cybersecurity, and DevOps professionals responsible for cloud data security at mid-size and enterprise companies. The results show that security teams are focused on solving data security challenges in the coming year.

While the full report has several key takeaways, two regarding data classification stand out. First, a whopping 92% of respondents said they were confident in their ability to classify data, yet a third suffered data loss due to misclassification in the last 12 months. Second, 70% of respondents prefer reading 100% of their data to ensure correct classification.

Data Visibility for Security is Essential

92% are confident in their ability to classify data, yet 1/3 have suffered data loss due to misclassification in the last 12 months

How many times have we seen businesses admit they didn't entirely know the types and amount of data lost in a breach, yet they believed it was all handled according to company policies? While misconfigurations often grab the headlines, the data from the ESG survey indicates data misclassification may also play a part.

Factors contributing to data misclassification include policy complexity, the variable nature of data, and a lack of security visibility. According to Gartner, "The majority of organizations have complex classification schemes and data-handling documents that are difficult to communicate and implement." Complexity is often the root cause of security issues. In the case of data misclassification, complexity can lead to data being deployed with weak security controls or none at all.

Data is also highly variable. It changes, hides in the shadows, and takes on many unforeseen combinations. An API might be pulling back more data types than intended. An application might unknowingly write sensitive information to log files for debugging. Multiple personally identifiable data attributes might unintentionally reside in the same line or row of data.

Simplifying policies, improving classification, and tightening how applications handle and store data are all necessary for improving processes. Companies should also focus on providing security teams with complete and continuous visibility into data stores, along with highly accurate classification capabilities that function at scale and don't break the budget. With visibility, security teams can regularly validate that documented classifications match the data inside the target data store, discover sensitive data repositories hiding in the shadows, spot high-risk data combinations, eliminate data misclassifications, and improve data security posture.
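To make the idea of a high-risk data combination concrete, here is a minimal Python sketch that flags any line containing more than one type of personally identifiable data. The pattern names, the regular expressions, and the `high_risk_lines` helper are all assumptions made for illustration; they are not taken from the ESG report or from any vendor's classifier, and production classification needs far more validation than a few regular expressions.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only. Real
# classifiers add validation (checksums, context, locale-specific formats)
# to keep false positives down.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "us_phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def pii_types_in(line):
    """Return the set of PII type names whose pattern matches the line."""
    return {name for name, pattern in PII_PATTERNS.items() if pattern.search(line)}


def high_risk_lines(lines, threshold=2):
    """Return (line_number, sorted PII types) for lines holding several PII types."""
    findings = []
    for number, line in enumerate(lines, start=1):
        found = pii_types_in(line)
        if len(found) >= threshold:
            findings.append((number, sorted(found)))
    return findings


sample_log = [
    "2023-04-14 INFO request served in 12ms",
    "2023-04-14 DEBUG user=jane@example.com ssn=123-45-6789",
    "2023-04-14 DEBUG callback=555-867-5309",
]
print(high_risk_lines(sample_log))  # [(2, ['email', 'us_ssn'])]
```

Only the second line is flagged, because it is the only one where two identifiers sit side by side. The third line contains a single identifier and falls below the assumed threshold of two.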

Sampling Data Creates Blind Spots Security Can't Afford

70% of respondents want to read 100% of every file, object, database, and other cloud data

The more visibility and control security teams have, the better they can prevent data loss. Unsurprisingly, 70% of respondents said they prefer to scan 100% of the data during classification rather than using statistical sampling, most likely because experience has taught them sampling's drawbacks.

While sampling methods work well for structured data, they don't work for unstructured data. We discuss this in detail in our blog post Sampling Unstructured Data Brings Risk of Silent Failure. Relying on sampling methods to scan unstructured data means missing pieces that may contain sensitive information. For example, there are often significant differences between the kinds of information found at the start of a PDF or a log file and those found in the middle or at the end. These issues contribute to data misclassifications and gaps in security posture.
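The blind spot is easy to reproduce. The Python sketch below builds a synthetic log with a single sensitive entry deep in the file, then compares a scan of the first 10% of lines against a full scan. The log contents, the `scan_head_sample` helper, the 10% fraction, and the single illustrative pattern are assumptions chosen for the demonstration. Real sampling strategies may be random or stratified rather than head-only, but any strategy that skips lines can miss an entry like this one.

```python
import re

# Illustrative pattern only; it stands in for any sensitive-data detector.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def build_log(total_lines=1000, sensitive_line=750):
    """Build a synthetic log with a single sensitive entry deep in the file."""
    lines = [f"INFO request {i} ok" for i in range(total_lines)]
    lines[sensitive_line] = "DEBUG payload ssn=123-45-6789"
    return lines


def scan(lines):
    """Return indexes of lines that match the sensitive-data pattern."""
    return [i for i, line in enumerate(lines) if SSN_PATTERN.search(line)]


def scan_head_sample(lines, fraction=0.1):
    """Scan only the first `fraction` of the lines, as a head-of-file sample would."""
    cutoff = int(len(lines) * fraction)
    return scan(lines[:cutoff])


log = build_log()
print("head sample findings:", scan_head_sample(log))  # []
print("full scan findings:", scan(log))                # [750]
```

The sampled scan reports nothing and raises no error, so the file would be classified as clean even though it holds sensitive data.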

Open Raven's cloud-native data classification platform was designed with both structured and unstructured data in mind, cleanly solving the issues in classifying unstructured data without trading accuracy for throughput. Open Raven performs accurate, cost-effective, highly customizable, and complete data classification at petabyte scale.

The ESG report surveyed respondents on other topics, including public cloud usage, top cloud security challenges, 2023 spending priorities, and more. Download the report and see how your organization's ability to discover and classify data compares. If you want to learn more about the importance of data classification in modern cloud infrastructures, read our ebook.
