Discover and Classify Data

Find and Act on Exposed Data

Chief Corvus Officer
July 7, 2021

Old instances of Oracle. Developer secrets publicly accessible in a bucket. Left-behind customer data. How do you find these before they become an incident?

Locating all of your data

Knowing what native data services you’re running is typically straightforward. The hard part is identifying which data services are running on generic compute such as AWS EC2. Often this is where that old, misconfigured Oracle server is waiting to die, or where a MySQL test instance was mistakenly left running.
Open Raven built DMAP, a machine-learning-based fingerprinting service, so that all data services can be seen easily and unexpected results can be dealt with quickly. Spot it on the map and click through to the AWS Console for quick fixes.

Three EC2s with SharePoint, Oracle, and MySQL logos on top.
Automatically locate and easily identify data services – even on generic compute (e.g., AWS EC2).

Determining what data you have

Knowing what data sits inside all of your data services requires classification. Far too often, data classification has been manual, costly or simply not possible for large amounts of data. With Open Raven, you can classify massive data sets to understand what data you have. Start by picking the data classes of interest, either individually or as a data collection. From there, set up a scan that optimizes for depth (“do everything”), completion time (“sample X%”) or other factors. Exclude files, search for specific file types only… have it your way. And get the full picture.

Data Classes table with options to drill down into financial, healthcare, personal, file formats, developer secrets, or custom classes.
Select from a wide range of existing data classes and collections, or create your own.
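
To make the depth-versus-time trade-off concrete, here is a minimal sketch of pattern-based classification with sampling. It is not Open Raven's implementation: the class names (`payment_card`, `developer_secret`), the regular expressions, and the `scan` function with its `sample_fraction` parameter are all assumptions chosen for illustration.

```python
from __future__ import annotations

import random
import re

# Hypothetical data classes; names and patterns are illustrative only.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")          # candidate card numbers
AWS_KEY_RE = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")  # AWS access key ID shape


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(c) for c in candidate if c.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:       # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return len(digits) >= 13 and checksum % 10 == 0


def classify(text: str) -> set[str]:
    """Return the set of (hypothetical) data classes found in `text`."""
    found = set()
    if AWS_KEY_RE.search(text):
        found.add("developer_secret")
    if any(luhn_valid(m.group()) for m in CARD_RE.finditer(text)):
        found.add("payment_card")
    return found


def scan(files: dict[str, str], sample_fraction: float = 1.0,
         seed: int = 0) -> dict[str, set[str]]:
    """Classify a random sample of files; 1.0 means 'do everything'."""
    names = sorted(files)
    if not names:
        return {}
    k = max(1, round(len(names) * sample_fraction))
    chosen = random.Random(seed).sample(names, k)
    results = {}
    for name in chosen:
        classes = classify(files[name])
        if classes:
            results[name] = classes
    return results
```

Calling `scan(files)` inspects every file, while `scan(files, sample_fraction=0.1)` trades coverage for completion time by checking roughly a tenth of them. The Luhn check is one common way to cut false positives on card-like digit runs.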

Spotting radioactive data

Data classification is sometimes all you need to pinpoint a problem. An all-too-common scenario is customer data left behind to “go radioactive” with risk when it was supposed to be removed long ago. Or payment card data found in places that accidentally expand the scope of your PCI audit. Radioactive data can be readily identified from Open Raven’s real-time map, or you can create a rules-based policy for automated monitoring of future problems.
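
As a rough illustration of what a rules-based check for radioactive data might look like, the sketch below flags two of the scenarios just described. Everything in it is hypothetical: the `Finding` fields, the class names, and the 365-day retention default are assumptions, not Open Raven's policy format.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Finding:
    """One classified location (all field names are hypothetical)."""
    location: str
    data_classes: frozenset[str]
    last_modified: date
    in_pci_scope: bool


def radioactive(findings: list[Finding], today: date,
                retention_days: int = 365) -> list[tuple[str, str]]:
    """Return (location, reason) pairs for data that should not be there."""
    cutoff = today - timedelta(days=retention_days)
    flagged = []
    for f in findings:
        # Customer data that has sat untouched past the retention window.
        if "customer_data" in f.data_classes and f.last_modified < cutoff:
            flagged.append((f.location, "customer data past retention"))
        # Card data in a location not declared as part of the PCI scope.
        if "payment_card" in f.data_classes and not f.in_pci_scope:
            flagged.append((f.location, "payment card data outside PCI scope"))
    return flagged
```

Run on a schedule against fresh classification results, a check like this turns a one-time discovery into ongoing monitoring.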

Identifying data exposure

Serious data leaks can be hard to detect. Finding a leak requires detailed knowledge of both your infrastructure and your data, followed by analysis that detects a mismatch between the type of data and the safeguards in place. Open Raven builds the necessary context through location, inventory and classification, then evaluates the results against a default or custom policy to identify data exposure problems.

Policy Violations showing open issues in Data Security Basics, Config Standards, New User Policy, Norwegian GDPR Checkup, and more.
From publicly accessible developer secrets to personal data exposed in log files, Open Raven flags what infrastructure-only solutions do not.
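
The idea of detecting a mismatch between the type of data and the safeguards in place can be sketched as a small policy evaluator. This is an illustrative model only: the `Asset` and `Rule` types, their fields, and the rule names are assumptions and do not reflect Open Raven's actual policy engine.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Inventory plus classification context for one resource (hypothetical)."""
    name: str
    data_classes: frozenset[str]
    publicly_accessible: bool
    encrypted_at_rest: bool


@dataclass(frozen=True)
class Rule:
    """Safeguards required wherever any of `sensitive_classes` is present."""
    name: str
    sensitive_classes: frozenset[str]
    forbid_public: bool = True
    require_encryption: bool = False


def evaluate(assets: list[Asset], rules: list[Rule]) -> list[tuple[str, str]]:
    """Return (asset name, rule name) for every data/safeguard mismatch."""
    violations = []
    for asset in assets:
        for rule in rules:
            if not (asset.data_classes & rule.sensitive_classes):
                continue  # rule does not apply to the data found here
            exposed = rule.forbid_public and asset.publicly_accessible
            unencrypted = rule.require_encryption and not asset.encrypted_at_rest
            if exposed or unencrypted:
                violations.append((asset.name, rule.name))
    return violations
```

Note that an infrastructure-only check could report that a bucket is public, but only the combination with classification results says whether that matters.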

Automation and action

Open Raven fits your existing response workflow through built-in integrations for Slack, GSuite, PagerDuty and more. Need a custom integration to make things work perfectly? Our firehose API and webhook features ensure we fit the way you already get things done. Automating future data exposure detection and response is simple. Create a schedule inside Open Raven that specifies the area to monitor, the policy to apply, and how often to check for problems, and you’re good to go.
