In an era of general doom and gloom, I bring you some good news to hopefully brighten up your week. The Open Raven Platform is another week closer to general availability (GA) and new functionality has been added into this week's release (0.8). Each Tuesday we'll be pushing features, bug fixes and tweaks and we plan to be code complete on November 3rd, with GA on November 17th.
I'll be posting regular blog updates until we go GA, but I invite you to be "in da club" and try out these features now. Just go to your cluster URL, e.g. acmecorp.openraven.net/dev, and you will see the week's feature flags. Toggle them on and off and have a play. It's that simple. If you want a guided tour or help, just contact us or mail us (firstname.lastname@example.org) and we’ll hop on a Zoom together. And yes, I did see the news this week, and no, it won't be that kind of Zoom. Next week, I’m threatening to include a video narrating some features, so get ready (or beware).
The first thing you will be able to play with is Data Classification. Cheaper, faster and better than AWS Macie. There, I said it. Sorry, AWS marketing, but it's true. Open Raven deploys as an AWS Lambda function, so “it scales”. It is better because you can customize data collections and data classes, the data matching is much better, and you can actually validate that the data is real using the validation API rather than just matching a pattern. Think about finding an AWS key and using the validation function to tell you what account it was for. Oh yeah!
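To make the "match versus validate" idea concrete, here is a minimal Python sketch. It is not Open Raven's validation API. As a stand-in, it uses a well-known check, the Luhn checksum for payment card numbers: a regex finds candidates, then the checksum decides whether each one could be real. The pattern and the function names (`CARD_CANDIDATE`, `luhn_valid`, `find_cards`) are assumptions made up for this illustration.

```python
import re

# Hypothetical pattern for illustration: 13 to 19 digits, optionally
# separated by single spaces or hyphens. Pattern matching alone will also
# hit order numbers, IDs, and other digit runs.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text: str) -> list[tuple[str, bool]]:
    """Return (digits, passed_validation) for every pattern match in text."""
    results = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        results.append((digits, luhn_valid(digits)))
    return results


if __name__ == "__main__":
    sample = "card 4111 1111 1111 1111 and 4111-1111-1111-1112."
    for digits, ok in find_cards(sample):
        print(digits, "validated" if ok else "pattern match only")
```

Both numbers in the sample match the pattern, but only the first passes the checksum. The same two-step shape applies to other data classes: the pattern finds candidates and the validation step weeds out the false positives.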
In this week's release, we have a small number of data classes (names, addresses, SSNs) in the general PII category, and we will ship a full suite of PII, PHI, Financial and Credentials classes for US and UK geos at GA. Healthcare and some credentials are already in QA. Support for text (including JSON, etc.) and office files is in the build now. Compressed files and all the other formats that Macie supports will arrive in the coming weeks, with Apache Parquet and Avro files as a very fast follow. When we have Parquet, you will be able to look at big data from Spark, Amazon EMR, etc., and if you want to investigate Amazon AppFlow, you can pipe Slack, SFDC, logs, etc. into an S3 bucket and classify the data from your SaaS apps or app log files. Cool, eh? Yes, it is. We even know how to open images, do OCR with Tesseract and classify the contents. That's a later build, but we'll do it. No more dumps of scanned credit cards, health records or employee files on your watch!
Talking of data, we have the boffins (British phrase for propeller heads; the US equivalent is geeks) working on a tool to generate very large data sets so you can test Open Raven. I’ll update you next week on progress, and then if you are interested in getting your hands on a load of fake (but very realistic) data, let me know. Why there has never been an open-source project to create realistic fake data to test with, versus trying to anonymize customer records, is beyond me. Another winter project, I guess.
Our engineering team (who really are amazing) have also added a feature that some people said wasn’t possible. You miss 100% of the shots on goal you never take, right? Now, when we initially analyze the buckets to see which files match the criteria we want to classify, we index every single file into an Elastic index. This means you can now search for files across your S3 fleet just like you would from your laptop's file finder. Search for duplicate files, files of a certain type or name, and even files of a specific size. And here is the cool part: it supports up to 1 trillion files! You can also see how much data you have, how much of it is in each file type, where it is located, etc. We'll wrap that up into a pretty report or visualization for you as soon as we are code complete, but it's super cool and very useful. And as if that isn't enough, I have added a teaser of the Live 3D maps that will ship with Platform 1.0. Give them a look. What started as a vanity project has become truly useful. You can see the security groups, peering relationships and account connectivity, as well as security policy violations, live as they happen. Instantly see all external connections to your AWS. SimCity for data security. Reserve some space on your SOC wall now!
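For the curious, here is a rough Python sketch of what the file searches mentioned above (by type, name, size and duplicates) could look like as Elasticsearch query bodies. The field names (`extension`, `name`, `size_bytes`, `content_hash`) are assumptions for illustration and not the actual Open Raven index schema. Finding duplicates by grouping on a content hash is also my assumption about how one might do it. The sketch only builds the JSON bodies, which you would then send to your own index.

```python
import json


def build_file_query(extension=None, name_pattern=None, min_size_bytes=None):
    """Build a filter-only search body; every field name here is hypothetical."""
    filters = []
    if extension is not None:
        filters.append({"term": {"extension": extension}})
    if name_pattern is not None:
        filters.append({"wildcard": {"name": {"value": name_pattern}}})
    if min_size_bytes is not None:
        filters.append({"range": {"size_bytes": {"gte": min_size_bytes}}})
    return {"query": {"bool": {"filter": filters}}}


def build_duplicate_query(hash_field="content_hash"):
    """Group files by an assumed hash field, keeping groups of two or more."""
    return {
        "size": 0,  # return only the aggregation buckets, not the documents
        "aggs": {
            "duplicates": {
                "terms": {"field": hash_field, "min_doc_count": 2}
            }
        },
    }


if __name__ == "__main__":
    # CSV files whose name starts with "export" and that are at least 1 GiB
    body = build_file_query("csv", "export*", 1024 ** 3)
    print(json.dumps(body, indent=2))
    print(json.dumps(build_duplicate_query(), indent=2))
```

Putting the conditions in a `bool` filter rather than a scored query is a deliberate choice in this sketch: file lookups like these are yes/no matches, so relevance scoring adds nothing.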
Finally, based on feedback from our users, we are moving to a pure SaaS model where Open Raven hosts the software for you. We will still support the current VPC model for customers that really need it, but this will make deployment and management easier for everyone. If you want us to convert your current deployment to the new SaaS hosting, just let us know. All new trials from next Tuesday onward will default to SaaS. And if you’re interested in bugs, nits and minutiae (aka engineering reality), we've been working on them as well. See below, but just know we've got it covered, so get back to looking at the very pretty maps!