Sensitive Data Isn’t The Crown Jewels
I need to get something off my chest. It’s time for our beloved infosec community to retire the cliches permeating every slide deck about sensitive data, especially the notion that “sensitive data is the crown jewels of a company”. Please don’t misunderstand me: I can appreciate how this analogy captures the value of sensitive data, since it’s often an immensely expensive asset that requires extra security measures and attention. However, the last thing we want is to keep it locked away behind bomb-proof glass with almost zero accessibility. Unlike diamonds, sensitive data isn’t forever. In fact, it’s hardly ever so well-defined, and it is put to use almost daily!
If Sensitive Data isn’t the Crown Jewels, then what is it?
Let’s take a closer look at the nature of sensitive enterprise data as well as some of its most common characteristics. This will help us make sense of where the crown jewel analogy doesn’t fit and why we must consider a different approach to its protection:
Unlike crown jewels, you use sensitive data regularly and often. Sensitive data doesn’t spend most of its time in an inaccessible bomb-proof case. Many people need regular access to it. In fact, unlike jewels, its use directly ties into its value. The moment an enterprise no longer has use for sensitive data is the moment it stops being a valuable resource and becomes a liability instead.
Sensitive data is a moving target. Far from the static image of a crown jewel stowed away in an impenetrable safe, sensitive data is often ingested through multiple pipelines into an organization’s data stores (e.g. data warehouses, data lakes, databases). It isn’t uncommon for it to pop up in a new location due to data transformations, or to end up in the wrong place or format due to human error. Different types of sensitive data may also be combined for a specific project, or left unattended after becoming obsolete because data teams are unsure how to defuse them.
Boo! It’s right behind you. Seriously, sensitive data can be found where it’s least expected and infosec teams may never know it. Part of the challenge of protecting sensitive data is being able to track it down and control who can access it.
Managing access to sensitive data is rarely straightforward. Unlike the crown jewels, there isn’t a shortlist of people who should be able to access sensitive data. This is why managing access to sensitive data can be challenging, to say the least. Inflated permissions are pretty typical when multiple people require some sort of access to at least a portion of an enterprise’s sensitive data.
There isn’t one universal definition, prioritization or grouping of sensitive data. There is no carat system in place for data. These differ from one organization to the next according to industry, regulations, maturity and the types of data stored.
What’s the Takeaway?
Realizing what sensitive data is, and what it’s not, helps us better understand how to optimize its protection. We can’t stow sensitive data out of our own reach in an impenetrable safe with almost zero access, because we want cross-organizational teams to be able to leverage it for innovation and business growth. However, this isn’t to say that we want to extend access to sensitive data beyond what’s absolutely necessary. Granting access adds inherent risk to enterprises, and we must find ways to mitigate that risk as much as possible.
While this is by no means comprehensive, below is an attempt at an agnostic list, based on the characteristics discussed above, of what infosec teams must consider when optimizing sensitive data protection.
You need continuous, up-to-date knowledge of exactly where your sensitive data sits and what types of sensitive data you have. The more granular the list, the better. Details can only contribute to better planning and informed decisions. For example, when handling a database with 100 tables of which only one contains sensitive data, you can focus on protecting that one table. If that table has 100 columns, and only 3 contain sensitive data, it makes sense to make those the focus of your attention and efforts. Congrats! You just focused on the fraction of data that carries the most risk. Not only is this a far more efficient approach to risk management, but you can now also prioritize monitoring access to that data and creating optimized access controls.
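To make the narrowing-down idea concrete, here is a minimal Python sketch that flags likely sensitive columns in a schema inventory by matching column names. Everything in it is an assumption made for illustration: the patterns, the schema and the function name are hypothetical, and name matching is only a rough first pass compared with inspecting the data itself.

```python
import re

# Hypothetical name patterns that hint at sensitive content; tune per organization.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "ssn": re.compile(r"ssn|social_security", re.IGNORECASE),
    "phone": re.compile(r"phone", re.IGNORECASE),
}


def find_sensitive_columns(schema):
    """Return {table: {column: data_type}} for columns whose names match a pattern."""
    findings = {}
    for table, columns in schema.items():
        for column in columns:
            for data_type, pattern in SENSITIVE_PATTERNS.items():
                if pattern.search(column):
                    findings.setdefault(table, {})[column] = data_type
                    break  # first matching type wins for this column
    return findings


# Hypothetical inventory: table name -> list of column names.
schema = {
    "orders": ["order_id", "sku", "quantity"],
    "customers": ["customer_id", "email_address", "phone_number", "signup_date"],
}

print(find_sensitive_columns(schema))
```

With this sample inventory, only two columns of the customers table are flagged, which is the small fraction worth monitoring first.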
Expect the unexpected with “data-based access controls”. Access controls shouldn’t only be placed where sensitive data is known to be kept. Enterprises must also leverage access controls to detect when certain types of data are accessed, wherever they happen to be. For example, if someone accesses data that contains PII, the enterprise should be able to receive an alert, mask that data or take other action.
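As a rough illustration of a data-based control, the Python sketch below inspects query results rather than table names: if any value looks like an email address, it masks the value and records an alert. The regular expression, the mask string and the function name are assumptions chosen for the example, and a real deployment would cover many more data types.

```python
import re

# Simplified email detector, used here as a stand-in for a PII classifier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def enforce_on_results(user, rows, alerts):
    """Mask email addresses in result rows; append one alert if any were found."""
    masked_rows = []
    found = False
    for row in rows:
        masked = {}
        for key, value in row.items():
            if isinstance(value, str) and EMAIL_RE.search(value):
                found = True
                value = EMAIL_RE.sub("***@***", value)
            masked[key] = value
        masked_rows.append(masked)
    if found:
        alerts.append(f"{user} accessed data containing email addresses")
    return masked_rows


# Hypothetical query result and alert sink.
alerts = []
rows = [{"id": 1, "contact": "ana@example.com"}, {"id": 2, "contact": "n/a"}]
print(enforce_on_results("analyst_1", rows, alerts))
print(alerts)
```

The point of the design is that the check follows the data: it fires even if the email addresses turn up in a table nobody had labeled as sensitive.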
A good way to reduce risk in access to data is to examine the permissions granted throughout an enterprise’s data stores (we are well aware that this is much easier said than done). We recommend comparing actual access to data with user and group privileges to discover whether users have access to data they don’t use. Permissions in these instances should either be removed or be restricted via the implementation of “break glass” workflows (where a user must specify a reason or obtain manager approval in order to access the data).
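The comparison described above boils down to a set difference between what each user is allowed to read and what they actually read. Here is a minimal Python sketch; the input shapes (user mapped to a set of table names) and the sample users are assumptions, and in practice both sides would be pulled from the data store’s grants and access logs.

```python
def unused_grants(granted, accessed):
    """Return {user: sorted list of tables the user may read but never did}.

    granted and accessed both map a user name to a set of table names.
    """
    report = {}
    for user, tables in granted.items():
        unused = tables - accessed.get(user, set())
        if unused:
            report[user] = sorted(unused)
    return report


# Hypothetical grants and observed access over some review period.
granted = {
    "dana": {"customers", "orders", "payroll"},
    "lee": {"orders"},
}
accessed = {
    "dana": {"orders"},
    "lee": {"orders"},
}

print(unused_grants(granted, accessed))
# {'dana': ['customers', 'payroll']}
```

Each entry in the report is a candidate for removal or for a “break glass” workflow.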
Finding the right balance between providing access to data, which is necessary for keeping a business functioning and growing, and giving too much access away is hard. Especially when you can’t spend unlimited resources on finding that balance, and when each data store technology offers different (and often complicated) access controls.
Nonetheless, it’s still incumbent on infosec teams to keep data access as granular as possible. For example: if you have a BigQuery data warehouse, you might consider setting access per table, not per dataset.
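For the BigQuery example, one way to grant at the table level is through BigQuery’s SQL GRANT statement. The Python sketch below only renders such a statement as text. It assumes the `GRANT ... ON TABLE ... TO ...` form and the `roles/bigquery.dataViewer` role, so check both against the current BigQuery documentation before use. The project, dataset, table and user names are hypothetical.

```python
def table_grant_statement(project, dataset, table, members,
                          role="roles/bigquery.dataViewer"):
    """Render a table-level GRANT statement (assumed BigQuery DCL form)."""
    member_list = ", ".join(f'"{member}"' for member in members)
    return (
        f"GRANT `{role}`\n"
        f"ON TABLE `{project}`.{dataset}.{table}\n"
        f"TO {member_list}"
    )


# Hypothetical names: grant read access on one table, not the whole dataset.
print(table_grant_statement(
    "my-project", "sales", "customers", ["user:analyst@example.com"]
))
```

Scoping the grant to `sales.customers` leaves the rest of the `sales` dataset closed to that user.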
Data stores must be secure, with both built-in and external security controls. If you’re looking at securing cloud data warehouses, I suggest you read our “Securing Snowflake DB” and “Securing AWS Redshift” guides.
We Can Help!
We love to solve problems and invite you to discuss your infosec challenges with us.
We’re offering an innovative approach to protecting your data with a particular focus on sensitive data. We’re able to quickly scan your data access configuration and logs to provide a clear picture of the permission gaps in your organization, as well as provide fine-grained, transparent and easy to apply access controls. From there, we continuously monitor what goes on in your data stores, as we know that data, permissions and data access are all moving targets.
We also provide on-the-fly classification of sensitive data in your data stores, focusing on the data that is actually accessed. In addition to out-of-the-box data classification, you can define data types specific to your organization.
You can limit access to data at some of the finest levels of granularity in the industry (table based, column based, row based or even based on data types), and you can gain immediate security value without having to change anything in your data store, keeping your security decoupled from the data infrastructure.