Enterprise data protection implementation: why so tough?

Implementing security products, services and solutions in enterprises is intrinsically difficult, especially when coordinating several teams focused on their own respective objectives and priorities. This is made harder when you add data security to the mix.

 

Over the course of many conversations with security stakeholders, I’ve distilled the experience of implementing enterprise data protection into the following narrative about our fictional friend, Josh, which may sound familiar to many of you.

Josh is the newest employee at ACME, a global software enterprise. He’s been hired as their Privacy Officer after checking off the boxes for a background in law, a personal interest in software engineering and a passion for privacy.

Josh has been charged with executing three projects, each of which requires implementing enterprise data protection services:

  1. Implementing “right to erasure” or “right to be forgotten” requests for consumers, as stipulated by regulations like GDPR and CCPA (California Consumer Privacy Act).

  2. Allowing the right groups of data analysts to access the data they need in order to do their jobs without exposing them to sensitive information or other PII.

  3. Preventing support engineers from accessing data of individuals from certain geographic regions, as stipulated by privacy regulations and risk assessment audits.

Josh can’t wait to get started and feels confident in his ability to succeed! As far as he’s concerned, it shouldn’t be too difficult. After all, deleting data from databases wasn’t too complicated 15 years ago, when he worked in tech. Surely, new technologies have made the task much easier. He’s equally confident that his other two projects are simple, cut-and-dried initiatives.

 

So far, so good. Then, incredibly, Josh’s presentation to the executive team is a hit! He easily obtains their buy-in and, after a number of Zoom calls with the rest of the organization’s personnel, it appears that everyone is as excited as he is to get the ball rolling on these new privacy initiatives!

Now, let’s pause. Josh is one lucky Privacy Officer! It’s often quite difficult to obtain buy-in from upper management and even harder to get it from different organizational teams. Little does Josh know that his good fortune is about to run out:

 

Scattered data repositories

It doesn’t take long for Josh’s rosy vision of data protection implementation to dissolve into chaos. The first crack appears when he tries to find out where ACME’s enterprise data actually lives. He’s shocked to learn that not only can no one tell him where ALL of it is stored, many can’t even locate the company’s most important and valuable information. At best, the knowledge is spread across different departments, and Josh has no single source of truth to rely on.

 

The organization’s data infrastructure fails him as well. Several RDBMSs back different enterprise applications, some in the corporate data center and some on a Platform-as-a-Service (PaaS), namely AWS RDS. The situation is far more complicated with the other data stores. The bulk of the data sits in HDFS, as well as in S3 buckets queried by Amazon Redshift. However, some applications started out on the Snowflake data warehouse, and it’s not clear whether or when the two data warehouses will merge. Moreover, one of the companies ACME recently acquired uses BigQuery as its data warehouse.

 

Technical deletion obstacles

The struggle continues with Josh’s mission to comply with consumer privacy regulations by deleting customer data. For RDBMSs, this has to be done in a manner that prevents table locks. However, the vast majority of the enterprise’s big data repositories (data lakes and data warehouses) contain read-only data. This means that deleting enterprise data becomes the significantly more complicated project of rewriting it. Josh’s search for some other way to make consumer data irretrievable also proves fruitless.
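To make “deleting means rewriting” concrete, here is a minimal Python sketch of what erasing one consumer from a single read-only data file might involve. It assumes the file is newline-delimited JSON and that each record carries a `customer_id` field; both the format and the names (`erase_subject`, `customer_id`) are hypothetical and not taken from any system at ACME. The idea is simply that nothing is deleted in place: the surviving records are copied into a new file, which then replaces the old one.

```python
import json
import os
import tempfile


def erase_subject(path, subject_id, id_field="customer_id"):
    """Rewrite a newline-delimited JSON file without one subject's records.

    Returns the number of records that were dropped.
    """
    dropped = 0
    directory = os.path.dirname(os.path.abspath(path))
    # Write survivors to a temp file in the same directory, so the final
    # os.replace() is a rename on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as out, \
                open(path, encoding="utf-8") as src:
            for line in src:
                if not line.strip():
                    continue  # skip blank lines
                record = json.loads(line)
                if record.get(id_field) == subject_id:
                    dropped += 1  # this record belongs to the erased subject
                    continue
                out.write(json.dumps(record) + "\n")
        # Swap the rewritten file into place only after it is fully written.
        os.replace(tmp_path, path)
    except Exception:
        os.remove(tmp_path)  # leave the original file untouched on failure
        raise
    return dropped
```

Calling `erase_subject("events.jsonl", 1)` on such a file would rewrite it without customer 1’s records. Even in this toy form, the cost is visible: every file that might contain the consumer’s data has to be read and rewritten in full, and a real data lake adds columnar formats, partitions, backups and downstream copies on top of that.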

 

Decentralized knowledge

Josh can’t even seem to catch a break with his second project of limiting analyst access to sensitive data. He is unable to find a single person in his organization who can walk him through all of the structures pertaining to the enterprise’s distributed data. Even after Josh assembles relevant people from three different time zones to drill into this, none of them can provide an inventory of sensitive data within their respective domains. Each one needs to defer this task to their teams in order to access the required data across different environments. This will take a very long time, and even longer once the enterprise’s various services that often hold customer data, like support ticket systems and CRMs, are taken into account.

 

Josh attempts to run a preliminary scan to check where sensitive data is stored, but that too proves extremely difficult: ACME has collected and generated massive amounts of data, and as the data engineers explain, such a project will take far too long and cost far too much. Furthermore, due to the dynamic nature of ACME, even after finally generating a report outlining the enterprise’s sensitive data locations, disabling access on a granular level proves far more complicated than expected. Apparently, the access controls in the data warehouses are too blunt.

 

In addition to all of these woes, engineering management suddenly informs Josh that, though they sincerely appreciate the value of privacy and compliance, all of their teams are too busy at the moment. It is difficult for them to appreciate why the massive undertakings he’s asking of them are more important than their own priorities. Josh also notices that voicing concerns over the risk of a data breach or noncompliance seems to lose its effectiveness as his time at ACME goes by.

 

On the brink of tears, Josh assembles his cross-organization team once again, this time for the regional separation project. When he describes the problem he wishes to solve, he receives slightly amused stares from his audience. They explain that such a project is hardly realistic. For that to happen, they would need to separate enterprise data, a truly difficult and disruptive task, given that many of ACME’s systems rely on its data remaining in the same place and in its current format.

 

It finally dawns on our protagonist: implementing enterprise data security is a seemingly impossible task. And then the terror sets in: “… if we can’t execute these very clear data protection projects, can we really sleep well at night? Knowing what I know, it’s not a matter of IF we’re breached, it’s a matter of WHEN.”

 

Dun dun DUNNNN....

 

Josh out; Ben, Chief Scientist at Satori, in. We truly sympathize with the Joshes of the world. What gets us pumped about going into work every day is knowing that what we’re doing can relieve the stress and frustration of so many Privacy Officers around the world.

Satori’s product is designed to be non-intrusive (no agents, no scanning of large data stores), and it allows us to solve the substantial data protection challenges that Privacy Officers struggle with using simple and elegant solutions. All they need to do is deploy Satori, and granular access controls to databases, data warehouses and data lakes, defined as business rules, can be set up almost instantaneously.
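To give a feel for what “access controls defined as business rules” can mean, here is a small, purely illustrative Python sketch. It is not Satori’s API or implementation; the `Rule` class, the `apply_rule` function and the column and role names are all hypothetical. It models two of Josh’s projects as rules applied to query results: analysts see rows with the PII columns masked, and support engineers never see rows from a restricted region.

```python
from dataclasses import dataclass

MASK = "***"


@dataclass(frozen=True)
class Rule:
    """A hypothetical business rule attached to a role."""
    role: str
    masked_columns: frozenset = frozenset()   # columns whose values are hidden
    blocked_regions: frozenset = frozenset()  # regions whose rows are withheld


def apply_rule(rule, rows, region_field="region"):
    """Return the rows a role may see, with masked columns redacted."""
    visible = []
    for row in rows:
        if row.get(region_field) in rule.blocked_regions:
            continue  # withhold the whole row for restricted regions
        visible.append(
            {col: (MASK if col in rule.masked_columns else value)
             for col, value in row.items()}
        )
    return visible


# Example rules for two of Josh's projects (names are made up).
analyst = Rule("analyst", masked_columns=frozenset({"name", "email"}))
support = Rule("support_engineer", blocked_regions=frozenset({"EU"}))
```

The point of the sketch is that the data itself stays where it is and in its current format. The rule is evaluated at access time, so nothing has to be physically separated or rewritten to give each role a different view.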

 

What now?

 

Writing this blogpost

I took the liberty of combining several conversations I’ve had with people in different positions at different companies into one story. Though I refer to the protagonist of our story as a Privacy Officer, he reflects frustrations I’ve also heard from people with other titles, such as Director of Compliance, Head of Data Governance, CIO and more.