Data security is critical, and lowering data security risks is one of the top priorities for organizations. However, data engineering teams are already overwhelmed with the burden of data security projects and often feel that these projects come at the expense of their core engineering projects. In this post, we discuss the challenges that data engineering teams face when spending the majority of their time on data security projects.
Data Security Projects and the Data Engineering Team
Data engineering teams have a core set of projects and skills. These are typically in areas where they have developed expertise and understanding, and in which they enjoy working. As data engineers, even when we do “the work we came to do,” we expect to spend a certain percentage of our time on other activities. These include meetings we don’t care about, assisting other teams, and performing tasks we don’t want to do.
Data security and the reduction of security risks are essential for organizations. However, a large share of the burden of data security and privacy requirements, access control, and compliance often falls on the data engineering team. While incredibly important, this work is not one of the team’s core engineering projects, nor is it work they typically enjoy.
There are aspects of every job that we don’t like, but we see a much deeper issue here. Spending so much time on data security projects may not only leave data engineers dissatisfied with their jobs, but it also distracts them and erodes their ability to focus on their core projects.
This creates a tension between the data engineer and the company’s objectives. Data engineers know that they are working on something important, but it is not what they want to work on, and it does not develop their core abilities and skills. From the organization’s perspective, this is a critical need, but using engineers to fill it is not the best use of resources and results in lost productivity.
Examples Of Data Security Tasks Handled By Data Teams
Data engineering teams support several important data security tasks that take them away from their core engineering projects. These include, but are not limited to, data anonymization, authorizing users to access data, and managing data access audits.
Data anonymization projects are complex and time-consuming for data engineering teams. Data engineers are responsible for tasks such as data masking and creating row-level security so that they can prepare and deliver datasets that contain sensitive data to data analytics and other teams.
While this process may appear straightforward, there are a number of moving parts. The first is that data engineers must be familiar with, and constantly up to date on, the security parameters in order to keep pace with changing security policies and compliance requirements.
Further, data is often stored across different platforms (for example, MySQL, Snowflake, Redshift, or Athena), each of which requires different code, inputs, and technology. This complicates not only the initial data procedures but also long-term maintenance.
As part of maintaining the security of the datasets, the engineering team must continuously update access controls and masking as requirements change. Data masking becomes a burdensome task that can often consume the engineering team’s time.
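To make the masking and row-level security tasks described above more concrete, here is a minimal Python sketch. The column names, the region-based row filter, and the helper functions are all hypothetical and exist only for illustration. Hashing is used here as a simple stand-in for masking; an unsalted hash of a low-entropy value can be guessed, so a real deployment would likely need salting, tokenization, or the masking features of the data platform itself.

```python
import hashlib

# Hypothetical column names, chosen only for illustration.
SENSITIVE_COLUMNS = {"email", "ssn"}


def mask_value(value: str) -> str:
    """Replace a sensitive value with a short, stable SHA-256 token."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return "masked_" + digest[:8]


def mask_row(row: dict) -> dict:
    """Return a copy of the row with sensitive columns masked."""
    return {
        key: mask_value(str(val)) if key in SENSITIVE_COLUMNS else val
        for key, val in row.items()
    }


def prepare_dataset(rows: list, allowed_regions: set) -> list:
    """Apply row-level security (a region filter), then mask what remains."""
    visible = [row for row in rows if row["region"] in allowed_regions]
    return [mask_row(row) for row in visible]


rows = [
    {"name": "Ana", "email": "ana@example.com", "ssn": "111-22-3333", "region": "EU"},
    {"name": "Bo", "email": "bo@example.com", "ssn": "444-55-6666", "region": "US"},
]
# An analyst cleared only for EU data sees one row, with email and ssn masked.
eu_dataset = prepare_dataset(rows, allowed_regions={"EU"})
```

Even in this toy form, every change to the list of sensitive columns or to the row filter means a code change, which is the maintenance burden described above, repeated for each platform.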
Authorizing Users To Access Data
One aspect of securing data is that users are granted access through either role-based access control (RBAC) or attribute-based access control (ABAC). However, gaining and maintaining access often requires an approval process. Each user who wants to access data sends a request that ends up with the data engineer, who has to grant approval. In many cases, this becomes an IT project in which data engineers find themselves handling support tickets as a “rubber stamp.” While this is necessary to ensure that sensitive data is protected, it is often a time-consuming and inefficient use of the data engineer’s time.
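As a rough illustration of the difference between the two models, the sketch below checks the same user against a role-based rule and an attribute-based rule. The roles, dataset names, and attributes (`department`, `clearance`, `min_clearance`) are assumptions made up for this example, not part of any particular product.

```python
from dataclasses import dataclass, field

# Hypothetical roles and datasets, for illustration only.
ROLE_GRANTS = {
    "analyst": {"sales_summary"},
    "data_engineer": {"sales_summary", "customers_raw"},
}


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)
    attributes: dict = field(default_factory=dict)


def rbac_allows(user: User, dataset: str) -> bool:
    """RBAC: access follows from the roles assigned to the user."""
    return any(dataset in ROLE_GRANTS.get(role, set()) for role in user.roles)


def abac_allows(user: User, dataset_attrs: dict) -> bool:
    """ABAC: access follows from comparing user and dataset attributes."""
    same_department = user.attributes.get("department") == dataset_attrs.get("department")
    cleared = user.attributes.get("clearance", 0) >= dataset_attrs.get("min_clearance", 0)
    return same_department and cleared


analyst = User("dana", roles={"analyst"}, attributes={"department": "sales", "clearance": 1})
print(rbac_allows(analyst, "customers_raw"))  # False: no role grants this dataset
print(abac_allows(analyst, {"department": "sales", "min_clearance": 1}))  # True
```

In the RBAC case, giving this user access to `customers_raw` means someone has to approve and apply a new role grant, which is the kind of request that typically lands in the data engineer’s ticket queue.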
Managing Data Access Audits
Different users require different anonymization processes. In theory, data masking should be differentiated for different users based on their RBAC roles. However, not only are users often identified in the same way across data technologies, but sometimes they also share common local user login information. Therefore, data masking needs to be configured as an exception to RBAC, complicating both the RBAC design and the data masking process.
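One way to picture masking as an exception to RBAC is a lookup that checks per-login exceptions before falling back to the role. This is only a sketch under assumed names: the logins, roles, and masking levels (`none`, `partial`, `full`) are hypothetical.

```python
# Hypothetical mapping of (platform, login) pairs to roles.
LOGIN_ROLES = {
    ("snowflake", "dana"): "analyst",
    ("mysql", "report_svc"): "analyst",  # local login shared by several people
}
ROLE_MASKING = {"analyst": "partial", "data_engineer": "none"}
# A shared login cannot be tied to one person, so it gets the strictest level.
MASKING_EXCEPTIONS = {("mysql", "report_svc"): "full"}


def masking_level(platform: str, login: str) -> str:
    """Pick a masking level: exceptions first, then role, then full masking."""
    key = (platform, login)
    if key in MASKING_EXCEPTIONS:
        return MASKING_EXCEPTIONS[key]
    role = LOGIN_ROLES.get(key)
    # Unknown logins and unknown roles default to full masking.
    return ROLE_MASKING.get(role, "full")
```

The exception table is a second source of truth that has to be kept in step with the role design, which is where the added complexity comes from.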
Why Do Data Teams Execute Security Tasks?
I have personally had many discussions with security leaders about this topic. The consensus is that data access is a black box, or at least a vague area for security teams. Whereas in network security and endpoint security there’s relatively high visibility, when it comes to data access (or even “when it steps into the world of SQL”), security teams are often outside of their comfort zone or control, and therefore have to rely on data engineers to translate their requirements into code.
What’s Wrong With That?
We acknowledge that data security is a very high priority for organizations. It is necessary to ensure that the organization is in compliance while allowing data to be shared across the organization so that it can be more productive. However, when data engineering teams are responsible for developing and maintaining data security, this results in a number of problems:
- Data engineers are dissatisfied with their jobs. This can have a significant impact on the organization. Dissatisfied workers are more likely to leave an organization than satisfied workers, resulting in high turnover costs.
- As data is fueling the business, Gartner predicts that by 2023, “organizations that promote data sharing will outperform their peers.” When the teams that should fuel the business are busy implementing security and compliance requirements instead, that’s a problem for the business.
- In many cases, there are gaps between how the security, governance, privacy, and compliance teams perceive security and what actually goes on with data access (because they don’t have visibility into the actual policies).