Without getting into cliches such as “Data is the new oil”, most companies are well aware of the value they’re getting from data. Organizations are also well versed in the obvious costs of data as these appear on their balance sheets. However, some of the costs of data are often overlooked, and discussing them is important.
Not all data-related costs are apparent or well understood. Maintaining the quality and security of data involves a number of factors, each with its own associated cost. Uncovering these hidden costs can help organizations determine their true data costs, and in this article we will explore both the hidden costs themselves and some ways to reduce them.
Obvious Costs of Data
Data is only valuable if you can efficiently store, access, and secure it. There are obvious costs associated with this, starting with the data infrastructure needed to store and process the data. These may include hardware, software, and datacenter maintenance for on-premises data operations, or cloud costs when you're using cloud data platforms. They cover not only the storage of the data itself, but also backups, data pipelines, ETL/ELT processes, network costs, and more.
There are also costs around data security. Examples include firewalls and patch management for the databases you maintain, or the security controls you apply to cloud services.
Another obvious cost is that of extracting value from the data by using BI tools. This may include licenses or subscription fees, as well as employee training for better data literacy.
These are the obvious costs of data; we will discuss some of the hidden costs, including hidden security costs, below.
Hidden Costs of Data
While some costs are apparent to organizations, others are concealed, and these can add up quickly. Let's explore some of the hidden costs of data, and what can be done to reduce them:
Finding And Getting Access To The Data You Need
As datasets increase in size and complexity, data users can spend their limited time and resources searching for data. This cost is significant: it means wasted employee time, as well as (and often more importantly) delayed data projects or simply poor results.
In addition, sensitive data is typically restricted, and there is an inherent complexity in allowing users to access this data. Users cannot access sensitive data until their request for access is approved by the relevant department. The wait for approval before the data can be used can significantly slow workflows, resulting in lost productivity.
In other words, if it takes a data analyst days or weeks to understand which datasets can help them provide value, and then to get access to those datasets, this is not ideal.
This cost can be reduced by making proper metadata available to your data users. Ideally, you would have a data portal or a "data mart" where data consumers can both understand what data is available and get access to it.
The size and structure of data can result in numerous inefficiencies, each of which has an associated cost that accumulates. Large amounts of data spread across several platforms increase the time and effort for data engineers, who need to write additional code to access this data. Your data lake may become a data swamp, causing challenges in accessing and managing data, and can have security implications as well.
Another common inefficiency is manually updating metadata and data models in data catalogs, when some of this work can be automated. This labor-intensive process consumes data teams' attention, as they must ensure this alignment occurs on a regular basis.
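As a hedged illustration of what such automation could look like, the sketch below uses Python's built-in sqlite3 module to introspect a database schema and produce catalog entries automatically, rather than maintaining them by hand. The table and column names are hypothetical, and a real data catalog would involve far richer metadata:

```python
import sqlite3

def extract_catalog(conn):
    """Introspect every table in the database and build catalog
    entries (table name -> list of (column name, declared type)),
    so the data catalog can be refreshed automatically."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info returns (cid, name, type, notnull, dflt, pk)
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        catalog[table] = [(c[1], c[2]) for c in cols]
    return catalog

# Example with a hypothetical "customers" table
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, created_at TEXT)"
)
print(extract_catalog(conn))
```

Running such a script on a schedule keeps the catalog aligned with the actual schema without manual effort.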
We discussed some of the security costs above, but other security costs are often overlooked, such as the cost of meeting specific security requirements.
A good example is handling sensitive data. Common security policies must be configured specifically for the locations where sensitive data is stored. This requires first mapping the sensitive data and then applying the appropriate access policies to it. Periodically repeating this mapping and configuration is time consuming and resource intensive, and the cost is further exacerbated by data that changes constantly across data platforms.
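To make the mapping step concrete, here is a minimal sketch of pattern-based sensitive data discovery. The regular expressions and sample rows are hypothetical and deliberately simple; real classifiers use much richer detection logic:

```python
import re

# Hypothetical detectors; production scanners use far richer classifiers.
PII_PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}

def classify_columns(rows):
    """Map column name -> detected PII type by sampling values.
    `rows` is a list of dicts (column -> value)."""
    findings = {}
    for row in rows:
        for col, val in row.items():
            for pii_type, pattern in PII_PATTERNS.items():
                if isinstance(val, str) and pattern.match(val):
                    findings[col] = pii_type
    return findings

sample = [
    {"name": "Ada", "contact": "ada@example.com", "ssn": "123-45-6789"},
]
print(classify_columns(sample))
# flags "contact" as email and "ssn" as a US SSN
```

The point of the example is not the patterns themselves, but that re-running this scan every time data changes is exactly the recurring cost described above.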
Another typical example is the need to anonymize data based on the role of the data consumer, often done using dynamic data masking. In many cases, there is a cost associated with implementing such access controls. This includes locating the data types that require anonymization, as well as the data engineering time spent enabling and maintaining these capabilities.
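A minimal sketch of role-based masking, assuming hypothetical role names and field names, might look like this:

```python
def mask_email(value):
    """Keep the domain, hide the local part: a***@example.com."""
    local, _, domain = value.partition("@")
    return local[0] + "***@" + domain

def apply_masking(row, role):
    """Return the row with sensitive fields masked unless the
    consumer's role is privileged. Roles and fields are hypothetical."""
    if role in {"admin", "security"}:
        return dict(row)
    masked = dict(row)
    if "email" in masked:
        masked["email"] = mask_email(masked["email"])
    if "ssn" in masked:
        masked["ssn"] = "***-**-" + masked["ssn"][-4:]
    return masked

row = {"name": "Ada", "email": "ada@example.com", "ssn": "123-45-6789"}
print(apply_masking(row, role="analyst"))
# email and ssn are masked for non-privileged roles
```

Even this toy version shows the hidden cost: someone must decide which fields to mask, for which roles, and keep those rules current as schemas evolve.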
Finally, granting and revoking access to data according to security policies is often a significant time sink for data engineering teams, and a hidden cost in the engineering resources it consumes.
Such costs can be reduced by setting clear security policies and improving collaboration between security and data teams. In addition, in most cases, a data security platform that automates many of these processes can be valuable, both in reducing data engineering costs and in reducing time-to-value for data users.
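As a rough illustration of the kind of work being automated, the sketch below reconciles a desired access policy against current grants and emits the statements needed to close the gap. The statement syntax, user names, and dataset names are illustrative only:

```python
def sync_grants(policy, current_grants):
    """Compare desired access (policy: user -> set of datasets) with
    current grants and emit the GRANT/REVOKE statements needed to
    reconcile them. Statement syntax is illustrative."""
    statements = []
    users = set(policy) | set(current_grants)
    for user in sorted(users):
        desired = policy.get(user, set())
        actual = current_grants.get(user, set())
        for ds in sorted(desired - actual):
            statements.append(f"GRANT SELECT ON {ds} TO {user}")
        for ds in sorted(actual - desired):
            statements.append(f"REVOKE SELECT ON {ds} FROM {user}")
    return statements

policy = {"alice": {"sales"}, "bob": {"sales", "finance"}}
current = {"alice": {"sales", "finance"}, "bob": {"sales"}}
print(sync_grants(policy, current))
```

When this reconciliation is done by hand, ticket by ticket, it becomes exactly the hidden engineering cost described above.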
Finally, there are hidden costs in compliance. Gathering the information required for an audit demands significant effort from various data teams, which increases the cost of compliance. Examples include creating reports of new PII locations, of the users accessing PII, or of the controls applied to sensitive data access, as well as the costs of configuring, storing, and maintaining your data access logs.
As well as being a hidden cost, this often triggers unplanned projects for data teams, distracting them from value-creating work.
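To show what producing such audit evidence involves, here is a minimal sketch that summarizes who accessed which sensitive datasets, and how often, from a simplified access log. The log format and dataset names are hypothetical:

```python
from collections import defaultdict

def pii_access_report(access_log, pii_datasets):
    """Summarize which users accessed PII datasets and how often --
    the kind of evidence an audit typically asks for."""
    report = defaultdict(lambda: defaultdict(int))
    for entry in access_log:
        if entry["dataset"] in pii_datasets:
            report[entry["user"]][entry["dataset"]] += 1
    return {user: dict(counts) for user, counts in report.items()}

log = [
    {"user": "alice", "dataset": "customers"},
    {"user": "alice", "dataset": "customers"},
    {"user": "bob", "dataset": "orders"},
]
print(pii_access_report(log, pii_datasets={"customers"}))
```

Assembling this across many real platforms and log formats, rather than one tidy list, is where the hidden compliance cost comes from.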
Satori Helps Reduce Data Costs
Satori helps reduce many of the hidden costs of data we mentioned above. When it comes to finding and getting access to data, Satori simplifies granting and removing access, including capabilities to provide temporary data access and self-service data access using a data portal or Slack integration. This reduces time-to-value, as well as eliminating the hidden data engineering costs.
When it comes to security, Satori provides a DataSecOps platform that continuously detects sensitive information, applies security policies across all your data access, and does so without depending on data engineering resources.
Finally, Satori keeps you continuously compliant, with audits of all access to sensitive data, including reporting by identities.
Some costs of data are plain to see, but others are hidden or overlooked. This distorts budget planning, slows your time-to-value from data, and disrupts your business with unplanned projects.
Satori is here to help reduce many of these costs, in a simple way that also accelerates your time-to-value from data projects.
To learn more about Satori, book a demo with one of our experts.