We all like having our cake and eating it too. In this article, we will discuss a changing paradigm in data sharing across organizations. This shift stems from the realization that, in most cases, an organization’s ability to make valuable use of data is limited by how risk-averse it is.
We will start by discussing the key benefits of a healthy data-sharing mindset in the organization. Then we will discuss why the traditional approach to data governance may not get you to a great data sharing ecosystem.
It is important to mention that this approach may appeal to some organizations, while others may find other approaches more effective, depending on the types of data they deal with, their security and compliance requirements, and how they use data. However, it is going to affect many data-driven organizations. According to Gartner (2022 Strategic Roadmap for Data Security Platform Convergence, 28 September 2021), by 2025 30% of Gartner clients will protect their data using a “need to share” approach rather than the traditional “need to know” approach.
Key Benefits Of Great Data Sharing
To keep this simple, we define great data sharing as data that is available quickly and easily to many users, so they can analyze it and use it to improve business outcomes for functions including customer service, retention, support, operations, marketing, and sales. This definition is taken from the concept of data democratization: enabling as many people as necessary within the organization to access the data and turn it into meaningful business value.
In general, these are the main benefits of having great data sharing in your organization.
- When data is used by many data consumers in your organization who are able to quickly turn it into value, you get a high ROI on the data, as well as on your investment in data infrastructure and data services.
- When you have great data sharing, your time-to-market and time-to-value of data products and projects are reduced. Depending on the industry, the data available, and the results of the data processing, this can have a significant impact on your business results.
- When your data sharing works well, you reduce bottlenecks and “waiting lines” for data. This makes your users, such as data scientists, business analysts, and engineers, happier. It also allows teams like data engineering and data governance to focus on their core responsibilities rather than on manually administering and processing data access requests.
Traditional Data Governance
Data governance is a broad term that refers to the organizational policies and procedures that govern data management. Its aim is to keep data safe and secure, and to ensure that the data provided is protected, reliable, documented, controlled, and evaluated.
Traditionally, organizations lean towards an “opt-in” approach when it comes to data sharing. This means that a data owner by default does not share the data with the rest of the organization. When requested specifically, in most cases, the data owner assesses the benefits of sharing the data versus the risks associated with sharing the data. Benefits are mostly different types of business outcomes such as improved marketing, customer care, or operations, while risks are mostly compliance and security risks (such as not meeting regulatory requirements, or data exposure).
This process takes time. In addition, once the data owner approves access to the shared dataset, it often takes more time until the data teams (such as data engineering, platform, and shared data services) technically enable access for the data consumers: data scientists, business analysts, engineers, and so on.
From “Default To Know” to “Need To Know”
Many companies, especially when they’re in hyper-growth mode, are in a data access mode which can be defined as “default to know”. This means that data is accessible in an over-permissive way, and sometimes in an uncontrolled way. There are, of course, different shades of “default to know”, but this situation often incurs hidden costs in terms of security and compliance risks.
There comes a time when companies want to close such gaps. This transition is often not easy to perform. New security controls have to be put in place, new processes have to be created, and new security policies have to be applied. Users who were used to having free access to all of the data get limited access based on their role and responsibility.
These processes can be painful in many ways, for example:
- Changing security controls in production and analytics environments carries operational risks and costs.
- Such projects and processes should be understood, accepted, and executed by many different teams each with different goals and objectives.
- It is often difficult to get buy-in for such processes, especially when the business value is not always growth, at least not in a direct way. This depends on the situation, as in many cases such companies will reach a growth cap if they don’t meet certain data access control requirements.
The natural next step for companies is to move from “default to know” to “need to know”. For example, if before the transition all (or most) data consumers had access to all (or most) of the data in the data stores, access will now be limited based on their role within the organization (for example, customer success, engineering, or marketing) and their specific responsibility (for example, the top 100 accounts in the United States, or high-net-worth individuals residing in California). Additional examples include limiting access to sensitive data to specific teams, anonymizing and masking data, and applying data localization policies (for example, by applying row-level security).
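To make “need to know” concrete, here is a minimal sketch of limiting rows by role and responsibility. It is an illustration only: the `User` class, the `visible_rows` function, the sample accounts, and the column names are all hypothetical, and in practice this kind of filter is usually enforced in the data store or an access layer rather than in application code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    name: str
    role: str  # e.g. "customer_success", "marketing" (assumed role names)
    regions: frozenset = field(default_factory=frozenset)  # responsibility scope


# Hypothetical sample rows; the column names are assumptions.
ACCOUNTS = [
    {"account": "Acme", "region": "US", "owner_role": "customer_success"},
    {"account": "Globex", "region": "DE", "owner_role": "customer_success"},
    {"account": "Initech", "region": "US", "owner_role": "marketing"},
]


def visible_rows(user, rows):
    """Return only the rows that match the user's role and region scope."""
    return [
        row
        for row in rows
        if row["owner_role"] == user.role and row["region"] in user.regions
    ]


dana = User("dana", "customer_success", frozenset({"US"}))
print([row["account"] for row in visible_rows(dana, ACCOUNTS)])
```

A customer success user scoped to the United States sees only the matching accounts, while a user with no matching role or region sees nothing.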
From “Need To Know” To “Need To Share”
The move we are seeing, and expect to see more of in the future, is among organizations that are already in a “need to know” state of mind and now realize that the focus should be on making data more accessible across the organization, with a share-first mentality.
For organizations to reap the benefits of fast and effective data sharing (high ROI on data, reduced time-to-value, and organizational efficiency), a change in mindset and behavior is required. Here are recommendations drawn from industry leaders who have balanced the risks and rewards of data sharing:
From Risk-Averse To Risk-Adjusted
The first step is to understand that, instead of trying to mitigate every risk before accepting data sharing requests, the organization should start from the position that data needs to be shared and then mitigate the risks. A good example of this is a continuous anonymization layer that makes sure shared data is anonymized or masked to the required extent. It’s important for masking to be role-based. For example, the customer success team can access the month and day of a customer’s date of birth so they can wish them a happy birthday, but not the year, while other users can’t access any part of the date of birth because it’s not needed for their jobs.
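The date-of-birth example can be sketched in a few lines. This is a simplified illustration, not a description of any particular masking product: the function name `mask_date_of_birth`, the role names, and the masked output format are assumptions made for the example.

```python
from datetime import date

# Hypothetical set of roles allowed to see the month and day (an assumption).
MONTH_DAY_ROLES = {"customer_success"}


def mask_date_of_birth(dob: date, role: str) -> str:
    """Return the date of birth as the given role is allowed to see it."""
    if role in MONTH_DAY_ROLES:
        # Month and day only: enough to send a birthday greeting.
        return f"****-{dob.month:02d}-{dob.day:02d}"
    # Every other role sees no part of the field.
    return "****-**-**"


print(mask_date_of_birth(date(1985, 3, 14), "customer_success"))
print(mask_date_of_birth(date(1985, 3, 14), "marketing"))
```

The key design point is that the same stored value yields different views depending on who asks, so the dataset itself can be shared widely.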
From Opt-In To Opt-Out Data Sharing
This means that data in the organization is shared by default. This is often called “Open By Default”. It does not mean that organizations should give their data away, only that sharing should be the default.
Data owners, as well as other data stakeholders (data governance, data security, data privacy, and similar teams), can and should place limitations on the data to be shared, especially sensitive data. However, these should be the exceptions to the default. In other words, data owners should opt out of sharing the data, rather than opt in to sharing it. This is unlike the common situation where data is siloed unless specified otherwise.
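As a rough sketch of what opt-out sharing could look like in code, the example below treats every dataset as shared unless an owner has recorded an exception. The `SharingPolicy` class, its method names, and the dataset names are hypothetical, and requiring a reason for each opt-out is our own assumption about how exceptions might be kept reviewable.

```python
class SharingPolicy:
    """Open by default: datasets are shared unless an owner opts out."""

    def __init__(self):
        # Maps dataset name -> reason the owner gave for opting out.
        self._opt_outs = {}

    def opt_out(self, dataset: str, reason: str) -> None:
        """Record an exception to the default; a reason is required."""
        if not reason.strip():
            raise ValueError("an opt-out must state a reason")
        self._opt_outs[dataset] = reason

    def is_shared(self, dataset: str) -> bool:
        """A dataset is shared unless it appears in the opt-out registry."""
        return dataset not in self._opt_outs


policy = SharingPolicy()
policy.opt_out("payroll", "contains salary data")
print(policy.is_shared("web_analytics"))
print(policy.is_shared("payroll"))
```

Note that a dataset nobody has reviewed is shared, which is exactly why the default only works alongside the safeguards discussed in the rest of this article.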
Taking (Some) Power Away From Data Owners
Traditionally, data owners receive share requests from other teams in the organization and decide, sometimes after consulting with others, whether to share the data. They mostly make this decision based on a calculation of risk versus value.
The problem with this approach is that data owners, as well as data creators, often have a narrow view of the risks involved in sharing the data and, more importantly, of the value derived from sharing it. Data owners are often biased in their views about the risks and values.
Organizations need to take away some of the power of data owners. By requiring owners to explicitly request that specific datasets not be shared, organizations push them toward a “default to share” state of mind.
Clear & Transparent Security, Governance & Privacy Policies
An organization that uses “default to share” access controls can’t exist without taking the necessary steps to keep risk at bay even when data is shared. This is done by making the “rules of engagement” around data sharing very clear. Some of the expectations include:
- Knowing where sensitive data is, on a continuous basis. Otherwise, the data exposure risk may outweigh the desire to democratize data in the organization.
- The ability to have agile access control. This means for example that only certain groups will be able to access PII, even if datasets are shared across the organization. This is often achieved by methods such as dynamic data masking.
- Having a “council” (or team-of-teams) that can make quick decisions about limitations to data sharing and resolve conflicts.
- Training for all data stakeholders (in many data-driven organizations that can mean substantial parts of the organization) about data privacy, data security, and data governance.
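The first two expectations in the list above work together: once you know which columns are sensitive, you can mask them at query time for everyone outside the approved groups. Here is a minimal sketch of that idea. The column classification, the group names, and the `mask_row` function are assumptions for illustration, not a specific product’s behavior.

```python
# Hypothetical classification of columns and groups; both are assumptions.
PII_COLUMNS = {"email", "phone"}
PII_GROUPS = {"support"}


def mask_row(row: dict, groups: set) -> dict:
    """Return a copy of the row with PII columns masked for non-PII groups."""
    if groups & PII_GROUPS:
        # The user belongs to at least one group cleared for PII.
        return dict(row)
    return {
        column: "***" if column in PII_COLUMNS else value
        for column, value in row.items()
    }


row = {"id": 1, "email": "a@example.com", "plan": "pro"}
print(mask_row(row, {"analytics"}))
print(mask_row(row, {"support"}))
```

Because masking is applied when data is read, the dataset stays shared across the organization while the sensitive columns remain restricted.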
The DataSecOps Approach
The move to a more open data sharing policy within the organization aligns closely with the principles of Data Security Operations, or DataSecOps. In other words, for an organization to effectively take such a liberal position, it needs to adopt the principles of DataSecOps. For example, security needs to be built into the process itself; otherwise, it will be impossible to apply policies in an automated manner and provide instantaneous data access.
Needless to say, the first part of the journey, moving from a “default to know” to a “need to know” state of mind, also happens with less friction and risk when adopting the DataSecOps principles.
Is “default to share” (or “need to share”) the right way for all organizations to treat their data? No. This approach might be considered “pure data democratization” and is suitable for organizations that are mature in their DataSecOps or in the process of getting to a mature level. It is always hard to move away from being risk-averse, and it only works if the risk is mitigated accordingly.
However, an organization that effectively becomes a “default to share” organization, and does so at a low level of risk, may have an “unfair advantage” over risk-averse organizations in a data-first economy.
It is also important to have discussions with the data stakeholders in the organization, while being conscious of the current state of the organization and where you’d like it to be headed in the future. The answer may be a journey: first moving to “need to know” and then adopting a “need to share” approach.
How Satori Helps
Satori, the DataSecOps platform, allows you to be more agile about your data access, whether you go “full default to share”, enable “just in time” access control, or implement an efficient “need to know” access control, across all your data stores. For more information, read how Satori helps streamline access to data and accelerate value from data.