We all like having our cake and eating it too. In this article, we will discuss a changing paradigm in data sharing across organizations. This shift stems from the realization that, in most cases, an organization’s ability to make valuable use of data is limited by its level of risk aversion. We will start by discussing the key benefits of a healthy data-sharing mindset. Then we will discuss why the traditional approach to data governance may not get you to a great data-sharing ecosystem.

It is important to mention that the share-first approach described here may appeal to some organizations, while others may find other approaches more effective, depending on the types of data they deal with, their security and compliance requirements, and how they use data. However, the shift is going to affect many data-driven organizations. According to Gartner (2022 Strategic Roadmap for Data Security Platform Convergence, 28 September 2021), by 2025, 30% of Gartner clients will protect their data using a “need to share” approach rather than the traditional “need to know” approach.
Key Benefits Of Great Data Sharing

To keep this simple, we define great data sharing as data being available quickly and easily to many users, so they can analyze it and use it to improve business outcomes for functions including customer service, retention, support, operations, marketing, and sales. This definition is taken from the concept of data democratization: enabling as many people as necessary within the organization to access the data and turn it into meaningful business value. In general, these are the main benefits of having great data sharing in your organization:
- When data is used by many data consumers in your organization who are able to quickly turn it into value, you get a high ROI on the data. That return also extends to your investment in data infrastructure and data services.
- When you have great data sharing, your time-to-market and time-to-value of data products and projects are reduced. Depending on the industry, the data available, and the results of the data processing, this can have a significant impact on your business results.
- When your data sharing works well, you reduce bottlenecks and “waiting lines” for data. This makes your users, such as data scientists, business analysts, and engineers, happier. It also allows teams like data engineering and data governance to focus on their core responsibilities rather than on manually administering and processing data access requests.
Traditional Data Governance

Data governance is a broad term that refers to the organizational policies and procedures that govern data management. Data governance aims to keep data safe and secure. It also aims to ensure that the data provided is reliable, documented, controlled, and evaluated. Traditionally, organizations lean towards an “opt-in” approach when it comes to data sharing. This means that a data owner by default does not share the data with the rest of the organization. When access is specifically requested, the data owner, in most cases, weighs the benefits of sharing the data against the risks. Benefits are mostly different types of business outcomes, such as improved marketing, customer care, or operations, while risks are mostly compliance and security risks (such as not meeting regulatory requirements, or data exposure). This process takes time. In addition, once the data owner approves access to the shared dataset, it often takes additional time until the data teams (such as data engineering, platform, and shared data services) technically enable access for the data consumers: data scientists, business analysts, engineers, and so on.
From “Default To Know” To “Need To Know”

Many companies, especially when they’re in hyper-growth mode, are in a data access mode which can be defined as “default to know”. This means that data is accessible in an over-permissive way, and sometimes in an uncontrolled way. There are, of course, different shades of “default to know”, but this situation often incurs hidden costs in terms of security and compliance risks. There comes a time when companies want to close such gaps, and that transition is often not easy to perform. New security controls have to be put in place, new processes have to be created, and new security policies have to be applied. Users who were used to having free access to all of the data get limited access based on their role and responsibility. These processes can be painful in many ways, such as:
- Changing security controls in production and analytics environments carries operational risks and costs.
- Such projects and processes must be understood, accepted, and executed by many different teams, each with different goals and objectives.
- It is often difficult to get buy-in for such processes, especially when the business value is not always growth, at least not in a direct way. This depends on the situation, as in many cases such companies will reach a growth cap if they don’t meet certain data access control requirements.
From “Need To Know” To “Need To Share”

The move we’re seeing, and expect to see more of in the future, is among organizations that are already in a “need to know” state of mind and now realize that the focus should be on making data more accessible within the organization and adopting a share-first mentality. For organizations to reap the benefits of fast and effective data sharing (high ROI on data, reduced time-to-value, and organizational efficiency), a change in mindset and behavior is required. Here are recommendations based on the practices of industry leaders who have balanced the risks and rewards of data sharing:
From Risk-Averse To Risk-Adjusted

The first step is to understand that, instead of trying to mitigate every risk before accepting data sharing requests, the organization should begin from the position that data needs to be shared and then proceed to mitigate the risks. A good example of this is having a continuous anonymization layer that makes sure that shared data is anonymized or masked to the required extent. It’s important for masking to be role-based. For example, the customer success team can access the month and day of a customer’s date of birth so they can wish them a happy birthday, but they cannot access the year, while other users can’t access any part of the date of birth because it’s not needed for their jobs.
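The date-of-birth example can be sketched in a few lines of Python. This is only an illustration of role-based masking, not a description of any particular product. The role names, the `DOB_VISIBILITY` table, and the `mask_date_of_birth` function are all hypothetical.

```python
from datetime import date

# Hypothetical roles and visibility levels, chosen only for illustration.
DOB_VISIBILITY = {
    "customer_success": "month_day",  # may see month and day, never the year
    "privacy_officer": "full",        # hypothetical role with full access
}


def mask_date_of_birth(dob: date, role: str) -> str:
    """Return the date of birth as the given role is allowed to see it."""
    level = DOB_VISIBILITY.get(role, "none")  # unlisted roles see nothing
    if level == "full":
        return dob.isoformat()
    if level == "month_day":
        return f"****-{dob.month:02d}-{dob.day:02d}"
    return "****-**-**"


dob = date(1985, 7, 14)
print(mask_date_of_birth(dob, "customer_success"))  # ****-07-14
print(mask_date_of_birth(dob, "marketing"))         # ****-**-**
```

In practice this logic would more likely live in the data platform (for example, as a masking policy applied at query time) than in application code, so that every consumer of the shared dataset gets the same treatment.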
From Opt-In To Opt-Out Data Sharing

This means that the default for data in the organization is to be shared. This is often called “Open By Default”. It does not mean that organizations should give up control of their data, only that sharing should be the default. Data owners, as well as other data stakeholders (data governance, data security, data privacy, and similar teams), can and should place limitations on the data to be shared, especially sensitive data. However, these should be the exceptions to the default. In other words, data owners should opt out of sharing the data, rather than opt in to sharing it. This is unlike the common situation where data is siloed unless specified otherwise.
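To make the difference between the two defaults concrete, here is a minimal sketch of an opt-out policy check. The `SharingPolicy` class, its method names, and the dataset names are assumptions made for illustration. The point is only that a dataset is shared unless someone has explicitly restricted it and recorded why.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SharingPolicy:
    """Open-by-default policy: a dataset is shared unless it was opted out."""

    # Maps a dataset name to the recorded reason for restricting it.
    opt_outs: Dict[str, str] = field(default_factory=dict)

    def opt_out(self, dataset: str, reason: str) -> None:
        """Restrict a dataset; a reason is required so exceptions stay visible."""
        if not reason:
            raise ValueError("an opt-out must state a reason")
        self.opt_outs[dataset] = reason

    def is_shared(self, dataset: str) -> bool:
        """Anything not explicitly opted out is shared."""
        return dataset not in self.opt_outs


policy = SharingPolicy()
policy.opt_out("payroll", "contains salary data")
print(policy.is_shared("web_traffic"))  # True
print(policy.is_shared("payroll"))      # False
```

An opt-in policy would invert `is_shared` so that only explicitly listed datasets are shared, which is exactly the siloed default the opt-out approach tries to avoid.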
Taking (Some) Power Away From Data Owners

Traditionally, data owners receive share requests from other teams in the organization and decide, sometimes after consulting with others, whether to share the data. They mostly make this decision by weighing risk against value. The problem with this approach is that data owners, as well as data creators, often have a narrow view of the risks involved in sharing the data and, more importantly, of the value derived from sharing it. Data owners are often biased in their views about the risks and values. Organizations need to take away some of the power of data owners. When data owners must specifically request that particular datasets not be shared, they are pushed toward a share-first state of mind.
Clear & Transparent Security, Governance & Privacy Policies

An organization that adopts a “need to share” approach to access control can’t sustain it without taking the necessary steps to keep risk at bay even when data is shared. This is done by making the “rules of engagement” around data sharing very clear. Some of the expectations include:
- Knowing where sensitive data is, on a continuous basis. Otherwise, the data exposure risk may outweigh the desire to democratize data in the organization.
- The ability to have agile access control. This means for example that only certain groups will be able to access PII, even if datasets are shared across the organization. This is often achieved by methods such as dynamic data masking.
- Having a “council” (or team-of-teams) that can make quick decisions about limitations to data sharing and resolve conflicts.
- Training for all data stakeholders (in many data-driven organizations that can mean substantial parts of the organization) about data privacy, data security, and data governance.
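As a rough illustration of the first expectation in the list above, knowing where sensitive data is, the sketch below scans sample rows and reports which columns contain values that look like PII. The patterns, the `find_sensitive_columns` function, and the sample data are assumptions for illustration. Real discovery tools use much richer detection and run continuously rather than once.

```python
import re

# Illustrative patterns only; they are intentionally simple.
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "us_ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def find_sensitive_columns(rows):
    """Map each column name to the set of pattern names its values matched."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            for name, pattern in PATTERNS.items():
                # fullmatch: the whole value must look like the pattern
                if pattern.fullmatch(str(value)):
                    findings.setdefault(column, set()).add(name)
    return findings


sample = [
    {"id": 1, "contact": "ana@example.com", "note": "renewal due"},
    {"id": 2, "contact": "li@example.org", "note": "123-45-6789"},
]
print(find_sensitive_columns(sample))
```

Run on a schedule against newly created tables, a check like this could feed the masking and access rules described in the second expectation, so that newly discovered sensitive columns are restricted before they are widely shared.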