There are two main schools of thought regarding organizational data storage. The first is the traditional centralized approach: data lakes and warehouses that catalog information in one convenient place and scale with the size and needs of the organization they serve. Everyone in the organization interacts with the same massive database, which savvy administrators oversee. While this structure is useful and well known, it’s beginning to show its age.
The second, more decentralized approach is the data mesh. A data mesh can store the same amounts of information a centralized data lake or warehouse can, but this data is spread out into separate repositories defined by the teams using them. These repositories are then linked together through cloud platforms so that data can still be shared across departments and teams. A federated structure like this has various benefits over its centralized predecessor.
In this article, we’ll explore the definition of data mesh and its role in data ownership and governance, some common data mesh use cases, and the benefits of using data mesh architecture for your business.
What is a Data Mesh?
As the name implies, a data mesh is a decentralized group of smaller databases that are still linked together (i.e., enmeshed) but are specialized by department and use. They can also be segmented for specific functions, such as pulling data from other repositories across the mesh.
For example, instead of every team calling on the same centralized database overseen by domain-agnostic administrators, each repository within the data mesh has specialized owners who understand their data well. This enables more accurate, useful data collection and storage, since those data owners know what makes the data valuable for their team.
No matter the skill of a database administrator overseeing a data lake, the scope of these structures is so big that their tasks remain general – in other words, their main priority is to keep the data lake functioning rather than finding ways to make the data useful across dozens of teams with hundreds or thousands of users.
It’s important to note that the term “data mesh” refers to two similar though separate concepts:
- “A” or “the” data mesh refers to the decentralized storage architecture described above.
- Data mesh refers to the strategy and process of creating and governing this storage structure.
The goal of data mesh is for the decentralized architecture to act as a self-serve data platform. In other words, the architecture is set up so that each team can tailor its repositories to what it needs while sharing data with other teams easily and securely.
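To make the architecture concrete, here is a minimal Python sketch of domain-owned repositories linked through a shared registry. The names `DataProduct` and `DataMesh`, the `"domain.name"` key format, and the sample rows are all hypothetical, chosen only for illustration; a real mesh would sit on top of actual storage and cloud services.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A dataset published by one domain team (hypothetical structure)."""
    name: str
    owner: str                                  # team accountable for this data
    rows: list = field(default_factory=list)


class DataMesh:
    """Links domain-owned repositories so other teams can discover them."""

    def __init__(self):
        self._products = {}                     # "domain.name" -> DataProduct

    def publish(self, domain: str, product: DataProduct) -> None:
        # Each domain registers its own product; ownership stays with the team.
        self._products[f"{domain}.{product.name}"] = product

    def discover(self) -> list:
        # Any team can list what is available across the mesh.
        return sorted(self._products)

    def read(self, key: str) -> list:
        # Consumers get a copy of the rows from the owning domain's repository.
        return list(self._products[key].rows)


mesh = DataMesh()
mesh.publish("sales", DataProduct("orders", owner="sales-team",
                                  rows=[{"order_id": 1, "total": 40.0}]))
mesh.publish("support", DataProduct("tickets", owner="support-team",
                                    rows=[{"ticket_id": 7, "order_id": 1}]))

print(mesh.discover())  # ['sales.orders', 'support.tickets']
```

The registry only links the repositories together. Each product still names its owning team, which mirrors the split between shared discovery and local ownership described above.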
What about Data Ownership and Data Governance?
Without data ownership and governance, any database can quickly descend into anarchy. Data mesh is one way to implement both more easily while making the whole database more efficient.
Data ownership can refer to two concepts:
- Consumer data relating to privacy laws such as CCPA and GDPR.
- Defining organization members responsible for managing data, securing it, and implementing strategies to best use it.
While the first point is important, we’ll use the second definition for context throughout this article.
Where data ownership defines individuals and their roles and responsibilities when interacting with data, data governance is the set of processes, policies, structures, and procedures that keep the database working.
Data governance is an umbrella that all other pieces – security, ownership, stewardship, quality, etc. – fall under. Governance defines ownership; the owners enforce and shape governance.
Ownership and Governance in Data Mesh
A federated form of data ownership and governance is one of data mesh’s benefits. Instead of the one-size-fits-all model that comes with data lakes, data owners for each repository within the mesh help define how best to collect, use, secure, and maintain data. While governance still provides standards across the entire mesh, its flexibility enables teams to work within those standards to make their share of the data as valuable as possible.
Because of this flexibility and accessibility, each repository can be customized to fit the needs of the team that uses it, including by pulling information from other parts of the data mesh. This gives teams much better views of data that are not only more relevant to their work but also of higher quality, thanks to vetting by expert data owners.
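One way to picture federated governance is as a mesh-wide baseline that each domain extends with its own rules. The sketch below is an assumption about how that could be modeled, not a standard: the policy keys (`encrypt_at_rest`, `retention_days`, `mask_fields`), the domain names, and the merge rules are all hypothetical.

```python
# Mesh-wide standards that every domain must meet (illustrative values).
GLOBAL_POLICY = {"encrypt_at_rest": True, "retention_days": 365}

# Domain owners add or tighten rules for their own repository.
DOMAIN_POLICIES = {
    "marketing": {"retention_days": 90, "mask_fields": ["email"]},
    "finance": {"mask_fields": ["account_number"]},
}


def effective_policy(domain: str) -> dict:
    """Merge the global baseline with one domain's additions."""
    policy = dict(GLOBAL_POLICY)
    for key, value in DOMAIN_POLICIES.get(domain, {}).items():
        if key == "retention_days":
            # A domain may shorten retention but never exceed the global cap.
            policy[key] = min(value, GLOBAL_POLICY[key])
        elif GLOBAL_POLICY.get(key) is True:
            # Mandatory global controls cannot be switched off by a domain.
            policy[key] = True
        else:
            # Anything else is a domain-specific rule.
            policy[key] = value
    return policy


print(effective_policy("marketing"))
# {'encrypt_at_rest': True, 'retention_days': 90, 'mask_fields': ['email']}
```

The design choice here is that domains can only add to or tighten the baseline, which keeps the standards consistent across the mesh while leaving room for local decisions.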
Data Mesh Use Cases
Here are a few ways data mesh can be implemented:
- 360° customer views: by pulling data from various sources, these views can generate customer service insights, including how to improve customer satisfaction and complaint resolution, reduce average handling times, and develop new marketing strategies that increase conversion rates and upsell and cross-sell effectiveness.
- Internet of Things (IoT) analytics: product teams can find patterns in device use and consumer habits, catch and troubleshoot common errors (and document them to speed up future fixes), and determine ways to improve future product versions.
- Hyper-segmentation: marketing teams can generate lead and customer views for specific audiences and demographics, pair them with suitable solutions and sales strategies, and collect data showing how effective those pairings are to inform future campaign planning. Whether customer-facing or internal, analytics can become as granular as you need them to be.
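As a rough illustration of the first use case, the sketch below assembles a single customer view from three domain repositories. The in-memory lists stand in for real domain data stores, and every field name and value is made up for the example.

```python
# Stand-ins for three domain-owned repositories (hypothetical sample data).
sales_orders = [
    {"customer_id": "c1", "total": 40.0},
    {"customer_id": "c1", "total": 15.0},
    {"customer_id": "c2", "total": 99.0},
]
support_tickets = [{"customer_id": "c1", "status": "open"}]
marketing_segments = {"c1": "loyal", "c2": "new"}


def customer_360(customer_id: str) -> dict:
    """Combine what each domain knows about one customer into a single view."""
    orders = [o for o in sales_orders if o["customer_id"] == customer_id]
    tickets = [t for t in support_tickets if t["customer_id"] == customer_id]
    return {
        "customer_id": customer_id,
        "segment": marketing_segments.get(customer_id),
        "order_count": len(orders),
        "lifetime_value": sum(o["total"] for o in orders),
        "open_tickets": sum(1 for t in tickets if t["status"] == "open"),
    }


print(customer_360("c1"))
# {'customer_id': 'c1', 'segment': 'loyal', 'order_count': 2,
#  'lifetime_value': 55.0, 'open_tickets': 1}
```

Each source stays under its own team's ownership; the view only reads from them and combines the results.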
Benefits of Data Mesh
Using a data mesh has many advantages over a centralized data lake or warehouse. The following are a few examples:
Distributed Security
While a centralized data lake is useful for keeping a company’s data in one convenient location, it’s not the safest configuration against attacks because it concentrates risk in one place. A data mesh distributes both risk and data; if one repository is compromised, fail-safes can isolate the threat without compromising other parts of the mesh. This can be done with tools such as encryption (both at rest and in transit), distributed authentication, cross-platform IAM, and deterministic masking, to name a few.
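Deterministic masking, one of the tools listed above, replaces a sensitive value with a token that is always the same for the same input, so masked data can still be joined and counted. The sketch below uses a keyed HMAC-SHA256 from Python's standard library as one possible way to do this. The per-domain keys, the 16-character truncation, and the `mask` helper are assumptions for illustration, and the hard-coded keys are placeholders; real keys belong in a secrets manager.

```python
import hashlib
import hmac

# Placeholder keys for illustration only; never hard-code real keys.
DOMAIN_KEYS = {
    "marketing": b"marketing-demo-key",
    "finance": b"finance-demo-key",
}


def mask(value: str, domain: str) -> str:
    """Return a stable token for `value` using the domain's secret key."""
    key = DOMAIN_KEYS[domain]
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability (an arbitrary choice)


token_a = mask("jane@example.com", "marketing")
token_b = mask("jane@example.com", "marketing")
token_c = mask("jane@example.com", "finance")

print(token_a == token_b)  # True: same input and key give the same token
print(token_a == token_c)  # False: another domain's key gives another token
```

Using a separate key per domain fits the distributed model: tokens from one repository do not match tokens from another, so exposure of one domain's masked data reveals less about the rest of the mesh.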
Clear Understanding of Security Architecture, Processes, and Actors
As mentioned, data mesh allows specialized data ownership for each repository while maintaining flexible standards throughout the database. The larger a database grows, the more difficult it becomes to manage effectively or ensure data value.
Federated repositories within a data mesh allow an organization to assign smaller teams and specialized data owners to manage different domains, defining who does what, where, why, and how. Answering these questions makes the security architecture easier to understand and to develop, rather than tossing everything into a one-size-fits-all bin.
Handle Data Lakes More Effectively
Similarly, a data mesh’s segmentation helps standardize data, simplify pipelines, encourage data hygiene, and spread risk and resources more effectively.
Reduce Time and Resources Managing Data Access
One such benefit is reducing the time and resources spent managing data and its access controls. Given a mesh’s segmented nature, data access follows naturally from its structure. For example, the principle of least privilege can be enforced along segment lines, rather than drawing up permissions for each user, updating them frequently, and spending precious time keeping track of everyone within a centralized data lake.
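A small sketch of what enforcing least privilege along segment lines might look like: permissions are granted to teams per domain, and a user inherits only what their team has. The team names, user names, and grant table are hypothetical, and a production system would rely on an IAM service rather than in-memory dictionaries.

```python
# Grants are defined per team and per domain, not per individual user.
DOMAIN_GRANTS = {
    "marketing-team": {"marketing": {"read", "write"}, "sales": {"read"}},
    "support-team": {"support": {"read", "write"}},
}

# Users are only mapped to a team; they carry no permissions of their own.
USER_TEAMS = {"ana": "marketing-team", "raj": "support-team"}


def is_allowed(user: str, domain: str, action: str) -> bool:
    """Deny by default; allow only what the user's team is granted."""
    team = USER_TEAMS.get(user)
    grants = DOMAIN_GRANTS.get(team, {})
    return action in grants.get(domain, set())


print(is_allowed("ana", "sales", "read"))      # True
print(is_allowed("ana", "sales", "write"))     # False
print(is_allowed("raj", "marketing", "read"))  # False
```

When someone joins or changes teams, only the `USER_TEAMS` entry changes; the domain-level grants stay as they are.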
How a Data Mesh Fits Within a Data Security Platform
Satori’s Data Security Platform provides automated data access control that protects sensitive data. Satori’s self-service and just-in-time data access provide the flexibility to implement access control across your databases, data lakes, and data warehouses.
To learn more about building a data mesh with Satori, book a 30-minute consulting call with one of our experts.