Hardening snowflake security: access controls explained
Update - March 3, 2021
I’ve modified several sections, added some clarifications, as well as referencing to new Snowflake capabilities and links to more specific guides.
Snowflake is one of the world’s premier data warehouses and leading SaaS companies in the field of storage. For the many that use Snowflake to store sensitive information, it’s paramount to engage with it according to defined security guidelines in order to detect risky events, prevent as many threats as possible and react to security incidents in the best way possible.
In the following post, I’m going to provide an overview of Snowflake Security and its features as well as a practical guide for using them to their full potential. If interested, I also wrote a similar Redshift security guide and a BigQuery security guide.
Here’s what we’re going to cover:
Network access control in Snowflake DB
Snowflake provides out-of-the-box network access control via “network policies”, allowing users to restrict account access to specific IP addresses. The level of granularity can be account-level and user-specific (bear in mind that user settings take precedence when you assign policies for both). You can set lists of IPs or subnets, using CIDR (for example 126.96.36.199/24 to add the entire Class C of 188.8.131.52-184.108.40.206) to be allowed according to a network policy, as well as IPs to be blocked.
Network policies per account or per user can either be set from the Web UI or by running SQL commands. In the following example, Ben, our data security engineer, wants to set a new network policy to grant U.S.-based employees access to Snowflake from their office, excluding the guest network:
/* Creating a network policy for employees */ CREATE OR REPLACE NETWORK POLICY us_employees ALLOWED_IP_LIST = ( '220.127.116.11/24', '18.104.22.168/24', '22.214.171.124' ) BLOCKED_IP_LIST = ( '126.96.36.199', '188.8.131.52' ) COMMENT = 'US employees offices, excluding guest WiFi gateways'; /* We now have the new network policy */ SHOW NETWORK POLICIES; /* And here's what the network policy does: */ DESCRIBE NETWORK POLICY US_EMPLOYEES; /* Assigning it to a user */ ALTER user us_analyst_1 SET NETWORK_POLICY=US_EMPLOYEES; SHOW users;
Since Snowflake’s granularity can apply to either the whole account or for specific users, an onboarding script can be used to automatically apply relevant policies to users. Note that network policies currently only support IPv4 addresses (meaning that IPv6 is currently excluded).
Managing access across users, roles and groups
While managing access for enterprises often begins as a simple process, it can quickly escalate in complexity as employees move from project to project, team to team and position to position. Therefore, managing access at scale can develop into an arduous and time-consuming task for those responsible for carrying out these changes (which can mean security teams, IT, data engineering or other teams, depending on the organization).
Ideally, a data warehouse should have its data utilized by many teams in the organization in order to drive business and innovation forward. However, usage must always strike the delicate balance between operation and risk. Granting wide access to many teams can raise costs, use up valuable time and pose high risk, especially when third parties are added into the mix. The more data is exposed, the greater the possibility of undesirable outcomes like data breaches, data leaks and exposure of more people to sensitive data. This conflict sits at the heart of data policies.
Let’s start with some useful definitions:
- What is a USER in Snowflake DB?
A user is an identity entity, either a person or an application, and can be either defined directly in Snowflake or inherited from an identity management solution (such as Okta).
- What is a SECURABLE OBJECT in Snowflake DB?
A securable object is any entity (such as a table, view, warehouse, schema or DB) to which you would grant (or deny by not granting) access to specific roles. They are owned by the user that created them.
- What is a PRIVILEGE in Snowflake DB?
A privilege is a level of access to a securable object (for example: the privilege of reading from a certain table. Multiple privileges can be granted to a securable object such as the privileges of writing or deleting from a table.
- What is a ROLE in Snowflake DB?A role is an entity to which you can define privileges (for example: grant a role of SUPPORT_GEN with read access from the table `known_issues`). You can then either assign users those privileges, or assign these roles to other roles (which creates a hierarchy of roles).
Sample of roles assignment:
In our example, we deal with the order management aspect of Awesominator Inc. There are three different business units that receive orders. Business units 1 & 2 are handled by one shipping operation (Shipping A) and orders for unit 3 are handled by another (Shipping B).
Orders are inserted by an order entry team, which require write-only access to the orders tables. Each shipping operation needs a read-only role to read the orders, while the global warehouse team needs read access across all business unit tables. At the same time, a support team needs to have both the writing permissions of the order entry team and update permissions as well.
Even in such a (relatively) simple scenario, we’re very happy to have Hierarchical RBAC, and roles are planned according to the diagram:
In the diagram above, the securable objects are the tables (orders_bu_1, orders_bu_2, orders_bu_3). The ORDER_ENTRY role has INSERT privileges over all orders tables, and on a higher hierarchy, SUPPORT_1 can do that, as well as update the tables of business units 1 & 2.
On the right side, each shipping role has SELECT privileges on the relevant tables, while the SHIPPING_GLOBAL role has access to all warehouses.
The users can either be assigned manually to the different roles, or more realistically defined in an identity management system. As an example, the COO of the company may have both SHIPPING_GLOBAL & SUPPORT_1 roles, meaning that they can do anything both roles can do (though for each query, the COO will need to select the appropriate role).
Note that Snowflake does not support the notion of groups internally, and instead relies on an integration with an identity provider (IDP) using SCIM (System for Cross-domain Identity Management). By using the SCIM 2.0 API, assigning an IDP group to the Snowflake application in the Identity Provider will create a role in Snowflake that includes the users that are under that group in the IDP.
Snowflake offers great flexibility around access control management, as it offers a combination of DAC (Discretionary Access Control) & RBAC (Role Based Access Control). That means that you can set privileges to specific roles (which you can then assign to users or to other roles, by creating a hierarchy). Users also have privileges as creators (owners) of objects.
Whenever a user opens a Snowflake session (whether through the UI, CLI, a tool, or an application using a connector or a driver), it gets assigned with a role according to the following configuration:
If the connection has the parameter role defined in the JDBC/ODBC connection, this will be the default role used (assuming the user was actually granted this role).
Otherwise, if a default role is configured for the user, it will be set.
If both do not exist, the user will be assigned to the PUBLIC role.
During the course of a session, users can change the current used role by using the following command:
USE ROLE <rolename>;
Alternatively, in the UI, they can also change the current role from the definitions located in the top right corner of the UI:
How to set privileges in snowflake?
Finally, assigning our structure is done in SQL commands, as shown below:
CREATE DATABASE IF NOT EXISTS Awesomeinator; USE Awesomeinator; // create the securable objects (tables). In reality we will have slightly more columns. CREATE TABLE orders_bu_1 (order_id number); CREATE TABLE orders_bu_2 (order_id number); CREATE TABLE orders_bu_3 (order_id number); // Creating the basic roles: CREATE ROLE ORDER_ENTRY; CREATE ROLE SHIPPING_A; CREATE ROLE SHIPPING_B; // Creating the higher hierarchy roles: CREATE ROLE SUPPORT_1; CREATE ROLE SHIPPING_GLOBAL; // Granting privileges for the different roles: GRANT INSERT ON ALL TABLES IN DATABASE Awesomeinator TO ROLE ORDER_ENTRY; GRANT SELECT ON orders_bu_1 TO ROLE SHIPPING_A; GRANT SELECT ON orders_bu_2 TO ROLE SHIPPING_A; GRANT SELECT ON orders_bu_3 TO ROLE SHIPPING_B; // Adding hierarchy in our shipping: GRANT ROLE SHIPPING_A to ROLE SHIPPING_GLOBAL; GRANT ROLE SHIPPINGE_B to ROLE SHIPPING_GLOBAL; // Finally, adding the almighty support role (Anything order entry can do and then some): GRANT ROLE ORDER_ENTRY TO ROLE SUPPORT_1; GRANT UPDATE ON ALL TABLES IN DATABASE Awesomeinator TO ROLE SUPPORT_1;
For more information, refer to our specific posts about managing Snowflake roles at scale, and about simplifying Snowflake roles management with Satori.
Fine-grained snowflake DB access control
Within the regular permissions model of Snowflake, the most granular securable object is a table (or a view). Within the model itself, there is no column-based access control or row-based access control.
More granular access control can be enforced by using secure views. Note that using secure views has a performance impact on your queries, as it eliminates some of the optimizations done in access to other securable objects (such as tables or “normal” views). In addition, using secure views to enforce finer-grained access control can be difficult to create and maintain for complex environments.
What are Snowflake secure views anyway?
Secure views add two security features over “normal” views:
Users who do not own the secure views can’t view its definition via the SHOW VIEW command.
Secure views do not use optimizations, to prevent exposing the underlying data which may expose the underlying data which can give users answers about data outside the view’s restrictions.
How to apply Snowflake DB secure views to enforce column level access control
Let’s get back to Awesominator Inc., where the privacy officer wants to limit order details to those relevant to the shipping users, disallowing them from seeing the sensitive `id_num` column:
// We create a secure view to show only relevant details, without the sensitive information: CREATE SECURE VIEW view_orders_bu_1 AS (SELECT order_id, customer_name, address FROM orders_bu_1); CREATE SECURE VIEW view_orders_bu_2 AS (SELECT order_id, customer_name, address FROM orders_bu_2); CREATE SECURE VIEW view_orders_bu_3 AS (SELECT order_id, customer_name, address FROM orders_bu_3); // Checking what privileges we have: SHOW GRANTS TO ROLE SHIPPING_A; // Revoking the privileges directly to the tables REVOKE SELECT ON ALL TABLES IN DATABASE awesomeinator FROM ROLE SHIPPING_A; REVOKE SELECT ON ALL TABLES IN DATABASE awesomeinator FROM ROLE SHIPPING_B; // Checking what privileges we have now: SHOW GRANTS TO ROLE SHIPPING_A; // Finally, giving privileges to the views, blocking access from the sensitive information GRANT SELECT ON view_orders_bu_1 TO ROLE SHIPPING_A; GRANT SELECT ON view_orders_bu_2 TO ROLE SHIPPING_A; GRANT SELECT ON view_orders_bu_3 TO ROLE SHIPPING_B; // Checking what privileges we have now: SHOW GRANTS TO ROLE SHIPPING_A; SHOW GRANTS TO ROLE SHIPPING_B; // Note we do not need to do anything with the global shipping account due to the role hierarchy SHOW GRANTS TO ROLE SHIPPING_GLOBAL;
A similar approach can also limit access dynamically by using the CURRENT_ROLE() function in the view query to create more complicated fine-grained access control.
You can also use Snowflake to apply dynamic masking, either with it’s dynamic masking feature (available in enterprise and business critical plans), or configure it manually on your own. For more informaiton, refer to our guide about the different Snowflake data maksing types.
Access logging & monitoring in Snowflake DB
When maintaining an effective security environment, it’s important to not only prevent threats, but also be aware of them; Even if you’re blocking a certain problem, you may still want to take additional measures.
You will want to know when there are unsuccessful attempts to establish a Snowflake session, as they may indicate either a misconfiguration in one of your applications (which may cause unpredictable results) or a malicious attack.
How do you get all failed snowflake logins (failed connection sessions)?
WITH access_attempts AS ( SELECT * FROM TABLE(INFORMATION_SCHEMA.LOGIN_HISTORY(result_limit=>10000)) ORDER BY event_timestamp) SELECT * FROM access_attempts WHERE IS_SUCCESS='NO';
You will also want to follow successful Snowflake sessions, to be alerted to anomalies (such as successful login attempts from unexpected IP addresses or tools). Such anomalies may indicate excessive privilege abuse or legitimate privilege abuse vulnerabilities, which are core issues in database security. As an example, if there is a Snowflake entity (user or role) being used by an application, it would make sense to monitor cases when the connections to it are carried out by human users.
Note that the access attempts dynamic table only shows results from the last 7 days and is limited to retrieving up to 10,000 results. Therefore, it is important to create a job that pulls relevant information from LOGIN_HISTORY and push it elsewhere (for example: failed login attempts can be sent to a SIEM and all login history can be pushed to analytics storage, perhaps even a table on Snowflake itself). Depending on the scale, you may want to perform aggregations to the data, as it can accumulate to quite a lot.
Privilege violation logging & monitoring in Snowflake DB
Failed query attempts (for example: when a user tries to access a forbidden securable object) can be retrieved by the QUERY_HISTORY or QUERY_HISTORY_BY_USER commands. You can then filter the results by the ERROR_CODE or ERROR_MESSAGE columns. More on query logs here. Note that if you want to verify which users are accessing a specific resource you would like to restrict, you will need to parse the SQL Queries carried out to determine if those certain objects were accessed.
How can you log access violations in snowflake DB?
// change test to your user GRANT ROLE ORDER_ENTRY TO USER test; USE ROLE ORDER_ENTRY; SELECT * FROM orders_bu_1; // Getting the expected error: // SQL compilation error: Object 'ORDERS_BU_1' does not exist or not authorized. // Checking a different, non-existent object: SELECT * FROM non_existent_object; // Getting the same error: // SQL compilation error: Object 'ORDERS_BU_1' does not exist or not authorized. // Getting back to our admin role: USE ROLE ACCOUNTADMIN; // Pulling the latest queries in history: SELECT * FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY()) ORDER BY start_time DESC LIMIT 100;
As you can see in this example, the query history will provide the same log entry when a user tries to access a non-existent object or tries to access an object they don’t have access to. This means that you will need to customize the results to your needs and consider how to handle aggregation, filtering and storage, given that query logs are also limited to retrieving up to 10,000 results and the last 7 days.
Note that failing to audit these results may lead to a weak audit trail vulnerability, which can make it harder to react to threats down the line.
Identity management integration for federation
Snowflake enables identity management using either users defined within Snowflake, or users provisioneddefined by external identity providers such as Okta and Ping Identity, either with SCIM integration, or by SQL queries integration. This allows provisioning of users and rolesfederation of user management, rather than managing them within Snowflake itself, enabling the required roles and applying RBAC over those roles.
The approach of provisioning users is recommended over setting users directly in Snowflake, as it is a more streamlined process for managing user processes (such as onboarding, offboarding and changes in roles). This reduces the risks of having over-privileged users and forgetting to remove inactive users.
In addition, it is possible (and recommended) to manage authentication to Snowflake by IdP’s such as Okta, Azure AD or other SAML 2.0 supporting identity providers. This enables setting Single Sign-On (SSO) for the users, and is recommended, as it provides the same security policies as used in other service providers in the organization, as well as using your own Multi-Factor Authentication (MFA).
Snowflake provides excellent tools to help users engage with it in a secure way. Nonetheless, there are a number of instances where it is important to use an overlay for additional access controls and security visibility, and this is one of the reasons we're building Satori:
Satori and network access control
Satori can help you achieve significantly more granular network-based access control when deployed in front of a Snowflake environment. For example, using Satori can allow global access to your Snowflake data warehouse while prohibiting access to specific resources (databases, schemas, tables, rows or columns) outside of a certain set of IP addresses.
Satori and fine-grained access control of Snowflake DB:
When deploying Satori in front of a Snowflake environment, you can create column, row and cell-based granular access controls tied to business rules instead of securable objects. This enables easy maintenance of such policies and a gradual roll-out process (beginning with alert-only). You can also achieve easy visibility into the existing data access patterns and violations on the Satori dashboard or through Satori’s listing in the Snowflake Marketplace.
Satori and keeping track of access violations
Satori keeps track of logins and sessions to Satori-protected Snowflake environments, enabling security policies to handle such violations as well as enhanced visibility into suspicious incidents. You can also view all your database activities from different datastores via Satori’s Audit.
Satori and handling data protection of Snowflake DB permission issues:
By using Satori, access control violations (whether in “alert” mode or blocked) are logged infinitely, and security teams can gain full visibility into the violations to take appropriate action.
I hope you enjoyed reading this as much as I enjoyed writing it. If you encounter errors or think of parameters I neglected to review, please feel free to reach out to me on twitter (@kernelXSS) or by email.
Recent blog posts
- Democratize Data in AWS Redshift With Self-Service Data Access Workflows
- Zero to Self-Service Snowflake Data Access Management
- Satori Is Launching Self-Service Data Access
- Snowflake Security: Best Practices for Stages
- What it Means to be an RSA Conference Innovation Sandbox Finalist
- What is DataSecOps?
Posts by Tag
- Data Governance
- Access Control
- Data Protection
- Snowflake Data Warehouse
- data security
- AWS Redshift
- data democratisation
- Data Science
- Sensitive Data
- Snowflake security
- self service access control
- Data Masking
- Human Element
- Least Privileges
- Policy Engine
- RSA ISB
- Redshift Security
- Redshift data access
- Row Level Security
- Snowflake Roles
- role hierarchy
- rsa conference
- rsa innovation sandbox
- snowflake stages