Improve Snowflake data governance through automated descriptions of database objects for tables, views, and columns
Photo by Claudio Schwarz on Unsplash
Maintaining thorough and accurate documentation of database objects is crucial for effective data management and collaboration. Snowflake's Snowsight interface has introduced an innovative feature powered by Snowflake Cortex, which automates the generation of descriptions for tables, views, and columns. This functionality leverages large language models (LLMs) to evaluate object metadata and, when needed, sample data, allowing for the creation of informative descriptions that streamline the documentation process.
Understanding Snowflake Cortex-Powered Descriptions
Cortex's descriptions utilize LLMs on Snowflake to examine the metadata of database objects and generate insightful descriptions. These insights are stored in the object's comments property, allowing access through various interfaces, including:
- The Table Details and View Details tabs in Snowsight.
- The Columns tab for tables or views in Snowsight.
- The output of the DESCRIBE TABLE command.
- The Account Usage TABLES view.
Once saved, any user with appropriate privileges can view these descriptions, enhancing data transparency and understanding.
It supports Native tables, Dynamic tables, Hybrid tables, Apache Iceberg™ tables, and External tables.
Prerequisites and Access Control
You use the features, ensuring that the following conditions are met:
Access Control Privileges
- Ownership (OWNERSHIP privilege) of the table or view.
- Usage (USAGE privilege) on a virtual warehouse.
- Assignment of the SNOWFLAKE.CORTEX_USER database role.
This is how you can grant these privileges:
USE ROLE ACCOUNTADMIN;
GRANT USAGE ON WAREHOUSE my_warehouse TO ROLE governance_admin;
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE governance_admin;
Replace my_warehouse with your warehouse name and governance_admin with the appropriate role.
Regional Support
The availability of the feature relies on the regional support for the LLM utilized by Snowflake Cortex. If it is unavailable, consider enabling cross-region inference.
Generating Descriptions Using Snowsight
Follow these steps to generate descriptions for your database objects:
- Navigate to the Desired Object : Locate the table or view for which you want to generate descriptions. You may be prompted to select a Warehouse if one is already selected.
Generate Descriptions
Follow along on how easy it is to generate descriptions per category, tables, views, or columns.
For Tables and Views :
- Go to the Table Details or View Details tab.
- Click on “Generate with Cortex.”
- Review and edit the generated description if necessary.
- Click “Save” to store the description.
For All Columns at Once :
- Navigate to the Columns tab of the table or view.
- Click on “Generate Descriptions” in the toolbar.
- Decide whether to use sample data when prompted.
- Review, edit, and select the columns for which to save descriptions.
- Click “Save” to apply the descriptions.
For a Single Column :
- In the Columns tab, hover over the desired column’s row in the Description column.
- Click on “Generate with Cortex.”
- Decide on using sample data if prompted.
- Review and edit the description as needed.
- Click “Save” to store the description.
Using Sample Data for Enhanced Descriptions
When generating descriptions for columns, you may be prompted with the option to include sample data to improve accuracy. Your choice is stored for the duration of the session and can be adjusted in your User Profile settings.
Overwriting Existing Descriptions
You can overwrite existing descriptions by clicking on the same buttons “Generate with Cortex”.
Cost Considerations
Generating descriptions incurs the following costs:
- Warehouse Compute Credits : Charges for the virtual warehouse used during the operation.
- Snowflake Cortex Usage : Credits consumed by the LLMs, billed under AI-Services, encompassing all uses of Snowflake Cortex.
Conclusion
This is a super simple, powerful feature that allows you to quickly add descriptions. You just need to review and approve the auto-generated text. This would empower data catalogs and BI tools with descriptions flowing from the data warehouse.
It is yet another use of Snowflake Cortex within Snowsight to enhance documentation in a very efficient way. While it would be ideal to generate them for the entire account, that approach is prone to errors, so it is always better to go one by one.
I am Augusto Rosa, a Snowflake Data Superhero and Snowflake SME. I am also the Head of Data, Cloud, & Security Architecture at Archetype Consulting. You can follow me on LinkedIn.
Subscribe to my Medium blog https://blog.augustorosa.com for the most interesting Data Engineering and Snowflake news.