This repository lets users check whether assets in Databricks Unity Catalog are stored in a specific location in cloud object storage. The supported asset types are:
- Managed Catalogs
- Managed Schemas
- Managed Tables
- External Tables
Tip: To export the storage locations of all assets, regardless of where they are stored, set `external_location = ""`.
This gets all Managed Catalogs that have `external_location` as part of their root path.
To run this function successfully, the user running it needs `USE CATALOG` permissions in UC on the catalogs being checked and `SELECT` permissions on `system.information_schema.catalogs`.
external_location = "abfss"
get_managed_catalogs(external_location)
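The matching rule described above, and the reason the empty-string tip works, can be illustrated with a small sketch. It assumes the check is a plain substring test on the storage path; `location_matches` and the sample paths are hypothetical and not part of this repository.

```python
def location_matches(storage_path: str, external_location: str) -> bool:
    """Return True if external_location occurs anywhere in storage_path.

    Assumption: the repository's functions use a simple substring match.
    """
    return external_location in storage_path


# Hypothetical storage paths, for illustration only.
paths = [
    "abfss://container@account.dfs.core.windows.net/catalogs/sales",
    "s3://example-bucket/catalogs/marketing",
]

# Only the first path contains "abfss".
abfss_only = [p for p in paths if location_matches(p, "abfss")]

# Every string contains the empty string, so "" selects all paths.
everything = [p for p in paths if location_matches(p, "")]
```

Under that assumption, a short value such as `"abfss"` selects every asset on that storage scheme, while a full container or bucket path narrows the result to one location.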
This gets all Managed Schemas that have `external_location` as part of their root path.
To run this function successfully, the user running it needs `USE SCHEMA` permissions in UC on the schemas being checked and `SELECT` permissions on `system.information_schema.schemata`.
To check Managed Schemas in a given catalog:
get_managed_schemas(external_loc=external_location, catalog="gshen_catalog")
To check Managed Schemas across all catalogs:
get_managed_schemas(external_loc=external_location)
If you do not have access to `system.information_schema.schemata`, you can use the `system_table` parameter to switch to an `information_schema` that you do have access to:
get_managed_schemas(external_loc=external_location, catalog="gshen_catalog", system_table="gshen_catalog.information_schema.schemata")
This gets all Managed and External Tables that have `external_location` as part of their root path.
To run this function successfully, the user running it needs `SELECT` permissions in UC on the tables being checked and `SELECT` permissions on `system.information_schema.tables`.
To check External Tables in a given catalog:
get_table_paths(external_loc=external_location, catalog="gshen_catalog", table_type="EXTERNAL")
To check Managed Tables in a given catalog and schema:
get_table_paths(external_loc=external_location, catalog="gshen_catalog", schema="data_blending", table_type="MANAGED")
To check External and Managed Tables across all catalogs and schemas:
get_table_paths(external_loc=external_location)
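The three calls above differ only in which optional filters are supplied. The following is a minimal sketch of that filtering logic, assuming each table is represented as a dictionary with the hypothetical keys `catalog`, `schema`, `table_type`, and `path`. `filter_tables` is an illustrative stand-in, not the repository's `get_table_paths`, and the sample rows are invented.

```python
from typing import Optional


def filter_tables(rows, external_loc: str, catalog: Optional[str] = None,
                  schema: Optional[str] = None, table_type: Optional[str] = None):
    """Keep rows whose path contains external_loc and that pass every supplied filter.

    A filter left as None is not applied.
    """
    selected = []
    for row in rows:
        if external_loc not in row["path"]:
            continue
        if catalog is not None and row["catalog"] != catalog:
            continue
        if schema is not None and row["schema"] != schema:
            continue
        if table_type is not None and row["table_type"] != table_type:
            continue
        selected.append(row)
    return selected


# Hypothetical rows, for illustration only.
rows = [
    {"catalog": "gshen_catalog", "schema": "data_blending",
     "table_type": "MANAGED", "path": "abfss://c@a.dfs.core.windows.net/managed/t1"},
    {"catalog": "gshen_catalog", "schema": "raw",
     "table_type": "EXTERNAL", "path": "abfss://c@a.dfs.core.windows.net/external/t2"},
    {"catalog": "other_catalog", "schema": "raw",
     "table_type": "EXTERNAL", "path": "s3://example-bucket/t3"},
]

external_in_catalog = filter_tables(rows, "abfss", catalog="gshen_catalog",
                                    table_type="EXTERNAL")
managed_in_schema = filter_tables(rows, "abfss", catalog="gshen_catalog",
                                  schema="data_blending", table_type="MANAGED")
all_matching = filter_tables(rows, "abfss")
```

Leaving a filter unset widens the search, which is why the last call returns both Managed and External Tables from every catalog and schema.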
If you do not have access to `system.information_schema.tables`, you can use the `system_table` parameter to switch to an `information_schema` that you do have access to:
get_table_paths(external_loc=external_location, catalog="gshen_catalog", system_table="gshen_catalog.information_schema.tables")
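Both fallback examples follow the same naming pattern, `<catalog>.information_schema.<view>`. A small helper can build that name so the catalog is written only once; `information_schema_table` is hypothetical and not part of this repository.

```python
def information_schema_table(view: str, catalog: str = "system") -> str:
    """Build a fully qualified information_schema name.

    With the default catalog this yields e.g. 'system.information_schema.tables'.
    """
    return f"{catalog}.information_schema.{view}"


# Default: the system-wide view.
default_tables = information_schema_table("tables")

# Fallback: the same view scoped to a catalog the user can read.
catalog_tables = information_schema_table("tables", catalog="gshen_catalog")
catalog_schemata = information_schema_table("schemata", catalog="gshen_catalog")
```

The resulting string can then be passed as the `system_table` argument shown above.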