diff --git a/docs/command-line-interface.md b/docs/command-line-interface.md new file mode 100644 index 0000000000..a7ffaf5962 --- /dev/null +++ b/docs/command-line-interface.md @@ -0,0 +1,1067 @@ +# Polaris CLI + +In order to help administrators quickly set up and manage their Polaris server, Polaris provides a simple command-line interface (CLI) for common tasks. + +The basic syntax of the Polaris CLI is outlined below: + +``` +polaris [options] COMMAND ... + +options: +--host +--port +--client-id +--client-secret +``` + +`COMMAND` must be one of the following: +1. catalogs +2. principals +3. principal-roles +4. catalog-roles +5. namespaces +6. privileges + +Each _command_ supports several _subcommands_, and some _subcommands_ have _actions_ that come after the subcommand in turn. Finally, _arguments_ follow to form a full invocation. Within a set of named arguments at the end of an invocation ordering is generally not important. Many invocations also have a required positional argument of the type that the _command_ refers to. Again, the ordering of this positional argument relative to named arguments is not important. + +Some example full invocations: + +``` +polaris principals list +polaris catalogs delete some_catalog_name +polaris catalogs update --property foo=bar some_other_catalog +polaris catalogs update another_catalog --property k=v +polaris privileges namespace grant --namespace some.schema --catalog fourth_catalog --catalog-role some_catalog_role TABLE_READ_DATA +``` + +### Authentication + +As outlined above, the Polaris CLI may take credentials using the `--client-id` and `--client-secret` options. For example: + +``` +polaris --client-id 4b5ed1ca908c3cc2 --client-secret 07ea8e4edefb9a9e57c247e8d1a4f51c principals ... +``` + +If `--client-id` and `--client-secret` are not provided, the Polaris CLI will try to read the client ID and client secret from environment variables called `CLIENT_ID` and `CLIENT_SECRET` respectively. 
If these flags are not provided and the environment variables are not set, the CLI will fail. + +If the `--host` and `--port` options are not provided, the CLI will default to communicating with `localhost:8181`. + +### PATH + +These examples assume the Polaris CLI is on the PATH and so can be invoked just by the command `polaris`. You can add the CLI to your PATH environment variable with a command like the following: + +``` +export PATH="$HOME/polaris:$PATH" +``` + +Alternatively, you can run the CLI by providing a path to it, such as with the following invocation: + +``` +~/polaris/polaris principals list +``` + +## Commands + +Each of the commands `catalogs`, `principals`, `principal-roles`, `catalog-roles`, `namespaces`, and `privileges` is used to manage a different type of entity within Polaris. + +To find details on the options that can be provided to a particular command or subcommand, use the `--help` flag. For example: + +``` +polaris catalogs --help +polaris principals create --help +``` + +### Catalogs + +The `catalogs` command is used to create, discover, and otherwise manage catalogs within Polaris. + +`catalogs` supports the following subcommands: + +1. create +2. delete +3. get +4. list +5. update + +#### create + +The `create` subcommand is used to create a catalog. + +``` +input: polaris catalogs create --help +options: + create + Named arguments: + --type The type of catalog to create in [INTERNAL, EXTERNAL]. INTERNAL by default. + --storage-type (Required) The type of storage to use for the catalog + --default-base-location (Required) Default base location of the catalog + --allowed-location An allowed location for files tracked by the catalog. Multiple locations can be provided by specifying this option more than once. 
+ --role-arn (Required for S3) A role ARN to use when connecting to S3 + --external-id (Only for S3) The external ID to use when connecting to S3 + --tenant-id (Required for Azure) A tenant ID to use when connecting to Azure Storage + --multi-tenant-app-name (Only for Azure) The app name to use when connecting to Azure Storage + --consent-url (Only for Azure) A consent URL granting permissions for the Azure Storage location + --service-account (Only for GCS) The service account to use when connecting to GCS + --remote-url (For external catalogs) The remote URL to use + --property A key/value pair such as: tag=value. Multiple can be provided by specifying this option more than once + Positional arguments: + catalog +``` + +##### Examples + +``` +polaris catalogs create \ + --storage-type s3 \ + --default-base-location s3://example-bucket/my_data \ + --role-arn ${ROLE_ARN} \ + my_catalog + +polaris catalogs create \ + --storage-type s3 \ + --default-base-location s3://example-bucket/my_other_data \ + --allowed-location s3://example-bucket/second_location \ + --allowed-location s3://other-bucket/third_location \ + --role-arn ${ROLE_ARN} \ + my_other_catalog +``` + +#### delete + +The `delete` subcommand is used to delete a catalog. + +``` +input: polaris catalogs delete --help +options: + delete + Positional arguments: + catalog +``` + +##### Examples + +``` +polaris catalogs delete some_catalog + +polaris catalogs delete another_catalog +``` + +#### get + +The `get` subcommand is used to retrieve details about a catalog. + +``` +input: polaris catalogs get --help +options: + get + Positional arguments: + catalog +``` + +##### Examples + +``` +polaris catalogs get some_catalog + +polaris catalogs get another_catalog +``` + +#### list + +The `list` subcommand is used to show details about all catalogs, or those that a certain principal role has access to. The principal used to perform this operation must have the `CATALOG_LIST` privilege. 
+ +``` +input: polaris catalogs list --help +options: + list + Named arguments: + --principal-role The name of a principal role +``` + +##### Examples + +``` +polaris catalogs list + +polaris catalogs list --principal-role some_user +``` + +#### update + +The `update` subcommand is used to update a catalog. Currently, this command supports changing the properties of a catalog or updating its storage configuration. + +``` +input: polaris catalogs update --help +options: + update + Named arguments: + --default-base-location (Required) Default base location of the catalog + --allowed-location An allowed location for files tracked by the catalog. Multiple locations can be provided by specifying this option more than once. + --property A key/value pair such as: tag=value. Multiple can be provided by specifying this option more than once + Positional arguments: + catalog +``` + +##### Examples + +``` +polaris catalogs update --property tag=new_value my_catalog + +polaris catalogs update --default-base-location s3://new-bucket/my_data my_catalog +``` + +### Principals + +The `principals` command is used to manage principals within Polaris. + +`principals` supports the following subcommands: + +1. create +2. delete +3. get +4. list +5. rotate-credentials +6. update + +#### create + +The `create` subcommand is used to create a new principal. + +``` +input: polaris principals create --help +options: + create + Named arguments: + --type The type of principal to create in [SERVICE] + --property A key/value pair such as: tag=value. Multiple can be provided by specifying this option more than once + Positional arguments: + principal +``` + +##### Examples + +``` +polaris principals create some_user + +polaris principals create --client-id ${CLIENT_ID} --property admin=true some_admin_user +``` + +#### delete + +The `delete` subcommand is used to delete a principal. 
+ +``` +input: polaris principals delete --help +options: + delete + Positional arguments: + principal +``` + +##### Examples + +``` +polaris principals delete some_user + +polaris principals delete some_admin_user +``` + +#### get + +The `get` subcommand retrieves details about a principal. + +``` +input: polaris principals get --help +options: + get + Positional arguments: + principal +``` + +##### Examples + +``` +polaris principals get some_user + +polaris principals get some_admin_user +``` + +#### list + +The `list` subcommand shows details about all principals. + +##### Examples + +``` +polaris principals list +``` + +#### rotate-credentials + +The `rotate-credentials` subcommand is used to update the credentials used by a principal. After this command runs successfully, the new credentials will be printed to stdout. + +``` +input: polaris principals rotate-credentials --help +options: + rotate-credentials + Positional arguments: + principal +``` + +##### Examples + +``` +polaris principals rotate-credentials some_user + +polaris principals rotate-credentials some_admin_user +``` + +#### update + +The `update` subcommand is used to update a principal. Currently, this supports rewriting the properties associated with a principal. + +``` +input: polaris principals update --help +options: + update + Named arguments: + --property A key/value pair such as: tag=value. Multiple can be provided by specifying this option more than once + Positional arguments: + principal +``` + +##### Examples + +``` +polaris principals update --property key=value --property other_key=other_value some_user + +polaris principals update --property are_other_keys_removed=yes some_user +``` + +### Principal Roles + +The `principal-roles` command is used to create, discover, and manage principal roles within Polaris. Additionally, this command can identify principals or catalog roles associated with a principal role, and can be used to grant a principal role to a principal. 
+ +`principal-roles` supports the following subcommands: + +1. create +2. delete +3. get +4. list +5. update +6. grant +7. revoke + +#### create + +The `create` subcommand is used to create a new principal role. + +``` +input: polaris principal-roles create --help +options: + create + Named arguments: + --property A key/value pair such as: tag=value. Multiple can be provided by specifying this option more than once + Positional arguments: + principal_role +``` + +##### Examples + +``` +polaris principal-roles create data_engineer + +polaris principal-roles create --property key=value data_analyst +``` + +#### delete + +The `delete` subcommand is used to delete a principal role. + +``` +input: polaris principal-roles delete --help +options: + delete + Positional arguments: + principal_role +``` + +##### Examples + +``` +polaris principal-roles delete data_engineer + +polaris principal-roles delete data_analyst +``` + +#### get + +The `get` subcommand retrieves details about a principal role. + +``` +input: polaris principal-roles get --help +options: + get + Positional arguments: + principal_role +``` + +##### Examples + +``` +polaris principal-roles get data_engineer + +polaris principal-roles get data_analyst +``` + +#### list + +The `list` subcommand is used to print out all principal roles or, alternatively, to list all principal roles associated with a given principal or with a given catalog role. + +``` +input: polaris principal-roles list --help +options: + list + Named arguments: + --catalog-role The name of a catalog role. If provided, show only principal roles assigned to this catalog role. + --principal The name of a principal. If provided, show only principal roles assigned to this principal. +``` + +##### Examples + +``` +polaris principal-roles list + +polaris principal-roles list --principal d.knuth + +polaris principal-roles list --catalog-role super_secret_data +``` + +#### update + +The `update` subcommand is used to update a principal role. 
Currently, this supports updating the properties tied to a principal role. + +``` +input: polaris principal-roles update --help +options: + update + Named arguments: + --property A key/value pair such as: tag=value. Multiple can be provided by specifying this option more than once + Positional arguments: + principal_role +``` + +##### Examples + +``` +polaris principal-roles update --property key=value2 data_engineer + +polaris principal-roles update data_analyst --property key=value3 +``` + +#### grant + +The `grant` subcommand is used to grant a principal role to a principal. + +``` +input: polaris principal-roles grant --help +options: + grant + Named arguments: + --principal A principal to grant this principal role to + Positional arguments: + principal_role +``` + +##### Examples + +``` +polaris principal-roles grant --principal d.knuth data_engineer + +polaris principal-roles grant data_scientist --principal a.ng +``` + +#### revoke + +The `revoke` subcommand is used to revoke a principal role from a principal. + +``` +input: polaris principal-roles revoke --help +options: + revoke + Named arguments: + --principal A principal to revoke this principal role from + Positional arguments: + principal_role +``` + +##### Examples + +``` +polaris principal-roles revoke --principal former.employee data_engineer + +polaris principal-roles revoke data_scientist --principal changed.role +``` + +### Catalog Roles + +The `catalog-roles` command is used to create, discover, and manage catalog roles within Polaris. Additionally, this command can be used to grant a catalog role to a principal role. + +`catalog-roles` supports the following subcommands: + +1. create +2. delete +3. get +4. list +5. update +6. grant +7. revoke + +#### create + +The `create` subcommand is used to create a new catalog role. 
+ +``` +input: polaris catalog-roles create --help +options: + create + Named arguments: + --catalog The name of an existing catalog + --property A key/value pair such as: tag=value. Multiple can be provided by specifying this option more than once + Positional arguments: + catalog_role +``` + +##### Examples + +``` +polaris catalog-roles create --property key=value --catalog some_catalog sales_data + +polaris catalog-roles create --catalog other_catalog sales_data +``` + +#### delete + +The `delete` subcommand is used to delete a catalog role. + +``` +input: polaris catalog-roles delete --help +options: + delete + Named arguments: + --catalog The name of an existing catalog + Positional arguments: + catalog_role +``` + +##### Examples + +``` +polaris catalog-roles delete --catalog some_catalog sales_data + +polaris catalog-roles delete --catalog other_catalog sales_data +``` + +#### get + +The `get` subcommand retrieves details about a catalog role. + +``` +input: polaris catalog-roles get --help +options: + get + Named arguments: + --catalog The name of an existing catalog + Positional arguments: + catalog_role +``` + +##### Examples + +``` +polaris catalog-roles get --catalog some_catalog inventory_data + +polaris catalog-roles get --catalog other_catalog inventory_data +``` + +#### list + +The `list` subcommand is used to print all catalog roles. Alternatively, if a principal role is provided, only catalog roles associated with that principal role are shown. + +``` +input: polaris catalog-roles list --help +options: + list + Named arguments: + --principal-role The name of a principal role + Positional arguments: + catalog +``` + +##### Examples + +``` +polaris catalog-roles list + +polaris catalog-roles list --principal-role data_engineer +``` + +#### update + +The `update` subcommand is used to update a catalog role. Currently, only updating properties associated with the catalog role is supported. 
+ +``` +input: polaris catalog-roles update --help +options: + update + Named arguments: + --catalog The name of an existing catalog + --property A key/value pair such as: tag=value. Multiple can be provided by specifying this option more than once + Positional arguments: + catalog_role +``` + +##### Examples + +``` +polaris catalog-roles update --property contains_pii=true --catalog some_catalog sales_data + +polaris catalog-roles update sales_data --catalog some_catalog --property key=value +``` + +#### grant + +The `grant` subcommand is used to grant a catalog role to a principal role. + +``` +input: polaris catalog-roles grant --help +options: + grant + Named arguments: + --catalog The name of an existing catalog + --principal-role The name of a catalog role + Positional arguments: + catalog_role +``` + +##### Examples + +``` +polaris catalog-roles grant sensitive_data --catalog some_catalog --principal-role power_user + +polaris catalog-roles grant --catalog sales_data contains_cc_info_catalog_role --principal-role financial_analyst_role +``` + +#### revoke + +The `revoke` subcommand is used to revoke a catalog role from a principal role. + +``` +input: polaris catalog-roles revoke --help +options: + revoke + Named arguments: + --catalog The name of an existing catalog + --principal-role The name of a catalog role + Positional arguments: + catalog_role +``` + +##### Examples + +``` +polaris catalog-roles revoke sensitive_data --catalog some_catalog --principal-role power_user + +polaris catalog-roles revoke --catalog sales_data contains_cc_info_catalog_role --principal-role financial_analyst_role +``` + +### Namespaces + +The `namespaces` command is used to manage namespaces within Polaris. + +`namespaces` supports the following subcommands: + +1. create +2. delete +3. get +4. list + +#### create + +The `create` subcommand is used to create a new namespace. 
+ +When creating a namespace with an explicit location, that location must reside within the parent catalog or namespace. + +``` +input: polaris namespaces create --help +options: + create + Named arguments: + --catalog The name of an existing catalog + --location If specified, the location at which to store the namespace and entities inside it + --property A key/value pair such as: tag=value. Multiple can be provided by specifying this option more than once + Positional arguments: + namespace +``` + +##### Examples + +``` +polaris namespaces create --catalog my_catalog outer + +polaris namespaces create --catalog my_catalog --location 's3://bucket/outer/inner_SUFFIX' outer.inner +``` + +#### delete + +The `delete` subcommand is used to delete a namespace. + +``` +input: polaris namespaces delete --help +options: + delete + Named arguments: + --catalog The name of an existing catalog + Positional arguments: + namespace +``` + +##### Examples + +``` +polaris namespaces delete outer_namespace.inner_namespace --catalog my_catalog + +polaris namespaces delete --catalog my_catalog outer_namespace +``` + +#### get + +The `get` subcommand retrieves details about a namespace. + +``` +input: polaris namespaces get --help +options: + get + Named arguments: + --catalog The name of an existing catalog + Positional arguments: + namespace +``` + +##### Examples + +``` +polaris namespaces get --catalog some_catalog a.b + +polaris namespaces get a.b.c --catalog some_catalog +``` + +#### list + +The `list` subcommand shows details about all namespaces directly within a catalog or, optionally, within some parent prefix in that catalog. 
+ +``` +input: polaris namespaces list --help +options: + list + Named arguments: + --catalog The name of an existing catalog + --parent If specified, list namespaces inside this parent namespace +``` + +##### Examples + +``` +polaris namespaces list --catalog my_catalog + +polaris namespaces list --catalog my_catalog --parent a + +polaris namespaces list --catalog my_catalog --parent a.b +``` + +### Privileges + +The `privileges` command is used to grant various privileges to a catalog role, or to revoke those privileges. Privileges can be on the level of a catalog, a namespace, a table, or a view. For more information on privileges, please refer to the [docs](./entities.md#privilege). + +Note that when using the `privileges` command, the relevant catalog and catalog role are specified with the `--catalog` and `--catalog-role` named arguments, which follow the subcommand and its action. + +`privileges` supports the following subcommands: + +1. list +2. catalog +3. namespace +4. table +5. view + +Each of these subcommands, except `list`, supports the `grant` and `revoke` actions and requires an action to be specified. + +Note that each subcommand's `revoke` action always accepts the same options that the corresponding `grant` action does, but with the addition of the `cascade` option. `cascade` is used to revoke all other privileges that depend on the specified privilege. + +#### list + +The `list` subcommand shows details about all privileges for a catalog role. + +``` +input: polaris privileges list --help +options: + list + Named arguments: + --catalog The name of an existing catalog + --catalog-role The name of a catalog role +``` + +##### Examples + +``` +polaris privileges list --catalog my_catalog --catalog-role my_role + +polaris privileges list --catalog-role my_other_role --catalog my_catalog +``` + +#### catalog + +The `catalog` subcommand manages privileges at the catalog level. `grant` is used to grant catalog privileges to the specified catalog role, and `revoke` is used to revoke them. 
+ +``` +input: polaris privileges catalog --help +options: + catalog + grant + Named arguments: + --catalog The name of an existing catalog + --catalog-role The name of a catalog role + Positional arguments: + privilege + revoke + Named arguments: + --cascade When revoking privileges, additionally revoke privileges that depend on the specified privilege + --catalog The name of an existing catalog + --catalog-role The name of a catalog role + Positional arguments: + privilege +``` + +##### Examples + +``` +polaris privileges \ + catalog \ + grant \ + --catalog my_catalog \ + --catalog-role catalog_role \ + TABLE_CREATE + +polaris privileges \ + catalog \ + revoke \ + --catalog my_catalog \ + --catalog-role catalog_role \ + --cascade \ + TABLE_CREATE +``` + +#### namespace + +The `namespace` subcommand manages privileges at the namespace level. + +``` +input: polaris privileges namespace --help +options: + namespace + grant + Named arguments: + --namespace A period-delimited namespace + --catalog The name of an existing catalog + --catalog-role The name of a catalog role + Positional arguments: + privilege + revoke + Named arguments: + --namespace A period-delimited namespace + --cascade When revoking privileges, additionally revoke privileges that depend on the specified privilege + --catalog The name of an existing catalog + --catalog-role The name of a catalog role + Positional arguments: + privilege +``` + +##### Examples + +``` +polaris privileges \ + namespace \ + grant \ + --catalog my_catalog \ + --catalog-role catalog_role \ + --namespace a.b \ + TABLE_LIST + +polaris privileges \ + namespace \ + revoke \ + --catalog my_catalog \ + --catalog-role catalog_role \ + --namespace a.b \ + TABLE_LIST +``` + +#### table + +The `table` subcommand manages privileges at the table level. 
+ +``` +input: polaris privileges table --help +options: + table + grant + Named arguments: + --namespace A period-delimited namespace + --table The name of a table + --catalog The name of an existing catalog + --catalog-role The name of a catalog role + Positional arguments: + privilege + revoke + Named arguments: + --namespace A period-delimited namespace + --table The name of a table + --cascade When revoking privileges, additionally revoke privileges that depend on the specified privilege + --catalog The name of an existing catalog + --catalog-role The name of a catalog role + Positional arguments: + privilege +``` + +##### Examples + +``` +polaris privileges \ + table \ + grant \ + --catalog my_catalog \ + --catalog-role catalog_role \ + --namespace a.b \ + --table t \ + TABLE_DROP + +polaris privileges \ + table \ + revoke \ + --catalog my_catalog \ + --catalog-role catalog_role \ + --namespace a.b \ + --table t \ + --cascade \ + TABLE_DROP +``` + +#### view + +The `view` subcommand manages privileges at the view level. 
+ +``` +input: polaris privileges view --help +options: + view + grant + Named arguments: + --namespace A period-delimited namespace + --view The name of a view + --catalog The name of an existing catalog + --catalog-role The name of a catalog role + Positional arguments: + privilege + revoke + Named arguments: + --namespace A period-delimited namespace + --view The name of a view + --cascade When revoking privileges, additionally revoke privileges that depend on the specified privilege + --catalog The name of an existing catalog + --catalog-role The name of a catalog role + Positional arguments: + privilege +``` + +##### Examples + +``` +polaris privileges \ + view \ + grant \ + --catalog my_catalog \ + --catalog-role catalog_role \ + --namespace a.b.c \ + --view v \ + VIEW_FULL_METADATA + +polaris privileges \ + view \ + revoke \ + --catalog my_catalog \ + --catalog-role catalog_role \ + --namespace a.b.c \ + --view v \ + --cascade \ + VIEW_FULL_METADATA +``` + +## Examples + +This section outlines example code for a few common operations as well as for some more complex ones. + +For especially complex operations, you may wish to instead directly use the Python API. 
+ +### Creating a principal and a catalog + +``` +polaris principals create my_user + +polaris catalogs create \ + --type internal \ + --storage-type s3 \ + --default-base-location s3://iceberg-bucket/polaris-base \ + --role-arn arn:aws:iam::111122223333:role/ExampleCorpRole \ + --allowed-location s3://iceberg-bucket/polaris-alt-location-1 \ + --allowed-location s3://iceberg-bucket/polaris-alt-location-2 \ + my_catalog +``` + +### Granting a principal the ability to manage the content of a catalog + +``` +polaris principal-roles create power_user +polaris principal-roles grant --principal my_user power_user + +polaris catalog-roles create --catalog my_catalog my_catalog_role +polaris catalog-roles grant \ + --catalog my_catalog \ + --principal-role power_user \ + my_catalog_role + +polaris privileges \ + catalog \ + grant \ + --catalog my_catalog \ + --catalog-role my_catalog_role \ + CATALOG_MANAGE_CONTENT +``` + +### Identifying the tables a given principal has been granted explicit access to read + +_Note that some other privileges, such as `CATALOG_MANAGE_CONTENT`, subsume `TABLE_READ_DATA` and would not be discovered here._ + +``` +catalog=my_catalog # example name; set to the catalog to inspect +principal_roles=$(polaris principal-roles list --principal my_principal) +for principal_role in ${principal_roles}; do + catalog_roles=$(polaris catalog-roles list --principal-role "${principal_role}") + for catalog_role in ${catalog_roles}; do + grants=$(polaris privileges list --catalog-role "${catalog_role}" --catalog "${catalog}") + for grant in $(echo "${grants}" | jq -c '.[] | select(.privilege == "TABLE_READ_DATA")'); do + echo "${grant}" + done + done +done +``` + + diff --git a/docs/entities.md b/docs/entities.md index c0cf1d650f..7b01ece075 100644 --- a/docs/entities.md +++ b/docs/entities.md @@ -32,7 +32,7 @@ For details on how to use Storage Types in the REST API, see [the API docs](../r A namespace is a logical entity that resides within a [catalog](#catalog) and can contain other entities such as 
[tables](#table) or [views](#view). Some other systems may refer to namespaces as _schemas_ or _databases_. -In Polaris, namespaces can be nested up to 16 levels. For example, `a.b.c.d.e.f.g` is a valid namespace. `b` is said to reside within `a`, and so on. +In Polaris, namespaces can be nested. For example, `a.b.c.d.e.f.g` is a valid namespace. `b` is said to reside within `a`, and so on. For information on managing namespaces with the REST API or for more information on what data can be associated with a namespace, see [the API docs](../regtests/client/python/docs/CreateNamespaceRequest.md). diff --git a/docs/quickstart.md b/docs/quickstart.md index 172c299267..797f6fe643 100644 --- a/docs/quickstart.md +++ b/docs/quickstart.md @@ -16,7 +16,7 @@ # Quick Start -This guide serves as a introduction to several key entities that can be managed with Polaris, describes how to build and deploy Polaris locally, and finally includes examples of how to use Polaris with Spark and Trino. +This guide serves as an introduction to several key entities that can be managed with Polaris, describes how to build and deploy Polaris locally, and finally includes examples of how to use Polaris with Apache Spark. ## Prerequisites @@ -39,23 +39,19 @@ git clone https://github.com/polaris-catalog/polaris.git #### With Docker -If you plan to deploy Polaris inside [Docker](https://www.docker.com/)], you'll need to install docker itself. For can be done using [homebrew](https://brew.sh/): +If you plan to deploy Polaris inside [Docker](https://www.docker.com/), you'll need to install Docker itself. For example, this can be done using [homebrew](https://brew.sh/): ``` -brew install docker +brew install --cask docker ``` -Once installed, make sure Docker is running. This can be done on macOS with: - -``` -open -a Docker -``` +Once installed, make sure Docker is running. #### From Source If you plan to build Polaris from source yourself, you will need to satisfy a few prerequisites first. 
-Polaris is built using [gradle](https://gradle.org/) and is compatible with Java 21. We recommend the use of [jenv](https://www.jenv.be/) to manage multiple Java versions. For example, to install Java 21 via [homebre]w(https://brew.sh/) and configure it with jenv: +Polaris is built using [gradle](https://gradle.org/) and is compatible with Java 21. We recommend the use of [jenv](https://www.jenv.be/) to manage multiple Java versions. For example, to install Java 21 via [homebrew](https://brew.sh/) and configure it with jenv: ``` cd ~/polaris @@ -77,13 +73,13 @@ If you want to connect to Polaris with [Apache Spark](https://spark.apache.org/) brew install git ``` -Then, clone Spark and check out a versioned branch. This guide uses [Spark 3.5.0](https://spark.apache.org/releases/spark-release-3-5-0.html). +Then, clone Spark and check out a versioned branch. This guide uses [Spark 3.5](https://spark.apache.org/releases/spark-release-3-5-0.html). ``` cd ~ git clone https://github.com/apache/spark.git cd ~/spark -git checkout branch-3.5.0 +git checkout branch-3.5 ``` ## Deploying Polaris @@ -128,7 +124,7 @@ For this tutorial, we'll launch an instance of Polaris that stores entities only When Polaris is launched using in-memory mode the root `CLIENT_ID` and `CLIENT_SECRET` can be found in stdout on initial startup. For example: ``` -Bootstrapped with credentials: {"client-id": "XXXX", "client-secret": "YYYY"} +realm: default-realm root principal credentials: XXXX:YYYY ``` Be sure to note of these credentials as we'll be using them below. 
@@ -230,10 +226,10 @@ In order to give this principal the ability to interact with the catalog, we mus --client-id ${CLIENT_ID} \ --client-secret ${CLIENT_SECRET} \ privileges \ - --catalog quickstart_catalog \ - --catalog-role quickstart_catalog_role \ catalog \ grant \ + --catalog quickstart_catalog \ + --catalog-role quickstart_catalog_role \ CATALOG_MANAGE_CONTENT ``` @@ -251,7 +247,7 @@ At this point, we’ve created a principal and granted it the ability to manage To use a Polaris-managed catalog in [Apache Spark](https://spark.apache.org/), we can configure Spark to use the Iceberg catalog REST API. -This guide uses [Apache Spark 3.5](https://spark.apache.org/releases/spark-release-3-5-0.html), but be sure to find [the appropriate iceberg-spark package for your Spark version](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-spark). With a local Spark clone, we on the `branch-3.5` branch we can run the following: +This guide uses [Apache Spark 3.5](https://spark.apache.org/releases/spark-release-3-5-0.html), but be sure to find [the appropriate iceberg-spark package for your Spark version](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-spark). From a local Spark clone on the `branch-3.5` branch we can run the following: _Note: the credentials provided here are those for our principal, not the root credentials._ @@ -311,10 +307,10 @@ If at any time access is revoked... --client-id ${CLIENT_ID} \ --client-secret ${CLIENT_SECRET} \ privileges \ - --catalog quickstart_catalog \ - --catalog-role quickstart_catalog_role \ catalog \ revoke \ + --catalog quickstart_catalog \ + --catalog-role quickstart_catalog_role \ CATALOG_MANAGE_CONTENT ``` diff --git a/polaris b/polaris index fb71f5c81b..255aaa04ae 100755 --- a/polaris +++ b/polaris @@ -31,5 +31,11 @@ fi pushd $SCRIPT_DIR > /dev/null PYTHONPATH=regtests/client/python ${SCRIPT_DIR}/polaris-venv/bin/python3 regtests/client/python/cli/polaris_cli.py "$@" +status=$? 
popd > /dev/null +if [ $status -ne 0 ]; then + exit 1 +fi + +exit 0 diff --git a/regtests/Dockerfile b/regtests/Dockerfile index 99d762af51..3d88d85061 100644 --- a/regtests/Dockerfile +++ b/regtests/Dockerfile @@ -34,8 +34,12 @@ WORKDIR /home/spark/regtests COPY ./setup.sh /home/spark/regtests/setup.sh COPY ./pyspark-setup.sh /home/spark/regtests/pyspark-setup.sh COPY ./client/python /home/spark/regtests/client/python +COPY ./polaris /home/spark -RUN ./setup.sh +RUN python3 -m venv /home/spark/polaris-venv && \ + . /home/spark/polaris-venv/bin/activate && \ + pip install poetry==1.5.0 && \ + deactivate COPY --chown=spark . /home/spark/regtests diff --git a/regtests/client/python/cli/command/__init__.py b/regtests/client/python/cli/command/__init__.py index 099e3bf0d7..6c94eee89f 100644 --- a/regtests/client/python/cli/command/__init__.py +++ b/regtests/client/python/cli/command/__init__.py @@ -93,12 +93,23 @@ def options_get(key, f=lambda x: x): action=options_get(f'{subcommand}_subcommand'), catalog_name=options_get(Arguments.CATALOG), catalog_role_name=options_get(Arguments.CATALOG_ROLE), - namespace=options_get(Arguments.NAMESPACE, lambda s: s.split('.')), + namespace=options_get(Arguments.NAMESPACE, lambda s: s.split('.') if s else None), view=options_get(Arguments.VIEW), table=options_get(Arguments.TABLE), privilege=options_get(Arguments.PRIVILEGE), cascade=options_get(Arguments.CASCADE) ) + elif options.command == Commands.NAMESPACES: + from cli.command.namespaces import NamespacesCommand + subcommand = options_get(f'{Commands.NAMESPACES}_subcommand') + command = NamespacesCommand( + subcommand, + catalog=options_get(Arguments.CATALOG), + namespace=options_get(Arguments.NAMESPACE, lambda s: s.split('.')), + parent=options_get(Arguments.PARENT, lambda s: s.split('.') if s else None), + location=options_get(Arguments.LOCATION), + properties=properties + ) if command is not None: command.validate() diff --git 
a/regtests/client/python/cli/command/catalog_roles.py b/regtests/client/python/cli/command/catalog_roles.py index 56b64dd96d..2901358e54 100644 --- a/regtests/client/python/cli/command/catalog_roles.py +++ b/regtests/client/python/cli/command/catalog_roles.py @@ -19,7 +19,8 @@ from pydantic import StrictStr from cli.command import Command -from cli.constants import Subcommands +from cli.constants import Subcommands, Arguments +from cli.options.option_tree import Argument from polaris.management import PolarisDefaultApi, CreateCatalogRoleRequest, CatalogRole, UpdateCatalogRoleRequest, \ GrantCatalogRoleRequest @@ -45,10 +46,10 @@ class CatalogRolesCommand(Command): def validate(self): if not self.catalog_name: - raise Exception("Missing required argument: --catalog") + raise Exception(f'Missing required argument: {Argument.to_flag_name(Arguments.CATALOG)}') if self.catalog_roles_subcommand in {Subcommands.GRANT, Subcommands.REVOKE}: if not self.principal_role_name: - raise Exception("Missing required argument: --principal") + raise Exception(f'Missing required argument: {Argument.to_flag_name(Arguments.PRINCIPAL_ROLE)}') def execute(self, api: PolarisDefaultApi) -> None: if self.catalog_roles_subcommand == Subcommands.CREATE: @@ -90,4 +91,4 @@ def execute(self, api: PolarisDefaultApi) -> None: api.revoke_catalog_role_from_principal_role( self.principal_role_name, self.catalog_name, self.catalog_role_name) else: - raise Exception(f"{self.catalog_roles_subcommand} is not supported in the CLI") + raise Exception(f'{self.catalog_roles_subcommand} is not supported in the CLI') diff --git a/regtests/client/python/cli/command/catalogs.py b/regtests/client/python/cli/command/catalogs.py index e4b3410ef4..8f79c0082c 100644 --- a/regtests/client/python/cli/command/catalogs.py +++ b/regtests/client/python/cli/command/catalogs.py @@ -19,7 +19,8 @@ from pydantic import StrictStr from cli.command import Command -from cli.constants import StorageType, CatalogType, Subcommands 
+from cli.constants import StorageType, CatalogType, Subcommands, Arguments +from cli.options.option_tree import Argument from polaris.management import PolarisDefaultApi, Catalog, CreateCatalogRequest, UpdateCatalogRequest, \ StorageConfigInfo, ExternalCatalog, AwsStorageConfigInfo, AzureStorageConfigInfo, GcpStorageConfigInfo, \ PolarisCatalog, CatalogProperties @@ -57,35 +58,42 @@ class CatalogsCommand(Command): def validate(self): if self.catalogs_subcommand == Subcommands.CREATE: if not self.storage_type: - raise Exception(f"Missing required argument:" - f" --storage-type") + raise Exception(f'Missing required argument:' + f' {Argument.to_flag_name(Arguments.STORAGE_TYPE)}') if not self.default_base_location: - raise Exception(f"Missing required argument:" - f" --default-base-location") - if self.catalog_type == CatalogType.EXTERNAL.value: - if not self.remote_url: - raise Exception(f"Missing required argument for {CatalogType.EXTERNAL.value} catalog:" - f" --remote-url") + raise Exception(f'Missing required argument:' + f' {Argument.to_flag_name(Arguments.DEFAULT_BASE_LOCATION)}') if self.catalogs_subcommand == Subcommands.UPDATE: if self.allowed_locations: if not self.storage_type: - raise Exception(f"Missing required argument when updating allowed locations for a catalog:" - f" --storage-type") + raise Exception(f'Missing required argument when updating allowed locations for a catalog:' + f' {Argument.to_flag_name(Arguments.STORAGE_TYPE)}') if self.storage_type == StorageType.S3.value: if not self.role_arn: - raise Exception("Missing required argument for storage type 's3': --role-arn") + raise Exception(f"Missing required argument for storage type 's3':" + f" {Argument.to_flag_name(Arguments.ROLE_ARN)}") if self._has_azure_storage_info() or self._has_gcs_storage_info(): - raise Exception("Storage type 's3' supports the storage configurations --role-arn, " - "--external-id, and --user-arn") + raise Exception(f"Storage type 's3' supports the storage 
credentials" + f" {Argument.to_flag_name(Arguments.ROLE_ARN)}," + f" {Argument.to_flag_name(Arguments.EXTERNAL_ID)}, and" + f" {Argument.to_flag_name(Arguments.USER_ARN)}") elif self.storage_type == StorageType.AZURE.value: if not self.tenant_id: - raise Exception("Missing required argument for storage type 'azure': --tenant-id") + raise Exception("Missing required argument for storage type 'azure':" + f" {Argument.to_flag_name(Arguments.TENANT_ID)}") if self._has_aws_storage_info() or self._has_gcs_storage_info(): - raise Exception("Storage type 'azure' supports the storage configurations --tenant-id, " - "--multi-tenant-app-name, and --consent-url") - elif self._has_aws_storage_info() or self._has_azure_storage_info(): - raise Exception("Storage type 'gcs' supports the storage configuration: --service-account") + raise Exception("Storage type 'azure' supports the storage credentials" + f" {Argument.to_flag_name(Arguments.TENANT_ID)}," + f" {Argument.to_flag_name(Arguments.MULTI_TENANT_APP_NAME)}, and" + f" {Argument.to_flag_name(Arguments.CONSENT_URL)}") + elif self.storage_type == StorageType.GCS.value: + if self._has_aws_storage_info() or self._has_azure_storage_info(): + raise Exception("Storage type 'gcs' supports the storage credential" + f" {Argument.to_flag_name(Arguments.SERVICE_ACCOUNT)}") + elif self.storage_type == StorageType.FILE.value: + if self._has_aws_storage_info() or self._has_azure_storage_info() or self._has_gcs_storage_info(): + raise Exception("Storage type 'file' does not support any storage credentials") def _has_aws_storage_info(self): return self.role_arn or self.external_id or self.user_arn @@ -121,6 +129,11 @@ def _build_storage_config_info(self): tenant_id=self.tenant_id, multi_tenant_app_name=self.multi_tenant_app_name ) + elif self.storage_type == StorageType.FILE.value: + config = StorageConfigInfo( + storage_type=self.storage_type.upper(), + allowed_locations=self.allowed_locations + ) return config def execute(self, api: 
PolarisDefaultApi) -> None: @@ -161,17 +174,17 @@ def execute(self, api: PolarisDefaultApi) -> None: print(catalog.to_json()) elif self.catalogs_subcommand == Subcommands.UPDATE: catalog = api.get_catalog(self.catalog_name) - default_base_location_properties = {} - if self.default_base_location: - default_base_location_properties = {'default-base-location': self.default_base_location} - catalog.properties = {**default_base_location_properties, **self.properties} - + if self.default_base_location or self.properties: + catalog.properties = CatalogProperties( + default_base_location=self.default_base_location, + additional_properties=self.properties + ) request = UpdateCatalogRequest( current_entity_version=catalog.entity_version, catalog=catalog ) - if (self.allowed_locations or self._has_aws_storage_info() or self._has_azure_storage_info() or - self._has_gcs_storage_info()): + if (self._has_aws_storage_info() or self._has_azure_storage_info() or self._has_gcs_storage_info() or + self.allowed_locations or self.default_base_location): request = UpdateCatalogRequest( current_entity_version=catalog.entity_version, catalog=catalog, @@ -180,5 +193,5 @@ def execute(self, api: PolarisDefaultApi) -> None: api.update_catalog(self.catalog_name, request) else: - raise Exception(f"{self.catalogs_subcommand} is not supported in the CLI") + raise Exception(f'{self.catalogs_subcommand} is not supported in the CLI') diff --git a/regtests/client/python/cli/command/namespaces.py b/regtests/client/python/cli/command/namespaces.py new file mode 100644 index 0000000000..39357952a9 --- /dev/null +++ b/regtests/client/python/cli/command/namespaces.py @@ -0,0 +1,79 @@ +import json +import re +from dataclasses import dataclass +from typing import Dict, Optional, List + +from pydantic import StrictStr + +from cli.command import Command +from cli.constants import Subcommands, Arguments, UNIT_SEPARATOR +from cli.options.option_tree import Argument +from polaris.catalog import IcebergCatalogAPI, 
CreateNamespaceRequest, ApiClient, Configuration +from polaris.catalog.exceptions import NotFoundException +from polaris.management import PolarisDefaultApi + + +@dataclass +class NamespacesCommand(Command): + """ + A Command implementation to represent `polaris namespaces`. The instance attributes correspond to parameters + that can be provided to various subcommands + + Example commands: + * ./polaris namespaces create --catalog my_catalog my_namespace + * ./polaris namespaces list --catalog my_catalog + * ./polaris namespaces delete --catalog my_catalog my_namespace.inner + """ + + namespaces_subcommand: str + catalog: str + namespace: List[StrictStr] + parent: List[StrictStr] + location: str + properties: Optional[Dict[str, StrictStr]] + + def validate(self): + if not self.catalog: + raise Exception(f'Missing required argument:' + f' {Argument.to_flag_name(Arguments.CATALOG)}') + + def _get_catalog_api(self, api: PolarisDefaultApi): + """ + Convert a management API to a catalog API + """ + catalog_host = re.match(r'(http://[^/]+)', api.api_client.configuration.host).group(1) + configuration = Configuration( + host=f'{catalog_host}/api/catalog', + username=api.api_client.configuration.username, + password=api.api_client.configuration.password, + access_token=api.api_client.configuration.access_token, + ) + return IcebergCatalogAPI(ApiClient(configuration)) + + def execute(self, api: PolarisDefaultApi) -> None: + catalog_api = self._get_catalog_api(api) + if self.namespaces_subcommand == Subcommands.CREATE: + properties = self.properties or {} + if self.location: + properties = {**properties, Arguments.LOCATION: self.location} + request = CreateNamespaceRequest( + namespace=self.namespace, + properties=properties + ) + catalog_api.create_namespace( + prefix=self.catalog, + create_namespace_request=request) + elif self.namespaces_subcommand == Subcommands.LIST: + if self.parent is not None: + result = catalog_api.list_namespaces(prefix=self.catalog, 
parent=UNIT_SEPARATOR.join(self.parent)) + else: + result = catalog_api.list_namespaces(prefix=self.catalog) + for namespace in result.namespaces: + print(json.dumps({"namespace": '.'.join(namespace)})) + elif self.namespaces_subcommand == Subcommands.DELETE: + catalog_api.drop_namespace(prefix=self.catalog, namespace=UNIT_SEPARATOR.join(self.namespace)) + elif self.namespaces_subcommand == Subcommands.GET: + catalog_api.namespace_exists(prefix=self.catalog, namespace=UNIT_SEPARATOR.join(self.namespace)) + print(json.dumps({"namespace": '.'.join(self.namespace)})) + else: + raise Exception(f"{self.namespaces_subcommand} is not supported in the CLI") diff --git a/regtests/client/python/cli/command/principal_roles.py b/regtests/client/python/cli/command/principal_roles.py index cfbb440714..b017901211 100644 --- a/regtests/client/python/cli/command/principal_roles.py +++ b/regtests/client/python/cli/command/principal_roles.py @@ -19,7 +19,8 @@ from pydantic import StrictStr from cli.command import Command -from cli.constants import Subcommands +from cli.constants import Subcommands, Arguments +from cli.options.option_tree import Argument from polaris.management import PolarisDefaultApi, CreatePrincipalRoleRequest, PrincipalRole, UpdatePrincipalRoleRequest, \ GrantCatalogRoleRequest, CatalogRole, GrantPrincipalRoleRequest @@ -46,10 +47,12 @@ class PrincipalRolesCommand(Command): def validate(self): if self.principal_roles_subcommand == Subcommands.LIST: if self.principal_name and self.catalog_role_name: - raise Exception('You may provide either --principal or --catalog-role, but not both') + raise Exception(f'You may provide either {Argument.to_flag_name(Arguments.PRINCIPAL)} or' + f' {Argument.to_flag_name(Arguments.CATALOG_ROLE)}, but not both') if self.principal_roles_subcommand in {Subcommands.GRANT, Subcommands.REVOKE}: if not self.principal_name: - raise Exception(f"Missing required argument for {self.principal_roles_subcommand}: --principal") + raise 
Exception(f'Missing required argument for {self.principal_roles_subcommand}:' + f' {Argument.to_flag_name(Arguments.PRINCIPAL)}') def execute(self, api: PolarisDefaultApi) -> None: if self.principal_roles_subcommand == Subcommands.CREATE: diff --git a/regtests/client/python/cli/command/principals.py b/regtests/client/python/cli/command/principals.py index f8174a38a2..85c3cfe799 100644 --- a/regtests/client/python/cli/command/principals.py +++ b/regtests/client/python/cli/command/principals.py @@ -20,8 +20,7 @@ from cli.command import Command from cli.constants import Subcommands -from polaris.management import PolarisDefaultApi, CreatePrincipalRequest, Principal, UpdatePrincipalRequest, \ - GrantPrincipalRoleRequest, PrincipalRole +from polaris.management import PolarisDefaultApi, CreatePrincipalRequest, Principal, UpdatePrincipalRequest @dataclass diff --git a/regtests/client/python/cli/command/privileges.py b/regtests/client/python/cli/command/privileges.py index 92b876eef5..108df21503 100644 --- a/regtests/client/python/cli/command/privileges.py +++ b/regtests/client/python/cli/command/privileges.py @@ -19,7 +19,8 @@ from pydantic import StrictStr from cli.command import Command -from cli.constants import Subcommands, Actions +from cli.constants import Subcommands, Actions, Arguments +from cli.options.option_tree import Argument from polaris.management import PolarisDefaultApi, AddGrantRequest, NamespaceGrant, \ RevokeGrantRequest, CatalogGrant, TableGrant, ViewGrant, CatalogPrivilege, NamespacePrivilege, TablePrivilege, \ ViewPrivilege @@ -33,9 +34,9 @@ class PrivilegesCommand(Command): `action`, represent parameters provided to either the `grant` or `revoke` action. 
Example commands: - * ./polaris privileges --catalog c --catalog-role cr table grant --namespace n --table t PRIVILEGE_NAME - * ./polaris privileges --catalog c --catalog-role cr namespace revoke --namespace n PRIVILEGE_NAME - * ./polaris privileges -catalog c --catalog-role cr list + * ./polaris privileges table grant --catalog c --catalog-role cr --namespace n --table t PRIVILEGE_NAME + * ./polaris privileges namespace revoke --catalog c --catalog-role cr --namespace n PRIVILEGE_NAME + * ./polaris privileges list --catalog c --catalog-role cr """ privileges_subcommand: str @@ -50,13 +51,15 @@ class PrivilegesCommand(Command): def validate(self): if not self.catalog_name: - raise Exception('Missing required argument: --catalog') + raise Exception(f'Missing required argument: {Argument.to_flag_name(Arguments.CATALOG)}') if not self.catalog_role_name: - raise Exception('Missing required argument: --catalog-role') + raise Exception(f'Missing required argument: {Argument.to_flag_name(Arguments.CATALOG_ROLE)}') + if not self.privileges_subcommand: + raise Exception('A subcommand must be provided') if (self.privileges_subcommand in {Subcommands.NAMESPACE, Subcommands.TABLE, Subcommands.VIEW} and not self.namespace): - raise Exception('Missing required argument: --namespace') + raise Exception(f'Missing required argument: {Argument.to_flag_name(Arguments.NAMESPACE)}') if self.action == Actions.GRANT and self.cascade: raise Exception('Unrecognized argument for GRANT: --cascade') diff --git a/regtests/client/python/cli/constants.py b/regtests/client/python/cli/constants.py index 210debaacd..4f47adfa81 100644 --- a/regtests/client/python/cli/constants.py +++ b/regtests/client/python/cli/constants.py @@ -18,12 +18,13 @@ class StorageType(Enum): """ - Represents a Storage Type within the Polaris API -- `s3`, `azure`, or `gcs`. + Represents a Storage Type within the Polaris API -- `s3`, `azure`, `gcs`, or `file`. 
""" S3 = 's3' AZURE = 'azure' GCS = 'gcs' + FILE = 'file' class CatalogType(Enum): @@ -53,6 +54,7 @@ class Commands: PRINCIPAL_ROLES = 'principal-roles' CATALOG_ROLES = 'catalog-roles' PRIVILEGES = 'privileges' + NAMESPACES = 'namespaces' class Subcommands: @@ -117,6 +119,12 @@ class Arguments: TABLE = 'table' VIEW = 'view' CASCADE = 'cascade' + CLIENT_SECRET = 'client_secret' + ACCESS_TOKEN = 'access_token' + HOST = 'host' + PORT = 'port' + PARENT = 'parent' + LOCATION = 'location' class Hints: @@ -135,24 +143,25 @@ class Catalogs: class Create: TYPE = 'The type of catalog to create in [INTERNAL, EXTERNAL]. INTERNAL by default.' - REMOTE_URL = '(Only for external catalogs) The remote URL to use' - DEFAULT_BASE_LOCATION = '(Required for internal catalogs) Default base location of the catalog' - STORAGE_TYPE = '(Required for internal catalogs) The type of storage to use for the catalog' - ALLOWED_LOCATION = ('(For internal catalogs) An allowed location for files tracked by the catalog. ' + REMOTE_URL = '(For external catalogs) The remote URL to use' + DEFAULT_BASE_LOCATION = '(Required) Default base location of the catalog' + STORAGE_TYPE = '(Required) The type of storage to use for the catalog' + ALLOWED_LOCATION = ('An allowed location for files tracked by the catalog. 
' 'Multiple locations can be provided by specifying this option more than once.') - ROLE_ARN = '(Required for AWS) A role ARN to use when connecting to S3' - EXTERNAL_ID = '(Only for AWS) The external Id to use when connecting to S3' - USER_ARN = '(Only for AWS) A user ARN to use when connecting to S3' + ROLE_ARN = '(Required for S3) A role ARN to use when connecting to S3' + EXTERNAL_ID = '(Only for S3) The external ID to use when connecting to S3' + USER_ARN = '(Only for S3) A user ARN to use when connecting to S3' TENANT_ID = '(Required for Azure) A tenant ID to use when connecting to Azure Storage' MULTI_TENANT_APP_NAME = '(Only for Azure) The app name to use when connecting to Azure Storage' CONSENT_URL = '(Only for Azure) A consent URL granting permissions for the Azure Storage location' - SERVICE_ACCOUNT = '(Only for GCP) The service account to use when connecting to GCS' + SERVICE_ACCOUNT = '(Only for GCS) The service account to use when connecting to GCS' class Principals: class Create: + TYPE = 'The type of principal to create in [SERVICE]' NAME = 'The principal name' CLIENT_ID = 'The output-only OAuth clientId associated with this principal if applicable' @@ -179,15 +188,12 @@ class List: ' principal.') class CatalogRoles: - CATALOG_NAME = 'The name of a catalog' + CATALOG_NAME = 'The name of an existing catalog' CATALOG_ROLE = 'The name of a catalog role' LIST = 'List catalog roles within a catalog. Optionally, specify a principal role.' REVOKE_CATALOG_ROLE = 'Revoke a catalog role from a principal role' GRANT_CATALOG_ROLE = 'Grant a catalog role to a principal role' - class Create: - CATALOG_NAME = 'The name of an existing catalog' - class Grant: CATALOG_NAME = 'The name of a catalog' CATALOG_ROLE = 'The name of a catalog role' @@ -195,6 +201,13 @@ class Grant: NAMESPACE = 'A period-delimited namespace' TABLE = 'The name of a table' VIEW = 'The name of a view' - ADD = 'Add a grant. 
Either this or --revoke must be specified except when the subcommand is `list`' - REVOKE = 'Revoke a grant. Either this or --add must be specified except when the subcommand is `list`' CASCADE = 'When revoking privileges, additionally revoke privileges that depend on the specified privilege' + + class Namespaces: + LOCATION = 'If specified, the location at which to store the namespace and entities inside it' + PARENT = 'If specified, list namespaces inside this parent namespace' + + +UNIT_SEPARATOR = chr(0x1F) +CLIENT_ID_ENV = 'CLIENT_ID' +CLIENT_SECRET_ENV = 'CLIENT_SECRET' diff --git a/regtests/client/python/cli/options/option_tree.py b/regtests/client/python/cli/options/option_tree.py index 5c3a1f3b97..7df4309b35 100644 --- a/regtests/client/python/cli/options/option_tree.py +++ b/regtests/client/python/cli/options/option_tree.py @@ -32,14 +32,18 @@ class Argument: lower: bool = False allow_repeats: bool = False default: object = None - flag_name = None def __post_init__(self): if self.name.startswith('--'): raise Exception(f'Argument name {self.name} starts with `--`: should this be a flag_name?') + @staticmethod + def to_flag_name(argument_name): + return '--' + argument_name.replace('_', '-') + def get_flag_name(self): - return self.flag_name or ('--' + self.name.replace('_', '-')) + return Argument.to_flag_name(self.name) + @dataclass @@ -62,17 +66,9 @@ class OptionTree: configuration of the CLI and to generate a custom `--help` message including nested commands. 
""" - _STORAGE_CONFIG_INFO = [ - Argument(Arguments.STORAGE_TYPE, str, Hints.Catalogs.Create.STORAGE_TYPE, lower=True, - choices=[st.value for st in StorageType]), - Argument(Arguments.ALLOWED_LOCATION, str, Hints.Catalogs.Create.ALLOWED_LOCATION, allow_repeats=True), - Argument(Arguments.ROLE_ARN, str, Hints.Catalogs.Create.ROLE_ARN), - Argument(Arguments.EXTERNAL_ID, str, Hints.Catalogs.Create.EXTERNAL_ID), - Argument(Arguments.USER_ARN, str, Hints.Catalogs.Create.USER_ARN), - Argument(Arguments.TENANT_ID, str, Hints.Catalogs.Create.TENANT_ID), - Argument(Arguments.MULTI_TENANT_APP_NAME, str, Hints.Catalogs.Create.MULTI_TENANT_APP_NAME), - Argument(Arguments.CONSENT_URL, str, Hints.Catalogs.Create.CONSENT_URL), - Argument(Arguments.SERVICE_ACCOUNT, str, Hints.Catalogs.Create.SERVICE_ACCOUNT), + _CATALOG_ROLE_AND_CATALOG = [ + Argument(Arguments.CATALOG, str, Hints.CatalogRoles.CATALOG_NAME), + Argument(Arguments.CATALOG_ROLE, str, Hints.CatalogRoles.CATALOG_ROLE) ] @staticmethod @@ -82,25 +78,36 @@ def get_tree() -> List[Option]: Option(Subcommands.CREATE, args=[ Argument(Arguments.TYPE, str, Hints.Catalogs.Create.TYPE, lower=True, choices=[ct.value for ct in CatalogType], default=CatalogType.INTERNAL.value), - Argument(Arguments.REMOTE_URL, str, Hints.Catalogs.Create.REMOTE_URL), + Argument(Arguments.STORAGE_TYPE, str, Hints.Catalogs.Create.STORAGE_TYPE, lower=True, + choices=[st.value for st in StorageType]), Argument(Arguments.DEFAULT_BASE_LOCATION, str, Hints.Catalogs.Create.DEFAULT_BASE_LOCATION), + Argument(Arguments.ALLOWED_LOCATION, str, Hints.Catalogs.Create.ALLOWED_LOCATION, + allow_repeats=True), + Argument(Arguments.ROLE_ARN, str, Hints.Catalogs.Create.ROLE_ARN), + Argument(Arguments.EXTERNAL_ID, str, Hints.Catalogs.Create.EXTERNAL_ID), + Argument(Arguments.TENANT_ID, str, Hints.Catalogs.Create.TENANT_ID), + Argument(Arguments.MULTI_TENANT_APP_NAME, str, Hints.Catalogs.Create.MULTI_TENANT_APP_NAME), + Argument(Arguments.CONSENT_URL, str, 
Hints.Catalogs.Create.CONSENT_URL), + Argument(Arguments.SERVICE_ACCOUNT, str, Hints.Catalogs.Create.SERVICE_ACCOUNT), + Argument(Arguments.REMOTE_URL, str, Hints.Catalogs.Create.REMOTE_URL), Argument(Arguments.PROPERTY, str, Hints.PROPERTY, allow_repeats=True), - ] + OptionTree._STORAGE_CONFIG_INFO, input_name=Arguments.CATALOG), + ], input_name=Arguments.CATALOG), Option(Subcommands.DELETE, input_name=Arguments.CATALOG), Option(Subcommands.GET, input_name=Arguments.CATALOG), Option(Subcommands.LIST, args=[ Argument(Arguments.PRINCIPAL_ROLE, str, Hints.PrincipalRoles.PRINCIPAL_ROLE) ]), Option(Subcommands.UPDATE, args=[ - Argument(Arguments.PROPERTY, str, Hints.PROPERTY, allow_repeats=True), Argument(Arguments.DEFAULT_BASE_LOCATION, str, Hints.Catalogs.Create.DEFAULT_BASE_LOCATION), - ] + OptionTree._STORAGE_CONFIG_INFO, input_name=Arguments.CATALOG) + Argument(Arguments.ALLOWED_LOCATION, str, Hints.Catalogs.Create.ALLOWED_LOCATION, + allow_repeats=True), + Argument(Arguments.PROPERTY, str, Hints.PROPERTY, allow_repeats=True), + ], input_name=Arguments.CATALOG) ]), Option(Commands.PRINCIPALS, 'manage principals', children=[ Option(Subcommands.CREATE, args=[ - Argument(Arguments.TYPE, str, Hints.Catalogs.Create.TYPE, lower=True, + Argument(Arguments.TYPE, str, Hints.Principals.Create.TYPE, lower=True, choices=[pt.value for pt in PrincipalType], default=PrincipalType.SERVICE.value), - Argument(Arguments.CLIENT_ID, str, Hints.Principals.Create.CLIENT_ID), Argument(Arguments.PROPERTY, str, Hints.PROPERTY, allow_repeats=True) ], input_name=Arguments.PRINCIPAL), Option(Subcommands.DELETE, input_name=Arguments.PRINCIPAL), @@ -133,20 +140,20 @@ def get_tree() -> List[Option]: ]), Option(Commands.CATALOG_ROLES, 'manage catalog roles', children=[ Option(Subcommands.CREATE, args=[ - Argument(Arguments.CATALOG, str, Hints.CatalogRoles.Create.CATALOG_NAME), + Argument(Arguments.CATALOG, str, Hints.CatalogRoles.CATALOG_NAME), Argument(Arguments.PROPERTY, str, Hints.PROPERTY, 
allow_repeats=True) ], input_name=Arguments.CATALOG_ROLE), Option(Subcommands.DELETE, args=[ - Argument(Arguments.CATALOG, str, Hints.CatalogRoles.Create.CATALOG_NAME), + Argument(Arguments.CATALOG, str, Hints.CatalogRoles.CATALOG_NAME), ], input_name=Arguments.CATALOG_ROLE), Option(Subcommands.GET, args=[ - Argument(Arguments.CATALOG, str, Hints.CatalogRoles.Create.CATALOG_NAME), + Argument(Arguments.CATALOG, str, Hints.CatalogRoles.CATALOG_NAME), ], input_name=Arguments.CATALOG_ROLE), Option(Subcommands.LIST, hint=Hints.CatalogRoles.LIST, args=[ Argument(Arguments.PRINCIPAL_ROLE, str, Hints.PrincipalRoles.PRINCIPAL_ROLE) ], input_name=Arguments.CATALOG), Option(Subcommands.UPDATE, args=[ - Argument(Arguments.CATALOG, str, Hints.CatalogRoles.Create.CATALOG_NAME), + Argument(Arguments.CATALOG, str, Hints.CatalogRoles.CATALOG_NAME), Argument(Arguments.PROPERTY, str, Hints.PROPERTY, allow_repeats=True) ], input_name=Arguments.CATALOG_ROLE), Option(Subcommands.GRANT, hint=Hints.CatalogRoles.GRANT_CATALOG_ROLE, args=[ @@ -158,47 +165,61 @@ def get_tree() -> List[Option]: Argument(Arguments.PRINCIPAL_ROLE, str, Hints.CatalogRoles.CATALOG_ROLE) ], input_name=Arguments.CATALOG_ROLE) ]), - Option(Commands.PRIVILEGES, 'manage privileges for a catalog role', args=[ - Argument(Arguments.CATALOG, str, Hints.CatalogRoles.Create.CATALOG_NAME), - Argument(Arguments.CATALOG_ROLE, str, Hints.CatalogRoles.CATALOG_ROLE) - ], children=[ - Option(Subcommands.LIST), + Option(Commands.PRIVILEGES, 'manage privileges for a catalog role', children=[ + Option(Subcommands.LIST, args=OptionTree._CATALOG_ROLE_AND_CATALOG), Option(Subcommands.CATALOG, children=[ - Option(Actions.GRANT, input_name=Arguments.PRIVILEGE), + Option(Actions.GRANT, args=OptionTree._CATALOG_ROLE_AND_CATALOG, input_name=Arguments.PRIVILEGE), Option(Actions.REVOKE, args=[ Argument(Arguments.CASCADE, bool, Hints.Grant.CASCADE) - ], input_name=Arguments.PRIVILEGE), + ] + OptionTree._CATALOG_ROLE_AND_CATALOG, 
input_name=Arguments.PRIVILEGE), ]), Option(Subcommands.NAMESPACE, children=[ Option(Actions.GRANT, args=[ Argument(Arguments.NAMESPACE, str, Hints.Grant.NAMESPACE) - ], input_name=Arguments.PRIVILEGE), + ] + OptionTree._CATALOG_ROLE_AND_CATALOG, input_name=Arguments.PRIVILEGE), Option(Actions.REVOKE, args=[ Argument(Arguments.NAMESPACE, str, Hints.Grant.NAMESPACE), Argument(Arguments.CASCADE, bool, Hints.Grant.CASCADE) - ], input_name=Arguments.PRIVILEGE), + ] + OptionTree._CATALOG_ROLE_AND_CATALOG, input_name=Arguments.PRIVILEGE), ]), Option(Subcommands.TABLE, children=[ Option(Actions.GRANT, args=[ Argument(Arguments.NAMESPACE, str, Hints.Grant.NAMESPACE), Argument(Arguments.TABLE, str, Hints.Grant.TABLE) - ], input_name=Arguments.PRIVILEGE), + ] + OptionTree._CATALOG_ROLE_AND_CATALOG, input_name=Arguments.PRIVILEGE), Option(Actions.REVOKE, args=[ Argument(Arguments.NAMESPACE, str, Hints.Grant.NAMESPACE), Argument(Arguments.TABLE, str, Hints.Grant.TABLE), Argument(Arguments.CASCADE, bool, Hints.Grant.CASCADE) - ], input_name=Arguments.PRIVILEGE), + ] + OptionTree._CATALOG_ROLE_AND_CATALOG, input_name=Arguments.PRIVILEGE), ]), Option(Subcommands.VIEW, children=[ Option(Actions.GRANT, args=[ Argument(Arguments.NAMESPACE, str, Hints.Grant.NAMESPACE), Argument(Arguments.VIEW, str, Hints.Grant.VIEW) - ], input_name=Arguments.PRIVILEGE), + ] + OptionTree._CATALOG_ROLE_AND_CATALOG, input_name=Arguments.PRIVILEGE), Option(Actions.REVOKE, args=[ Argument(Arguments.NAMESPACE, str, Hints.Grant.NAMESPACE), Argument(Arguments.VIEW, str, Hints.Grant.VIEW), Argument(Arguments.CASCADE, bool, Hints.Grant.CASCADE) - ], input_name=Arguments.PRIVILEGE), + ] + OptionTree._CATALOG_ROLE_AND_CATALOG, input_name=Arguments.PRIVILEGE), ]) + ]), + Option(Commands.NAMESPACES, 'manage namespaces', children=[ + Option(Subcommands.CREATE, args=[ + Argument(Arguments.CATALOG, str, Hints.CatalogRoles.CATALOG_NAME), + Argument(Arguments.LOCATION, str, Hints.Namespaces.LOCATION), + 
Argument(Arguments.PROPERTY, str, Hints.PROPERTY, allow_repeats=True) + ], input_name=Arguments.NAMESPACE), + Option(Subcommands.LIST, args=[ + Argument(Arguments.CATALOG, str, Hints.CatalogRoles.CATALOG_NAME), + Argument(Arguments.PARENT, str, Hints.Namespaces.PARENT) + ]), + Option(Subcommands.DELETE, args=[ + Argument(Arguments.CATALOG, str, Hints.CatalogRoles.CATALOG_NAME) + ], input_name=Arguments.NAMESPACE), + Option(Subcommands.GET, args=[ + Argument(Arguments.CATALOG, str, Hints.CatalogRoles.CATALOG_NAME) + ], input_name=Arguments.NAMESPACE), ]) ] diff --git a/regtests/client/python/cli/options/parser.py b/regtests/client/python/cli/options/parser.py index 18a281bc93..863194ddd1 100644 --- a/regtests/client/python/cli/options/parser.py +++ b/regtests/client/python/cli/options/parser.py @@ -17,6 +17,7 @@ import sys from typing import List, Optional, Dict +from cli.constants import Arguments from cli.options.option_tree import OptionTree, Option, Argument @@ -33,11 +34,11 @@ class Parser(object): """ _ROOT_ARGUMENTS = [ - Argument('host', str, hint='hostname', default='localhost'), - Argument('port', int, hint='port', default=8181), - Argument('client-id', str, hint='client ID for token-based authentication'), - Argument('client-secret', str, hint='client secret for token-based authentication'), - Argument('access-token', str, hint='access token for token-based authentication'), + Argument(Arguments.HOST, str, hint='hostname', default='localhost'), + Argument(Arguments.PORT, int, hint='port', default=8181), + Argument(Arguments.CLIENT_ID, str, hint='client ID for token-based authentication'), + Argument(Arguments.CLIENT_SECRET, str, hint='client secret for token-based authentication'), + Argument(Arguments.ACCESS_TOKEN, str, hint='access token for token-based authentication'), ] @staticmethod @@ -151,13 +152,19 @@ def _get_tree_for_option(self, option: Option, indent=1) -> str: result = "" result += (TreeHelpParser.INDENT * indent) + option.name + if 
option.args: + result += '\n' + (TreeHelpParser.INDENT * (indent + 1)) + "Named arguments:" for arg in option.args: - result += '\n' + (TreeHelpParser.INDENT * (indent + 1)) + f"{arg.get_flag_name()} {arg.hint}" + result += '\n' + (TreeHelpParser.INDENT * (indent + 2)) + f"{arg.get_flag_name()} {arg.hint}" + + if option.input_name: + result += '\n' + (TreeHelpParser.INDENT * (indent + 1)) + "Positional arguments:" + result += '\n' + (TreeHelpParser.INDENT * (indent + 2)) + option.input_name if len(option.args) > 0 and len(option.children) > 0: result += '\n' - for child in option.children: + for child in sorted(option.children, key=lambda o: o.name): result += '\n' + self._get_tree_for_option(child, indent + 1) return result diff --git a/regtests/client/python/cli/polaris_cli.py b/regtests/client/python/cli/polaris_cli.py index 2b0d1ff1e0..bfe0b6eb22 100644 --- a/regtests/client/python/cli/polaris_cli.py +++ b/regtests/client/python/cli/polaris_cli.py @@ -13,8 +13,16 @@ # See the License for the specific language governing permissions and # limitations under the License. 
# + +import json +import os +import sys +from json import JSONDecodeError + +from cli.constants import Arguments, CLIENT_ID_ENV, CLIENT_SECRET_ENV +from cli.options.option_tree import Argument from cli.options.parser import Parser -from polaris.management import ApiClient, Configuration, ApiException +from polaris.management import ApiClient, Configuration from polaris.management import PolarisDefaultApi @@ -30,9 +38,12 @@ class PolarisCli: * ./polaris --client-id ${id} --client-secret ${secret} --host ${hostname} catalog-roles list """ + # Can be enabled if the client is able to authenticate directly without first fetching a token + DIRECT_AUTHENTICATION_ENABLED = False + @staticmethod - def execute(): - options = Parser.parse() + def execute(args=None): + options = Parser.parse(args) client_builder = PolarisCli._get_client_builder(options) with client_builder() as api_client: try: @@ -40,10 +51,39 @@ def execute(): admin_api = PolarisDefaultApi(api_client) command = Command.from_options(options) command.execute(admin_api) - except ApiException as e: - import json - error = json.loads(e.body)['error'] - print(f'Exception when communicating with the Polaris server. {error["type"]}: {error["message"]}') + except Exception as e: + PolarisCli._try_print_exception(e) + sys.exit(1) + + @staticmethod + def _try_print_exception(e): + try: + error = json.loads(e.body)['error'] + sys.stderr.write(f'Exception when communicating with the Polaris server.' + f' {error["type"]}: {error["message"]}{os.linesep}') + except JSONDecodeError as _: + sys.stderr.write(f'Exception when communicating with the Polaris server.' + f' {e.status}: {e.reason}{os.linesep}') + except Exception as _: + sys.stderr.write(f'Exception when communicating with the Polaris server.' 
+ f' {e}{os.linesep}') + + @staticmethod + def _get_token(api_client: ApiClient, catalog_url, client_id, client_secret) -> str: + response = api_client.call_api( + 'POST', + f'{catalog_url}/oauth/tokens', + header_params={'Content-Type': 'application/x-www-form-urlencoded'}, + post_params={ + 'grant_type': 'client_credentials', + 'client_id': client_id, + 'client_secret': client_secret, + 'scope': 'PRINCIPAL_ROLE:ALL' + } + ).response.data + if 'access_token' not in json.loads(response): + raise Exception('Failed to get access token') + return json.loads(response)['access_token'] @staticmethod def _get_client_builder(options): @@ -52,21 +92,44 @@ def _get_client_builder(options): has_access_token = options.access_token is not None has_client_secret = options.client_id is not None and options.client_secret is not None if has_access_token and has_client_secret: - raise Exception("Please provide credentials via either --client-id / --client-secret or " - "--access-token, but not both") + raise Exception(f'Please provide credentials via either {Argument.to_flag_name(Arguments.CLIENT_ID)} &' + f' {Argument.to_flag_name(Arguments.CLIENT_SECRET)} or' + f' {Argument.to_flag_name(Arguments.ACCESS_TOKEN)}, but not both') # Authenticate accordingly - polaris_catalog_url = f'http://{options.host}:{options.port}/api/management/v1' + polaris_management_url = f'http://{options.host}:{options.port}/api/management/v1' + polaris_catalog_url = f'http://{options.host}:{options.port}/api/catalog/v1' + builder = None if has_access_token: - return lambda: ApiClient( - Configuration(host=polaris_catalog_url, access_token=options.access_token), + builder = lambda: ApiClient( + Configuration(host=polaris_management_url, access_token=options.access_token), ) elif has_client_secret: - return lambda: ApiClient( - Configuration(host=polaris_catalog_url, username=options.client_id, password=options.client_secret), + builder = lambda: ApiClient( + Configuration(host=polaris_management_url, 
username=options.client_id, password=options.client_secret), + ) + elif os.getenv('CLIENT_ID') and os.getenv('CLIENT_SECRET'): + builder = lambda: ApiClient( + Configuration( + host=polaris_management_url, + username=os.getenv(CLIENT_ID_ENV), + password=os.getenv(CLIENT_SECRET_ENV) + ) ) else: - raise Exception("Please provide credentials via --client-id & --client-secret or via --access-token") + raise Exception(f'Please provide credentials via either {Argument.to_flag_name(Arguments.CLIENT_ID)} &' + f' {Argument.to_flag_name(Arguments.CLIENT_SECRET)} or' + f' {Argument.to_flag_name(Arguments.ACCESS_TOKEN)}.' + f' Alternatively, you may set the environment variables {CLIENT_ID_ENV} &' + f' {CLIENT_SECRET_ENV}.') + + if not has_access_token and not PolarisCli.DIRECT_AUTHENTICATION_ENABLED: + token = PolarisCli._get_token(builder(), polaris_catalog_url, options.client_id, options.client_secret) + builder = lambda: ApiClient( + Configuration(host=polaris_management_url, access_token=token), + ) + return builder + if __name__ == '__main__': diff --git a/regtests/client/python/test/test_cli_parsing.py b/regtests/client/python/test/test_cli_parsing.py index 073c8ecd5d..e67b613a0a 100644 --- a/regtests/client/python/test/test_cli_parsing.py +++ b/regtests/client/python/test/test_cli_parsing.py @@ -102,18 +102,18 @@ def test_parsing_valid_commands(self): Parser.parse(['catalogs', 'get', 'catalog_name']) Parser.parse(['principals', 'list']) Parser.parse(['--host', 'some-host', 'catalogs', 'list']) - Parser.parse(['privileges', '--catalog', 'foo', '--catalog-role', 'bar', 'catalog', 'grant', 'TABLE_READ_DATA']) - Parser.parse(['privileges', '--catalog', 'foo', '--catalog-role', 'bar', 'table', 'grant', + Parser.parse(['privileges', 'catalog', 'grant', '--catalog', 'foo', '--catalog-role', 'bar', 'TABLE_READ_DATA']) + Parser.parse(['privileges', 'table', 'grant', '--catalog', 'foo', '--catalog-role', 'bar', '--namespace', 'n', '--table', 't', 'TABLE_READ_DATA']) - 
Parser.parse(['privileges', '--catalog', 'foo', '--catalog-role', 'bar', 'table', 'revoke', + Parser.parse(['privileges', 'table', 'revoke', '--catalog', 'foo', '--catalog-role', 'bar', '--namespace', 'n', '--table', 't', 'TABLE_READ_DATA']) # These commands are valid for parsing, but may cause errors within the command itself - def test_parse_valid_commands(self): + def test_parse_argparse_valid_commands(self): Parser.parse(['catalogs', 'create', 'catalog_name', '--type', 'internal', '--remote-url', 'www.apache.org']) Parser.parse(['privileges', 'table', 'grant', '--namespace', 'n', '--table', 't', 'TABLE_READ_DATA']) - Parser.parse(['privileges', '--catalog', 'c', '--catalog-role', 'r', 'catalog', 'grant', 'fake-privilege']) + Parser.parse(['privileges', 'catalog', 'grant', '--catalog', 'c', '--catalog-role', 'r', 'fake-privilege']) def test_commands(self): @@ -179,15 +179,12 @@ def get(obj, arg_string): '--storage-type') check_exception(lambda: mock_execute(['catalogs', 'create', 'my-catalog', '--storage-type', 'gcs']), '--default-base-location') - check_exception(lambda: mock_execute(['catalogs', 'create', 'my-catalog', '--type', 'external', - '--default-base-location', 'x', '--storage-type', 'gcs']), - '--remote-url') check_exception(lambda: mock_execute(['catalog-roles', 'get', 'foo']), '--catalog') check_exception(lambda: mock_execute(['catalogs', 'update', 'foo', '--property', 'bad-format']), 'bad-format') - check_exception(lambda: mock_execute(['privileges', '--catalog', 'foo', '--catalog-role', 'bar', - 'catalog', 'grant', 'TABLE_READ_MORE_BOOKS']), + check_exception(lambda: mock_execute(['privileges', 'catalog', 'grant', + '--catalog', 'foo', '--catalog-role', 'bar', 'TABLE_READ_MORE_BOOKS']), 'catalog privilege: TABLE_READ_MORE_BOOKS') check_exception(lambda: mock_execute(['catalogs', 'create', 'my-catalog', '--storage-type', 'gcs', '--allowed-location', 'a', '--allowed-location', 'b', @@ -214,7 +211,7 @@ def get(obj, arg_string): mock_execute([ 
'catalogs', 'create', 'my-catalog', '--storage-type', 's3', '--allowed-location', 'a', '--allowed-location', 'b', '--role-arn', 'ra', - '--user-arn', 'ua', '--external-id', 'ei', '--default-base-location', 'x']), + '--external-id', 'ei', '--default-base-location', 'x']), 'create_catalog', { (0, 'catalog.name'): 'my-catalog', (0, 'catalog.storage_config_info.storage_type'): 'S3', @@ -234,10 +231,10 @@ def get(obj, arg_string): (0, None): 'foo', }) check_arguments( - mock_execute(['principals', 'create', 'foo', '--client-id', 'id', '--property', 'key=value']), + mock_execute(['principals', 'create', 'foo', '--property', 'key=value']), 'create_principal', { (0, 'principal.name'): 'foo', - (0, 'principal.client_id'): 'id', + (0, 'principal.client_id'): None, (0, 'principal.properties'): {'key': 'value'}, }) check_arguments( @@ -369,7 +366,7 @@ def get(obj, arg_string): }) check_arguments( mock_execute( - ['privileges', '--catalog', 'foo', '--catalog-role', 'bar', 'catalog', 'grant', 'TABLE_READ_DATA']), + ['privileges', 'catalog', 'grant', '--catalog', 'foo', '--catalog-role', 'bar', 'TABLE_READ_DATA']), 'add_grant_to_catalog_role', { (0, None): 'foo', (1, None): 'bar', @@ -377,7 +374,7 @@ def get(obj, arg_string): }) check_arguments( mock_execute( - ['privileges', '--catalog', 'foo', '--catalog-role', 'bar', 'catalog', 'revoke', 'TABLE_READ_DATA']), + ['privileges', 'catalog', 'revoke', '--catalog', 'foo', '--catalog-role', 'bar', 'TABLE_READ_DATA']), 'revoke_grant_from_catalog_role', { (0, None): 'foo', (1, None): 'bar', @@ -386,8 +383,8 @@ def get(obj, arg_string): }) check_arguments( mock_execute( - ['privileges', '--catalog', 'foo', '--catalog-role', 'bar', 'namespace', 'grant', '--namespace', 'a.b.c', - 'TABLE_READ_DATA']), + ['privileges', 'namespace', 'grant', '--namespace', 'a.b.c', '--catalog', 'foo', + '--catalog-role', 'bar', 'TABLE_READ_DATA']), 'add_grant_to_catalog_role', { (0, None): 'foo', (1, None): 'bar', @@ -396,8 +393,8 @@ def get(obj, arg_string): 
}) check_arguments( mock_execute( - ['privileges', '--catalog', 'foo', '--catalog-role', 'bar', 'table', 'grant', '--namespace', 'a.b.c', - '--table', 't', 'TABLE_READ_DATA']), + ['privileges', 'table', 'grant', '--namespace', 'a.b.c', + '--table', 't', '--catalog', 'foo', '--catalog-role', 'bar', 'TABLE_READ_DATA']), 'add_grant_to_catalog_role', { (0, None): 'foo', (1, None): 'bar', @@ -407,7 +404,7 @@ def get(obj, arg_string): }) check_arguments( mock_execute( - ['privileges', '--catalog', 'foo', '--catalog-role', 'bar', 'table', 'revoke', '--namespace', 'a.b.c', + ['privileges', 'table', 'revoke', '--namespace', 'a.b.c', '--catalog', 'foo', '--catalog-role', 'bar', '--table', 't', '--cascade', 'TABLE_READ_DATA']), 'revoke_grant_from_catalog_role', { (0, None): 'foo', @@ -419,7 +416,7 @@ def get(obj, arg_string): }) check_arguments( mock_execute( - ['privileges', '--catalog', 'foo', '--catalog-role', 'bar', 'view', 'grant', '--namespace', 'a.b.c', + ['privileges', 'view', 'grant', '--namespace', 'a.b.c', '--catalog', 'foo', '--catalog-role', 'bar', '--view', 'v', 'VIEW_CREATE']), 'add_grant_to_catalog_role', { (0, None): 'foo', diff --git a/regtests/polaris b/regtests/polaris new file mode 100755 index 0000000000..68e312c690 --- /dev/null +++ b/regtests/polaris @@ -0,0 +1,38 @@ +#!/bin/bash + +################################################ +# This is a modified copy of the script in the # +# parent directory, used for regression tests. # +################################################ + +SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) + +if [ ! -d ${SCRIPT_DIR}/polaris-venv ]; then + echo "Performing first-time setup for the Python client..." + python3 -m venv ${SCRIPT_DIR}/polaris-venv + . 
${SCRIPT_DIR}/polaris-venv/bin/activate + pip install poetry==1.5.0 + + cp ${SCRIPT_DIR}/regtests/client/python/pyproject.toml ${SCRIPT_DIR} + + # Save the current directory + CURRENT_DIR=$(pwd) + cd $SCRIPT_DIR && poetry install + cd $CURRENT_DIR + + deactivate + echo "First time setup complete." +fi + +# Save the current directory +CURRENT_DIR=$(pwd) +cd $SCRIPT_DIR > /dev/null +PYTHONPATH=regtests/client/python ${SCRIPT_DIR}/polaris-venv/bin/python3 regtests/client/python/cli/polaris_cli.py "$@" +status=$? +cd $CURRENT_DIR > /dev/null + +if [ $status -ne 0 ]; then + exit 1 +fi + +exit 0 diff --git a/regtests/t_cli/src/test_cli.py b/regtests/t_cli/src/test_cli.py new file mode 100644 index 0000000000..a13df1bf4f --- /dev/null +++ b/regtests/t_cli/src/test_cli.py @@ -0,0 +1,366 @@ +import contextlib +import io +import json +import os +import random +import requests +import string +import subprocess +import sys +from typing import Callable + +CLI_PYTHONPATH = f'{os.path.dirname(os.path.abspath(__file__))}/../../client/python' +ROLE_ARN = 'arn:aws:iam::123456789012:role/my-role' + + +def get_salt(length=8) -> str: + characters = string.ascii_letters + string.digits + return ''.join(random.choice(characters) for _ in range(length)) + + +def root_cli(*args): + return cli('principal:root;realm:default-realm')(*args) + + +def cli(access_token): + def cli_inner(*args) -> Callable[[], str]: + def f() -> str: + result = subprocess.run([ + 'bash', + f'{CLI_PYTHONPATH}/../../../polaris', + '--access-token', + access_token, + '--host', + 'polaris', + *args], + capture_output=True, + text=True + ) + print(result) + if result.returncode != 0: + raise Exception(result.stderr) + return result.stdout + return f + return cli_inner + + +def check_output(f, checker: Callable[[str], bool]): + assert checker(f()) + + +def check_exception(f, exception_str): + throws = True + try: + f() + throws = False + except Exception as e: + assert exception_str in str(e) + assert throws + + +def 
get_token(client_id, client_secret): + url = 'http://polaris:8181/api/catalog/v1/oauth/tokens' + data = { + 'grant_type': 'client_credentials', + 'client_id': client_id, + 'client_secret': client_secret, + 'scope': 'PRINCIPAL_ROLE:ALL' + } + + response = requests.post(url, data=data) + + if response.status_code != 200 or 'access_token' not in response.json(): + raise Exception("Failed to retrieve token") + + return response.json()['access_token'] + + +def test_quickstart_flow(): + """ + Basic CLI test - create a catalog, create a principal, and grant the principal access to the catalog. + """ + + SALT = get_salt() + sys.path.insert(0, CLI_PYTHONPATH) + try: + + # Create a catalog: + check_output(root_cli( + 'catalogs', + 'create', + '--storage-type', + 's3', + '--role-arn', + ROLE_ARN, + '--default-base-location', + f's3://fake-location-{SALT}', + f'test_cli_catalog_{SALT}'), checker=lambda s: s == '') + check_output(root_cli('catalogs', 'list'), + checker=lambda s: f'test_cli_catalog_{SALT}' in s) + check_output(root_cli('catalogs', 'get', f'test_cli_catalog_{SALT}'), + checker=lambda s: 's3://fake-location' in s) + + # Create a new user: + credentials = root_cli('principals', 'create', f'test_cli_user_{SALT}')() + check_output(root_cli('principals', 'list'), checker=lambda s: f'test_cli_user_{SALT}' in s) + credentials = json.loads(credentials) + assert 'clientId' in credentials + assert 'clientSecret' in credentials + user_token = get_token(credentials['clientId'], credentials['clientSecret']) + + # User initially has no catalog access: + check_exception(cli(user_token)('catalogs', 'get', f'test_cli_catalog_{SALT}'), + exception_str='not authorized') + + # Grant user access: + check_output( + root_cli('principal-roles', 'create', f'test_cli_p_role_{SALT}'), + checker=lambda s: s == '') + check_output(root_cli( + 'catalog-roles', 'create', '--catalog', f'test_cli_catalog_{SALT}', f'test_cli_c_role_{SALT}'), + checker=lambda s: s == '') + check_output(root_cli( + 
'principal-roles', 'grant', '--principal', f'test_cli_user_{SALT}', f'test_cli_p_role_{SALT}'), + checker=lambda s: s == '') + check_output(root_cli( + 'catalog-roles', + 'grant', + '--catalog', + f'test_cli_catalog_{SALT}', + '--principal-role', + f'test_cli_p_role_{SALT}', + f'test_cli_c_role_{SALT}' + ), checker=lambda s: s == '') + check_output(root_cli( + 'privileges', + 'catalog', + 'grant', + '--catalog', + f'test_cli_catalog_{SALT}', + '--catalog-role', + f'test_cli_c_role_{SALT}', + f'CATALOG_MANAGE_CONTENT' + ), checker=lambda s: s == '') + + # User now has catalog access: + check_output(cli(user_token)('catalogs', 'get', f'test_cli_catalog_{SALT}'), + checker=lambda s: 's3://fake-location' in s) + check_output(cli(user_token)( + 'namespaces', + 'create', + '--catalog', + f'test_cli_catalog_{SALT}', + f'test_cli_namespace_{SALT}' + ), checker=lambda s: s == '') + check_output(cli(user_token)('namespaces', 'list', '--catalog', f'test_cli_catalog_{SALT}'), + checker=lambda s: f'test_cli_namespace_{SALT}' in s) + check_output(cli(user_token)( + 'namespaces', + 'delete', + '--catalog', + f'test_cli_catalog_{SALT}', + f'test_cli_namespace_{SALT}' + ), checker=lambda s: s == '') + check_output(cli(user_token)('namespaces', 'list', '--catalog', f'test_cli_catalog_{SALT}'), + checker=lambda s: f'test_cli_namespace_{SALT}' not in s) + + finally: + sys.path.pop(0) + pass + + +def test_nested_namespace(): + """ + Test creating and managing deeply nested namespaces. 
+ """ + + SALT = get_salt() + sys.path.insert(0, CLI_PYTHONPATH) + try: + + # Create a catalog: + check_output(root_cli( + 'catalogs', + 'create', + '--storage-type', + 's3', + '--role-arn', + ROLE_ARN, + '--default-base-location', + f's3://fake-location-{SALT}', + f'test_cli_catalog_{SALT}'), checker=lambda s: s == '') + check_output(root_cli('catalogs', 'list'), + checker=lambda s: f'test_cli_catalog_{SALT}' in s) + check_output(root_cli('catalogs', 'get', f'test_cli_catalog_{SALT}'), + checker=lambda s: 's3://fake-location' in s) + + # Create some namespaces: + check_output(root_cli( + 'namespaces', + 'create', + '--catalog', + f'test_cli_catalog_{SALT}', + f'a_{SALT}' + ), checker=lambda s: s == '') + check_output(root_cli( + 'namespaces', + 'create', + '--catalog', + f'test_cli_catalog_{SALT}', + f'a_{SALT}.b_{SALT}' + ), checker=lambda s: s == '') + check_output(root_cli( + 'namespaces', + 'create', + '--catalog', + f'test_cli_catalog_{SALT}', + f'a_{SALT}.b_{SALT}.c_{SALT}' + ), checker=lambda s: s == '') + + # List namespaces: + check_output(root_cli('namespaces', 'list', '--catalog', f'test_cli_catalog_{SALT}'), + checker=lambda s: f'a_{SALT}' in s and f'b_{SALT}' not in s) + check_output(root_cli( + 'namespaces', + 'list', + '--catalog', + f'test_cli_catalog_{SALT}', + '--parent', + f'a_{SALT}' + ), checker=lambda s: f'a_{SALT}.b_{SALT}' in s) + + # a.b.c exists, and a.b can't be deleted while non-empty + check_output(root_cli( + 'namespaces', + 'get', + '--catalog', + f'test_cli_catalog_{SALT}', + f'a_{SALT}.b_{SALT}.c_{SALT}' + ), checker=lambda s: f'a_{SALT}.b_{SALT}.c_{SALT}' in s) + check_exception(root_cli( + 'namespaces', + 'delete', + '--catalog', + f'test_cli_catalog_{SALT}', + f'a_{SALT}.b_{SALT}' + ), exception_str='not empty') + finally: + sys.path.pop(0) + pass + + +def test_list_privileges(): + """ + Test creating and managing deeply nested namespaces. 
+ """ + + SALT = get_salt() + sys.path.insert(0, CLI_PYTHONPATH) + try: + + # Create a catalog, namespace, and principal: + check_output(root_cli( + 'catalogs', + 'create', + '--storage-type', + 's3', + '--role-arn', + ROLE_ARN, + '--default-base-location', + f's3://fake-location-{SALT}', + f'test_cli_catalog_{SALT}'), checker=lambda s: s == '') + check_output(root_cli( + 'namespaces', + 'create', + '--catalog', + f'test_cli_catalog_{SALT}', + f'a_{SALT}' + ), checker=lambda s: s == '') + check_output(root_cli( + 'principals', + 'create', + f'test_cli_user_{SALT}' + ), checker=lambda s: s != '') + + # Grant the principal some privileges: + check_output( + root_cli('principal-roles', 'create', f'test_cli_p_role_{SALT}'), + checker=lambda s: s == '') + check_output(root_cli( + 'catalog-roles', 'create', '--catalog', f'test_cli_catalog_{SALT}', f'test_cli_c_role_{SALT}'), + checker=lambda s: s == '') + check_output(root_cli( + 'principal-roles', 'grant', '--principal', f'test_cli_user_{SALT}', f'test_cli_p_role_{SALT}'), + checker=lambda s: s == '') + check_output(root_cli( + 'catalog-roles', + 'grant', + '--catalog', + f'test_cli_catalog_{SALT}', + '--principal-role', + f'test_cli_p_role_{SALT}', + f'test_cli_c_role_{SALT}' + ), checker=lambda s: s == '') + check_output(root_cli( + 'privileges', + 'catalog', + 'grant', + '--catalog', + f'test_cli_catalog_{SALT}', + '--catalog-role', + f'test_cli_c_role_{SALT}', + f'TABLE_READ_DATA' + ), checker=lambda s: s == '') + check_output(root_cli( + 'privileges', + 'namespace', + 'grant', + '--catalog', + f'test_cli_catalog_{SALT}', + '--catalog-role', + f'test_cli_c_role_{SALT}', + '--namespace', + f'a_{SALT}', + f'TABLE_WRITE_DATA' + ), checker=lambda s: s == '') + check_output(root_cli( + 'privileges', + 'namespace', + 'grant', + '--catalog', + f'test_cli_catalog_{SALT}', + '--catalog-role', + f'test_cli_c_role_{SALT}', + '--namespace', + f'a_{SALT}', + f'TABLE_LIST' + ), checker=lambda s: s == '') + + # List privileges: + 
check_output(root_cli( + 'privileges', + 'list', + '--catalog', + f'test_cli_catalog_{SALT}', + '--catalog-role', + f'test_cli_c_role_{SALT}' + ), checker=lambda s: len(s.strip().split('\n')) == 3) + + finally: + sys.path.pop(0) + pass + + +def test_invalid_commands(): + sys.path.insert(0, CLI_PYTHONPATH) + try: + check_exception(root_cli('catalogs', 'create', 'test_catalog'), exception_str='--storage-type') + check_exception(root_cli( + 'catalogs', + 'create', + 'test_catalog', + '--storage-type', + 'not-real!' + ), exception_str='--storage-type') + finally: + sys.path.pop(0)
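Note on the credential handling in `_get_client_builder` above: the lookup order in the diff is explicit flags first, then the `CLIENT_ID` / `CLIENT_SECRET` environment variables, with an error when both a token and a client ID/secret pair are supplied. The following is a minimal standalone sketch of that precedence, not the code from this change. The function name `resolve_credentials`, its return shape, and the literal values of `CLIENT_ID_ENV` / `CLIENT_SECRET_ENV` are assumptions made for illustration (the diff imports those constants from `cli.constants` but does not show their values; the docs name the variables `CLIENT_ID` and `CLIENT_SECRET`).

```python
import os
from typing import Mapping, Optional, Tuple

# Assumed values; the real constants live in cli.constants and are not shown in the diff.
CLIENT_ID_ENV = "CLIENT_ID"
CLIENT_SECRET_ENV = "CLIENT_SECRET"


def resolve_credentials(
    access_token: Optional[str],
    client_id: Optional[str],
    client_secret: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Tuple[str, ...]]:
    """Hypothetical helper: return (kind, values) for the credentials to use.

    kind is "access_token" or "client_credentials".
    """
    env = os.environ if env is None else env
    has_token = access_token is not None
    has_secret = client_id is not None and client_secret is not None

    # Supplying both kinds of credentials is ambiguous, so reject it.
    if has_token and has_secret:
        raise ValueError("provide --client-id & --client-secret or --access-token, not both")

    # Explicit flags take precedence over the environment.
    if has_token:
        return "access_token", (access_token,)
    if has_secret:
        return "client_credentials", (client_id, client_secret)

    # Fall back to the environment variables when no flags were given.
    env_id, env_secret = env.get(CLIENT_ID_ENV), env.get(CLIENT_SECRET_ENV)
    if env_id and env_secret:
        return "client_credentials", (env_id, env_secret)

    raise ValueError(
        f"no credentials: pass flags or set {CLIENT_ID_ENV} & {CLIENT_SECRET_ENV}"
    )
```

Returning the resolved pair, rather than only a client builder, means a later token request can use the same ID and secret whether they came from flags or from the environment.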