tidb-cloud: add Azure Blob Storage access configuration for Dedicated #19090

Open · wants to merge 3 commits into master
2 changes: 1 addition & 1 deletion TOC-tidb-cloud.md
@@ -232,7 +232,7 @@
- [Import Apache Parquet Files from Amazon S3 or GCS](/tidb-cloud/import-parquet-files.md)
- [Import with MySQL CLI](/tidb-cloud/import-with-mysql-cli.md)
- Reference
- - [Configure Amazon S3 Access and GCS Access](/tidb-cloud/config-s3-and-gcs-access.md)
+ - [Configure Amazon S3 Access and GCS Access](/tidb-cloud/dedicated-external-storage.md)
- [Naming Conventions for Data Import](/tidb-cloud/naming-conventions-for-data-import.md)
- [CSV Configurations for Importing Data](/tidb-cloud/csv-config-for-import-data.md)
- [Troubleshoot Access Denied Errors during Data Import from Amazon S3](/tidb-cloud/troubleshoot-import-access-denied-error.md)
Comment: It is not recommended to change the name of the document, as it is referenced in multiple places on the page.

Collaborator reply: In this PR, we've added an alias for the old doc link (see the frontmatter change below) so that anyone clicking the old doc link is redirected to the new link. However, it's still recommended to update the link in the Cloud console for consistency after the changes in this PR take effect on the docs website.

tidb-cloud/dedicated-external-storage.md (renamed from tidb-cloud/config-s3-and-gcs-access.md)
@@ -1,6 +1,7 @@
---
title: Configure External Storage Access for TiDB Cloud Dedicated
summary: Learn how to configure Amazon Simple Storage Service (Amazon S3) access and Google Cloud Storage (GCS) access.
+ aliases: ['/tidb-cloud/config-s3-and-gcs-access']
---

# Configure External Storage Access for TiDB Cloud Dedicated
@@ -221,3 +222,33 @@ To allow TiDB Cloud to access the source data in your GCS bucket, you need to co…
![Get bucket URI](/media/tidb-cloud/gcp-bucket-uri02.png)

7. In the TiDB Cloud console, go to the **Data Import** page where you get the Google Cloud Service Account ID, and then paste the GCS bucket gsutil URI to the **Bucket gsutil URI** field. For example, paste `gs://tidb-cloud-source-data/`. (A quick reachability check follows this procedure.)
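Before pasting the URI, it can help to confirm that the bucket is actually reachable. Below is a minimal sketch, assuming the `google-cloud-storage` Python package and Application Default Credentials for a principal that can read the bucket; the bucket name reuses the example above, and the docs themselves only cover the console flow:

```python
from google.cloud import storage

GSUTIL_URI = "gs://tidb-cloud-source-data/"  # the value pasted into "Bucket gsutil URI"
bucket_name = GSUTIL_URI.removeprefix("gs://").strip("/").split("/")[0]

client = storage.Client()  # uses Application Default Credentials
for blob in client.list_blobs(bucket_name, max_results=5):
    print(blob.name)  # a few object names confirm the bucket exists and is listable
```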

## Configure Azure Blob Storage access

To allow TiDB Cloud Dedicated to access your Azure Blob container, you need to configure access to the container. You can use a service SAS token to grant that access:

1. On the [Azure Storage account](https://portal.azure.com/#browse/Microsoft.Storage%2FStorageAccounts) page, click the storage account to which the container belongs.

2. On your **Storage account** page, click **Security + networking** in the left navigation pane, and then click **Shared access signature**.

![sas-position](/media/tidb-cloud/dedicated-external-storage/azure-sas-position.png)

3. On the **Shared access signature** page, create a service SAS token with the necessary permissions as follows (a programmatic sketch follows this procedure). For more information, see [Create a service SAS token](https://docs.microsoft.com/en-us/azure/storage/common/storage-sas-overview).

1. In the **Allowed services** section, choose the **Blob** service.
2. In the **Allowed resource types** section, choose **Container** and **Object**.
3. In the **Allowed permissions** section, choose the permissions as needed. For example, importing data to a TiDB Cloud Dedicated cluster needs the **Read** and **List** permissions.
4. Adjust **Start and expiry date/time** as needed. For security reasons, it's recommended to set an expiration date that aligns with your data import timeline.
Collaborator suggested change:

- 4. Adjust **Start and expiry date/time** as needed. For security reasons, it's recommended to set an expiration date that aligns with your data import timeline.
+ 4. Adjust **Start and expiry date/time** as needed. For security reasons, it is recommended to set an expiration date that aligns with your data import timeline.

5. You can keep the default values for other settings.

![sas-create](/media/tidb-cloud/dedicated-external-storage/azure-sas-create.png)

4. Click **Generate SAS and connection string** to generate the SAS token.

5. Copy the generated **SAS Token**. You will need this token string when configuring the data import in TiDB Cloud.
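For reference, the sketch mentioned in step 3: the same Read and List permissions and expiry can also be produced programmatically as a container-scoped service SAS. This is a minimal sketch assuming the `azure-storage-blob` Python package; the storage account name, account key, and container name are hypothetical placeholders, and the portal flow above remains the documented path.

```python
from datetime import datetime, timedelta, timezone

from azure.storage.blob import ContainerSasPermissions, generate_container_sas

sas_token = generate_container_sas(
    account_name="mystorageaccount",        # hypothetical storage account
    container_name="tidb-import-data",      # hypothetical container
    account_key="<storage-account-key>",    # placeholder; keep this secret
    permission=ContainerSasPermissions(read=True, list=True),  # matches step 3.3
    expiry=datetime.now(timezone.utc) + timedelta(days=7),     # align with the import timeline
)
print(sas_token)  # the token string used when configuring the import
```

A container-scoped SAS is narrower than the account-level SAS generated on the portal page, which is usually preferable for a one-off import.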

> **Note:**
>
> TiDB Cloud does not store your SAS token. It is recommended that you revoke or delete the SAS token after the import is complete to ensure the security of your Azure Blob Storage.

Remember to test the connection and permissions before starting your data import to ensure TiDB Cloud Dedicated can access the specified Azure Blob container and files (a verification sketch follows the review thread below).
Comment on lines +252 to +254 (Collaborator suggested change):

- > TiDB Cloud does not store your SAS token. It is recommended that you revoke or delete the SAS token after the import is complete to ensure the security of your Azure Blob Storage.
- Remember to test the connection and permissions before starting your data import to ensure TiDB Cloud Dedicated can access the specified Azure Blob container and files.
+ > - Before starting your data import, it is recommended that you test the connection and permissions to ensure that TiDB Cloud Dedicated can access the specified Azure Blob container and files.
+ > - TiDB Cloud does not store your SAS token. After the import is complete, it is recommended that you revoke or delete the SAS token to ensure the security of your Azure Blob Storage.
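For the pre-import connection test recommended above (in either wording), here is a minimal sketch assuming the `azure-storage-blob` Python package; the account URL, container name, and token value are placeholders. It exercises exactly the List and Read permissions the import needs:

```python
from azure.storage.blob import ContainerClient

container = ContainerClient(
    account_url="https://mystorageaccount.blob.core.windows.net",  # hypothetical account
    container_name="tidb-import-data",                             # hypothetical container
    credential="<sas-token>",                                      # the generated SAS token
)

names = [b.name for b in container.list_blobs()]  # exercises the List permission
print(f"{len(names)} blob(s) visible")
if names:
    head = container.download_blob(names[0], offset=0, length=16)  # exercises Read
    print("first bytes:", head.readall())
```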

8 changes: 4 additions & 4 deletions tidb-cloud/import-csv-files.md
@@ -80,11 +80,11 @@ Because CSV files do not contain schema information, before importing data from…

To allow TiDB Cloud to access the CSV files in the Amazon S3 or GCS bucket, do one of the following:

- - If your CSV files are located in Amazon S3, [configure Amazon S3 access](/tidb-cloud/config-s3-and-gcs-access.md#configure-amazon-s3-access).
+ - If your CSV files are located in Amazon S3, [configure Amazon S3 access](/tidb-cloud/dedicated-external-storage.md#configure-amazon-s3-access).

You can use either an AWS access key or a Role ARN to access your bucket. Once finished, make a note of the access key (including the access key ID and secret access key) or the Role ARN value as you will need it in [Step 4](#step-4-import-csv-files-to-tidb-cloud). (A verification sketch follows this list.)

- - If your CSV files are located in GCS, [configure GCS access](/tidb-cloud/config-s3-and-gcs-access.md#configure-gcs-access).
+ - If your CSV files are located in GCS, [configure GCS access](/tidb-cloud/dedicated-external-storage.md#configure-gcs-access).
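For the Role ARN option, the verification sketch mentioned above: a minimal Python example assuming `boto3`, with the Role ARN, external ID, bucket, and prefix as hypothetical placeholders, that confirms the role can list the source folder:

```python
import boto3

ROLE_ARN = "arn:aws:iam::123456789012:role/tidb-cloud-import"  # hypothetical Role ARN
EXTERNAL_ID = "<external-id-shown-in-the-tidb-cloud-console>"  # placeholder

# Assume the role the same way TiDB Cloud would, including the external ID.
creds = boto3.client("sts").assume_role(
    RoleArn=ROLE_ARN,
    RoleSessionName="tidb-import-preflight",
    ExternalId=EXTERNAL_ID,
)["Credentials"]

s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
# List the source folder the import will read from.
resp = s3.list_objects_v2(Bucket="sampledata", Prefix="ingest/", MaxKeys=5)
for obj in resp.get("Contents", []):
    print(obj["Key"])
```

Note that this local check only works if the role's trust policy also trusts your own principal; in production, TiDB Cloud's AWS account is the trusted principal and performs the equivalent calls.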

## Step 4. Import CSV files to TiDB Cloud

@@ -115,7 +115,7 @@ To import the CSV files to TiDB Cloud, take the following steps:
- **File URI** or **Folder URI**:
- When importing one file, enter the source file URI and name in the following format `s3://[bucket_name]/[data_source_folder]/[file_name].csv`. For example, `s3://sampledata/ingest/TableName.01.csv`.
- When importing multiple files, enter the source file URI and name in the following format `s3://[bucket_name]/[data_source_folder]/`. For example, `s3://sampledata/ingest/`.
- - **Bucket Access**: you can use either an AWS Role ARN or an AWS access key to access your bucket. For more information, see [Configure Amazon S3 access](/tidb-cloud/config-s3-and-gcs-access.md#configure-amazon-s3-access).
+ - **Bucket Access**: you can use either an AWS Role ARN or an AWS access key to access your bucket. For more information, see [Configure Amazon S3 access](/tidb-cloud/dedicated-external-storage.md#configure-amazon-s3-access).
- **AWS Role ARN**: enter the AWS Role ARN value.
- **AWS Access Key**: enter the AWS access key ID and AWS secret access key.

@@ -169,7 +169,7 @@ To import the CSV files to TiDB Cloud, take the following steps:
- **File URI** or **Folder URI**:
- When importing one file, enter the source file URI and name in the following format `gs://[bucket_name]/[data_source_folder]/[file_name].csv`. For example, `gs://sampledata/ingest/TableName.01.csv`.
- When importing multiple files, enter the source file URI and name in the following format `gs://[bucket_name]/[data_source_folder]/`. For example, `gs://sampledata/ingest/`.
- - **Bucket Access**: you can use a GCS IAM Role to access your bucket. For more information, see [Configure GCS access](/tidb-cloud/config-s3-and-gcs-access.md#configure-gcs-access).
+ - **Bucket Access**: you can use a GCS IAM Role to access your bucket. For more information, see [Configure GCS access](/tidb-cloud/dedicated-external-storage.md#configure-gcs-access).

4. Click **Connect**.

8 changes: 4 additions & 4 deletions tidb-cloud/import-parquet-files.md
@@ -86,11 +86,11 @@ Because Parquet files do not contain schema information, before importing data f…

To allow TiDB Cloud to access the Parquet files in the Amazon S3 or GCS bucket, do one of the following:

- - If your Parquet files are located in Amazon S3, [configure Amazon S3 access](/tidb-cloud/config-s3-and-gcs-access.md#configure-amazon-s3-access).
+ - If your Parquet files are located in Amazon S3, [configure Amazon S3 access](/tidb-cloud/dedicated-external-storage.md#configure-amazon-s3-access).

You can use either an AWS access key or a Role ARN to access your bucket. Once finished, make a note of the access key (including the access key ID and secret access key) or the Role ARN value as you will need it in [Step 4](#step-4-import-parquet-files-to-tidb-cloud).

- - If your Parquet files are located in GCS, [configure GCS access](/tidb-cloud/config-s3-and-gcs-access.md#configure-gcs-access).
+ - If your Parquet files are located in GCS, [configure GCS access](/tidb-cloud/dedicated-external-storage.md#configure-gcs-access).

## Step 4. Import Parquet files to TiDB Cloud

@@ -121,7 +121,7 @@ To import the Parquet files to TiDB Cloud, take the following steps:
- **File URI** or **Folder URI**:
- When importing one file, enter the source file URI and name in the following format `s3://[bucket_name]/[data_source_folder]/[file_name].parquet`. For example, `s3://sampledata/ingest/TableName.01.parquet`.
- When importing multiple files, enter the source file URI and name in the following format `s3://[bucket_name]/[data_source_folder]/`. For example, `s3://sampledata/ingest/`.
- - **Bucket Access**: you can use either an AWS Role ARN or an AWS access key to access your bucket. For more information, see [Configure Amazon S3 access](/tidb-cloud/config-s3-and-gcs-access.md#configure-amazon-s3-access).
+ - **Bucket Access**: you can use either an AWS Role ARN or an AWS access key to access your bucket. For more information, see [Configure Amazon S3 access](/tidb-cloud/dedicated-external-storage.md#configure-amazon-s3-access).
- **AWS Role ARN**: enter the AWS Role ARN value.
- **AWS Access Key**: enter the AWS access key ID and AWS secret access key.

@@ -175,7 +175,7 @@ To import the Parquet files to TiDB Cloud, take the following steps:
- **File URI** or **Folder URI**:
- When importing one file, enter the source file URI and name in the following format `gs://[bucket_name]/[data_source_folder]/[file_name].parquet`. For example, `gs://sampledata/ingest/TableName.01.parquet`.
- When importing multiple files, enter the source file URI and name in the following format `gs://[bucket_name]/[data_source_folder]/`. For example, `gs://sampledata/ingest/`.
- - **Bucket Access**: you can use a GCS IAM Role to access your bucket. For more information, see [Configure GCS access](/tidb-cloud/config-s3-and-gcs-access.md#configure-gcs-access).
+ - **Bucket Access**: you can use a GCS IAM Role to access your bucket. For more information, see [Configure GCS access](/tidb-cloud/dedicated-external-storage.md#configure-gcs-access).

4. Click **Connect**.

2 changes: 1 addition & 1 deletion tidb-cloud/import-sample-data.md
@@ -65,7 +65,7 @@ This document describes how to import the sample data into TiDB Cloud via the UI
- To import into pre-created tables, select **No**. This enables you to create tables in TiDB in advance and select the tables that you want to import data into. In this case, you can choose up to 1000 tables to import. You can click **SQL Editor** in the left navigation pane to create tables. For more information about how to use SQL Editor, see [Explore your data with AI-assisted SQL Editor](/tidb-cloud/explore-data-with-chat2query.md).
- **Data Format**: select **SQL**. TiDB Cloud supports importing compressed files in the following formats: `.gzip`, `.gz`, `.zstd`, `.zst`, and `.snappy`. If you want to import compressed SQL files, name the files in the `${db_name}.${table_name}.${suffix}.sql.${compress}` format, in which `${suffix}` is optional and can be any integer such as `000001`. For example, if you want to import the `trips.000001.sql.gz` file to the `bikeshare.trips` table, you can rename the file as `bikeshare.trips.000001.sql.gz`. Note that you only need to compress the data files, not the database or table schema files. The Snappy compressed file must be in the [official Snappy format](https://github.com/google/snappy). Other variants of Snappy compression are not supported. (A parsing sketch of the naming rule follows this list.)
- **Folder URI** or **File URI**: enter the sample data URI `gs://tidbcloud-samples-us-west1/`.
- - **Bucket Access**: you can use a GCS IAM Role to access your bucket. For more information, see [Configure GCS access](/tidb-cloud/config-s3-and-gcs-access.md#configure-gcs-access).
+ - **Bucket Access**: you can use a GCS IAM Role to access your bucket. For more information, see [Configure GCS access](/tidb-cloud/dedicated-external-storage.md#configure-gcs-access).

If the bucket's region differs from your cluster's region, confirm cross-region compliance.
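The compressed-file naming rule above is mechanical enough to check before upload. The following is a minimal Python sketch of the parsing mentioned in the list above; the regular expression is an assumption derived from the `${db_name}.${table_name}.${suffix}.sql.${compress}` description, not an official validator:

```python
import re

# ${db_name}.${table_name}.${suffix}.sql.${compress}; suffix and compression are optional.
PATTERN = re.compile(
    r"^(?P<db>[^.]+)\.(?P<table>[^.]+)(?:\.(?P<suffix>\d+))?"
    r"\.sql(?:\.(?P<compress>gz|gzip|zst|zstd|snappy))?$"
)

match = PATTERN.match("bikeshare.trips.000001.sql.gz")
print(match.groupdict())
# {'db': 'bikeshare', 'table': 'trips', 'suffix': '000001', 'compress': 'gz'}
```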

2 changes: 1 addition & 1 deletion tidb-cloud/migrate-sql-shards.md
@@ -138,7 +138,7 @@ For more information about the solutions to solve such conflicts, see [Remove th…

### Step 4. Configure Amazon S3 access

- Follow the instructions in [Configure Amazon S3 access](/tidb-cloud/config-s3-and-gcs-access.md#configure-amazon-s3-access) to get the role ARN to access the source data.
+ Follow the instructions in [Configure Amazon S3 access](/tidb-cloud/dedicated-external-storage.md#configure-amazon-s3-access) to get the role ARN to access the source data.

The following example only lists key policy configurations. Replace the Amazon S3 path with your own values.

2 changes: 1 addition & 1 deletion tidb-cloud/release-notes-2023.md
@@ -871,7 +871,7 @@ This page lists the release notes of [TiDB Cloud](https://www.pingcap.com/tidb-c…

- Support using the AWS access keys of an IAM user to access your Amazon S3 bucket when importing data to TiDB Cloud.

- This method is simpler than using Role ARN. For more information, refer to [Configure Amazon S3 access](/tidb-cloud/config-s3-and-gcs-access.md#configure-amazon-s3-access).
+ This method is simpler than using Role ARN. For more information, refer to [Configure Amazon S3 access](/tidb-cloud/dedicated-external-storage.md#configure-amazon-s3-access).

- Extend the [monitoring metrics retention period](/tidb-cloud/built-in-monitoring.md#metrics-retention-policy) from 2 days to a longer period:

2 changes: 1 addition & 1 deletion tidb-cloud/serverless-external-storage.md
@@ -7,7 +7,7 @@ summary: Learn how to configure Amazon Simple Storage Service (Amazon S3) access…

If you want to import data from or export data to an external storage in a TiDB Cloud Serverless cluster, you need to configure cross-account access. This document describes how to configure access to an external storage for TiDB Cloud Serverless clusters.

- If you need to configure these external storages for a TiDB Cloud Dedicated cluster, see [Configure External Storage Access for TiDB Cloud Dedicated](/tidb-cloud/config-s3-and-gcs-access.md).
+ If you need to configure these external storages for a TiDB Cloud Dedicated cluster, see [Configure External Storage Access for TiDB Cloud Dedicated](/tidb-cloud/dedicated-external-storage.md).

## Configure Amazon S3 access

2 changes: 1 addition & 1 deletion tidb-cloud/terraform-use-import-resource.md
@@ -189,7 +189,7 @@ You can manage either a local import task or an Amazon S3 import task using the…

> **Note:**
>
- > To allow TiDB Cloud to access your files in the Amazon S3 bucket, you need to [configure Amazon S3 access](/tidb-cloud/config-s3-and-gcs-access.md#configure-amazon-s3-access) first.
+ > To allow TiDB Cloud to access your files in the Amazon S3 bucket, you need to [configure Amazon S3 access](/tidb-cloud/dedicated-external-storage.md#configure-amazon-s3-access) first.

1. Create an `import` directory, and then create a `main.tf` inside it. For example:

2 changes: 1 addition & 1 deletion tidb-cloud/tidb-cloud-migration-overview.md
@@ -55,7 +55,7 @@ If you have data files in SQL, CSV, Parquet, or Aurora Snapshot formats, you can…

### Configure Amazon S3 access and GCS access

- If your source data is stored in Amazon S3 or Google Cloud Storage (GCS) buckets, before importing or migrating the data to TiDB Cloud, you need to configure access to the buckets. For more information, see [Configure Amazon S3 access and GCS access](/tidb-cloud/config-s3-and-gcs-access.md).
+ If your source data is stored in Amazon S3 or Google Cloud Storage (GCS) buckets, before importing or migrating the data to TiDB Cloud, you need to configure access to the buckets. For more information, see [Configure Amazon S3 access and GCS access](/tidb-cloud/dedicated-external-storage.md).

### Naming conventions for data import

2 changes: 1 addition & 1 deletion tidb-cloud/troubleshoot-import-access-denied-error.md
@@ -50,7 +50,7 @@ In the sample trust entity:

### Check whether the IAM role exists

- If the IAM role does not exist, create a role following instructions in [Configure Amazon S3 access](/tidb-cloud/config-s3-and-gcs-access.md#configure-amazon-s3-access).
+ If the IAM role does not exist, create a role following instructions in [Configure Amazon S3 access](/tidb-cloud/dedicated-external-storage.md#configure-amazon-s3-access).
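A minimal Python sketch of this existence check, assuming `boto3` with IAM read permissions (the role name is a hypothetical placeholder); it also prints the trust policy, where the external ID covered in the next section lives:

```python
import json

import boto3

iam = boto3.client("iam")
try:
    role = iam.get_role(RoleName="tidb-cloud-import")["Role"]  # hypothetical role name
except iam.exceptions.NoSuchEntityException:
    print("Role not found - create it per the instructions above.")
else:
    # boto3 returns the trust policy as a dict; an sts:ExternalId condition,
    # if configured, appears under Statement[].Condition.
    print(json.dumps(role["AssumeRolePolicyDocument"], indent=2))
```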

### Check whether the external ID is set correctly
