Locality-aware backup description edit for node locality #18269
8aa7b6e to b75f611
@@ -5,10 +5,10 @@ toc: true
 docs_area: manage
 ---

-CockroachDB backups operate as _jobs_, which are potentially long-running operations that could span multiple SQL sessions. Unlike regular SQL statements, which CockroachDB routes to the [optimizer](cost-based-optimizer.html) for processing, a [`BACKUP`](backup.html) statement will move into a job workflow. A backup job has four main phases:
+CockroachDB backups operate as _jobs_, which are potentially long-running operations that could span multiple SQL sessions. Unlike regular SQL statements, which CockroachDB routes to the [optimizer](cost-based-optimizer.html) for processing, a [`BACKUP`](backup.html) statement will move into a job workflow. A backup job has four main phases:
FYI: No new content here until LINE 94
@@ -91,6 +91,30 @@ The backup metadata files describe everything a backup contains. That is, all th

 With the full backup complete, the specified storage location will contain the backup data and its metadata ready for a potential [restore](restore.html). After subsequent backups of the `movr` database to this storage location, CockroachDB will create a _backup collection_. See [Backup collections](take-full-and-incremental-backups.html#backup-collections) for information on how CockroachDB structures a collection of multiple backups.

+## Backup jobs with locality
v22.2 did not have any technical detail on locality-aware backups; this is taken from v23.1 / v23.2.
Do we need to update 22.2? It's EOL, right?
Not quite... in June it's EOL. I was really playing it safe here updating v22.2 as well.
ah, it's EOL for me but not for thee. :) Thank you!
Thanks, Kathryn. I like the approach you've taken with this PR in terms of clear but subtle warnings that users consider the current behavior carefully if they have data domiciling requirements.
@@ -0,0 +1 @@
+A successful locality-aware backup job requires that each node in the cluster has access to each storage location. This is because any node in the cluster can claim the job and become the [_coordinator_ ](backup-architecture.html#job-creation-phase) node.
-A successful locality-aware backup job requires that each node in the cluster has access to each storage location. This is because any node in the cluster can claim the job and become the [_coordinator_ ](backup-architecture.html#job-creation-phase) node.
+A successful locality-aware backup job requires that each node in the cluster has access to each storage location. This is because any node in the cluster can claim the job and become the [_coordinator_](backup-architecture.html#job-creation-phase) node.
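To make the "each storage location" wording concrete, here is a minimal Python sketch that assembles a locality-aware `BACKUP INTO` statement from a mapping of locality to storage URI. It assumes the documented `COCKROACH_LOCALITY` query parameter with a URL-encoded `key=value` pair and one `default` destination; the bucket names and both helper functions are hypothetical and not part of this PR, and authentication parameters are omitted.

```python
from urllib.parse import quote


def locality_uri(base_uri: str, locality: str) -> str:
    """Append a COCKROACH_LOCALITY parameter to a storage URI.

    The locality is either the literal "default" or a key=value pair,
    which is URL-encoded (for example "region=us-west" -> "region%3Dus-west").
    """
    value = "default" if locality == "default" else quote(locality, safe="")
    separator = "&" if "?" in base_uri else "?"
    return f"{base_uri}{separator}COCKROACH_LOCALITY={value}"


def locality_aware_backup(destinations: dict[str, str]) -> str:
    """Build a BACKUP INTO statement listing one URI per locality."""
    # Assumption: a locality-aware backup needs a default destination for
    # data exported by nodes that match none of the other localities.
    if "default" not in destinations:
        raise ValueError("a 'default' destination is required")
    uris = ", ".join(
        f"'{locality_uri(uri, locality)}'" for locality, uri in destinations.items()
    )
    return f"BACKUP INTO ({uris});"


# Hypothetical buckets; every node must be able to reach both of them.
statement = locality_aware_backup(
    {"default": "s3://bucket-east", "region=us-west": "s3://bucket-west"}
)
print(statement)
```

Because any node can become the coordinator and write the partial manifests, every URI in the generated statement has to be reachable from every node, not only from the nodes in the matching locality.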

-Automated database and table level backups are not supported in CockroachDB {{ site.data.products.serverless }}. However, [user managed database and table level backups]({% link cockroachcloud/take-and-restore-customer-owned-backups.md %}#back-up-data) using user provided storage locations are supported.
+Automated database and table level backups are not supported in CockroachDB {{ site.data.products.serverless }}. However, you can take manual [database and table level backups]({% link cockroachcloud/take-and-restore-customer-owned-backups.md %}#back-up-data) to your own [cloud storage location](https://www.cockroachlabs.com/docs/{{site.current_cloud_version}}/use-cloud-storage).
I think I understand why we don't offer backing up to local storage as an option, but there's nothing that says they have to save the backup to cloud storage, right?
They could conceivably back up to their own storage, provided it's accessible from cloud. Not sure the distinction between that and "your own cloud storage location" is super-sharp? But y'alls department, obviously.
Here, I changed this as the language on this page was really stilted. I didn't think about other storage tbh, just automatically went with cloud storage as that is what we recommend.
New language looks great to me. Some comments. Please wait also for @stevendanna's review for accuracy - thank you!
@@ -0,0 +1 @@
+CockroachDB {{ site.data.products.serverless }} clusters operate with a [different architecture]({% link cockroachcloud/architecture.md %}#cockroachdb-serverless) compared to CockroachDB {{ site.data.products.core }} and CockroachDB {{ site.data.products.dedicated }} clusters. These architectural differences have implications for how locality-aware backups can run. Serverless clusters will scale resources depending on whether they are actively in use, which means that it is less likely to have a SQL pod available in every locality. As a result, Serverless clusters are more likely to have ranges that do not match with any of the cluster's localities, which can lead to more ranges backed up to a storage bucket in a different locality. You should consider this as you plan a backup strategy that must comply with [data domiciling]({% link v23.2/data-domiciling.md %}) requirements.
Would this fit better in Cloud documentation? Does "Serverless" apply to self-hosted at all? (I don't think it does, but this is all subtle. Perhaps Steven will correct me.)
Yes, I have added it to the unsupported features page in this PR. But, right now the information architecture for backup/restore in cloud needs some work (which I am working on this Breather Week). I will add this include to that work when there is a good space on the new Cloud docs I am working on.
src/current/_includes/v23.2/backups/serverless-locality-aware.md
 - [Locality-restricted backup execution](#job-coordination-using-the-execution-locality-option): Specify a set of locality filters for a backup job in order to restrict the nodes that can participate in the backup process to that locality. This ensures that the backup job is executed by nodes that meet certain requirements, such as being located in a specific region or having access to a certain storage bucket.

 ### Job coordination and export of locality-aware backups

 When you create a [locality-aware backup]({% link {{ page.version.version }}/take-and-restore-locality-aware-backups.md %}) job, any node in the cluster can [claim the backup job](#job-creation-phase). A successful locality-aware backup job requires that each node in the cluster has access to each storage location. This is because any node in the cluster can claim the job and become the coordinator node. Once each node informs the coordinator node that it has completed exporting the row data, the coordinator will start to write metadata, which involves writing to each locality bucket a partial manifest recording what row data was written to that [storage bucket]({% link {{ page.version.version }}/use-cloud-storage.md %}).

-Every node involved in the backup is responsible for backing up the ranges for which it was the [leaseholder]({% link {{ page.version.version }}/architecture/replication-layer.md %}#leases) at the time the coordinator planned the [distributed backup flow]({% link {{ page.version.version }}/backup-architecture.md %}#resolution-phase). The locality of the node ([configured at node startup]({% link {{ page.version.version }}/cockroach-start.md %}#locality)) exporting the row data determines where the backups files will be placed in a locality-aware backup.
+Every node involved in the backup is responsible for backing up the ranges for which it was the [leaseholder]({% link {{ page.version.version }}/architecture/replication-layer.md %}#leases) at the time the coordinator planned the [distributed backup flow]({% link {{ page.version.version }}/backup-architecture.md %}#resolution-phase).
Note that in 23.2 at least we can do follower-reads so it is now possible that we send the work to any replica. This is a pretty detailed document already, how deep do we want to go?
I think for this one — I would like to open a separate issue. The architecture page has been up for a couple of releases at least now, so there is likely some overhauling to be done here... starting with an audit. Is that OK with you @stevendanna, or do you think we should correct some of these passages in this PR?
Yes, I'm OK with that too.
For reference: https://cockroachlabs.atlassian.net/browse/DOC-9675
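As a rough illustration of the behavior discussed in this thread, the following Python sketch models how the locality of the exporting node decides which bucket receives a backup file. It is a simplified assumption, not the actual implementation: `pick_destination`, the bucket names, and the tier-matching rule are hypothetical, and it ignores the 23.2 follower-read case noted above, where the work may be sent to any replica rather than the leaseholder.

```python
def pick_destination(node_locality: str, destinations: dict[str, str]) -> str:
    """Return the storage URI for files exported by a node.

    node_locality is the comma-separated tier list set at node startup,
    for example "region=us-west,zone=a". destinations maps a single
    key=value locality (or "default") to a storage URI.
    """
    tiers = [tier.strip() for tier in node_locality.split(",") if tier.strip()]
    for tier in tiers:
        if tier in destinations:
            # The exporting node matches this locality, so its files go here.
            return destinations[tier]
    # No tier matched: the files land in the default location, which may
    # sit in a different locality than the data itself.
    return destinations["default"]


# Hypothetical destinations for a two-bucket locality-aware backup.
destinations = {
    "default": "s3://bucket-default",
    "region=us-west": "s3://bucket-west",
}

print(pick_destination("region=us-west,zone=a", destinations))
print(pick_destination("region=eu-central,zone=b", destinations))
```

Under this model, a cluster without a node (or SQL pod) in a given locality sends more files to the default bucket, which is the data domiciling concern raised for Serverless clusters.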
e8f9219 to 065aeb2
LGTM pending a question and some suggestions
-Automated database and table level backups are not supported in CockroachDB {{ site.data.products.serverless }}. However, you can take manual [database and table level backups]({% link cockroachcloud/take-and-restore-customer-owned-backups.md %}#back-up-data) to your own [cloud storage location](https://www.cockroachlabs.com/docs/{{site.current_cloud_version}}/use-cloud-storage).
+Automated database and table level backups are not supported in CockroachDB {{ site.data.products.serverless }}. However, you can take manual [database and table level backups]({% link cockroachcloud/take-and-restore-customer-owned-backups.md %}?filters=cloud#back-up-data) to your own [cloud storage location](https://www.cockroachlabs.com/docs/{{site.current_cloud_version}}/use-cloud-storage).
Fix link to have ?filters=cloud because default is userfile
065aeb2 to c9ff721
bc34ffc to a40b30b
Fixes DOC-9612
This PR adds further detail on locality-aware backups and clarifies some behavior (refer to the links for rendered previews):
Note/question: We do not currently mention locality-aware backups in our data domiciling docs or multi-region docs, so this PR does not change anything there yet.