Add guidance on freeing up disk space more quickly (#18235)
Fixes:

- DOC-8242
- DOC-9641

NB. Ported to v23.1 and v23.2 (did not port to v22.2 because patches did
not apply)
rmloveland authored Feb 16, 2024
1 parent b77218e commit a0dd607
Showing 25 changed files with 189 additions and 7 deletions.
1 change: 1 addition & 0 deletions src/current/_includes/v23.1/storage/free-up-disk-space.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
For instructions on how to free up disk space as quickly as possible after deleting data, see [How can I free up disk space quickly?]({% link {{ page.version.version }}/operational-faqs.md %}#how-can-i-free-up-disk-space-quickly).
@@ -1 +1 @@
**Expected values for a healthy cluster**: Used capacity should not persistently exceed 80% of the total capacity.
**Expected values for a healthy cluster**: Used capacity should not persistently exceed 80% of the total capacity.
1 change: 1 addition & 0 deletions src/current/_includes/v23.2/storage/free-up-disk-space.md
@@ -0,0 +1 @@
For instructions on how to free up disk space as quickly as possible after deleting data, see [How can I free up disk space quickly?]({% link {{ page.version.version }}/operational-faqs.md %}#how-can-i-free-up-disk-space-quickly).
2 changes: 1 addition & 1 deletion src/current/v23.1/cockroach-debug-ballast.md
@@ -11,7 +11,7 @@ The `cockroach debug ballast` [command]({% link {{ page.version.version }}/cockr

- Do not run `cockroach debug ballast` with a unix `root` user. Doing so brings the risk of mistakenly affecting system directories or files.
- `cockroach debug ballast` now refuses to overwrite the target ballast file if it already exists. This change is intended to prevent mistaken uses of the `ballast` command. Consider adding an `rm` command to scripts that integrate `cockroach debug ballast`, or provide a new file name every time and then remove the old file.
- In addition to placing a ballast file in each node's storage directory, it is important to actively [monitor remaining disk space]({% link {{ page.version.version }}/monitoring-and-alerting.md %}#events-to-alert-on).
- In addition to placing a ballast file in each node's storage directory, it is important to actively [monitor remaining disk space]({% link {{ page.version.version }}/monitoring-and-alerting.md %}#node-is-running-low-on-disk-space).
- Ballast files may be created in many ways, including the standard `dd` command. `cockroach debug ballast` uses the `fallocate` system call when available, so it will be faster than `dd`.
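
Preallocation itself is simple; a minimal Python sketch of the idea (the `create_ballast` helper, file name, and 1 MiB size are illustrative, not the real `cockroach debug ballast` implementation):

```python
import os

def create_ballast(path, size_bytes):
    # Refuse to overwrite an existing ballast file, mirroring the
    # safety behavior of `cockroach debug ballast` described above.
    if os.path.exists(path):
        raise FileExistsError(f"ballast file already exists: {path}")
    with open(path, "wb") as f:
        # Writing real zero bytes guarantees the blocks are allocated;
        # fallocate(2), when available, reserves space without writing
        # and is therefore much faster for large ballast files.
        f.write(b"\0" * size_bytes)

create_ballast("ballast.tmp", 1 << 20)  # 1 MiB placeholder
print(os.path.getsize("ballast.tmp"))   # 1048576
```

As the notes above suggest, remove the old file (or pick a fresh name) before recreating a ballast.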

## Subcommands
2 changes: 2 additions & 0 deletions src/current/v23.1/common-issues-to-monitor.md
@@ -277,6 +277,8 @@ CockroachDB requires disk space in order to accept writes and report node livene
Ensure that you [provision sufficient storage]({% link {{ page.version.version }}/recommended-production-settings.md %}#storage). If storage is correctly provisioned and is running low, CockroachDB automatically creates an emergency ballast file that can free up space. For details, see [Disks filling up]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#disks-filling-up).
{{site.data.alerts.end}}

{% include {{page.version.version}}/storage/free-up-disk-space.md %}

#### Disk IOPS

Insufficient disk I/O can cause [poor SQL performance](#service-latency) and potentially [disk stalls]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#disk-stalls).
2 changes: 2 additions & 0 deletions src/current/v23.1/delete.md
@@ -58,6 +58,8 @@ the zone by setting `gc.ttlseconds` to a lower value, which will cause
garbage collection to clean up deleted objects (rows, tables) more
frequently.

{% include {{page.version.version}}/storage/free-up-disk-space.md %}

## Select performance on deleted rows

Queries that scan across tables that have lots of deleted rows will
2 changes: 1 addition & 1 deletion src/current/v23.1/import.md
@@ -157,7 +157,7 @@ Imported tables are treated as new tables, so you must [`GRANT`]({% link {{ page
- All nodes are used during the import job, which means all nodes' CPU and RAM will be partially consumed by the `IMPORT` task in addition to serving normal traffic.
- To improve performance, import at least as many files as you have nodes (i.e., there is at least one file for each node to import) to increase parallelism.
- To further improve performance, order the data in the imported files by [primary key]({% link {{ page.version.version }}/primary-key.md %}) and ensure the primary keys do not overlap between files.
- An import job will pause if a node in the cluster runs out of disk space. See [Viewing and controlling import jobs](#viewing-and-controlling-import-jobs) for information on resuming and showing the progress of import jobs.
- An import job will pause if a node in the cluster runs out of disk space. See [Viewing and controlling import jobs](#viewing-and-controlling-import-jobs) for information on resuming and showing the progress of import jobs. {% include {{page.version.version}}/storage/free-up-disk-space.md %}
- An import job will [pause]({% link {{ page.version.version }}/pause-job.md %}) instead of entering a `failed` state if it continues to encounter transient errors once it has retried a maximum number of times. Once the import has paused, you can either [resume]({% link {{ page.version.version }}/resume-job.md %}) or [cancel]({% link {{ page.version.version }}/cancel-job.md %}) it.

For more detail on optimizing import performance, see [Import Performance Best Practices]({% link {{ page.version.version }}/import-performance-best-practices.md %}).
2 changes: 2 additions & 0 deletions src/current/v23.1/monitoring-and-alerting.md
@@ -985,6 +985,8 @@ Currently, not all events listed have corresponding alert rule definitions avail

- **Rule definition:** Use the `StoreDiskLow` alert from our <a href="https://github.com/cockroachdb/cockroach/blob/master/monitoring/rules/alerts.rules.yml">pre-defined alerting rules</a>.

{% include {{page.version.version}}/storage/free-up-disk-space.md %}

#### Node is not executing SQL

- **Rule:** Send an alert when a node is not executing SQL despite having connections.
74 changes: 74 additions & 0 deletions src/current/v23.1/operational-faqs.md
@@ -47,6 +47,78 @@ or about 6 GiB. With on-disk compression, the actual disk usage is likely to be

However, depending on your usage of time-series charts in the [DB Console]({% link {{ page.version.version }}/ui-overview-dashboard.md %}), you may prefer to reduce the amount of disk used by time-series data. To reduce the amount of time-series data stored, or to disable it altogether, refer to [Can I reduce or disable the storage of time-series data?](#can-i-reduce-or-disable-the-storage-of-time-series-data)

## Why is my disk usage not decreasing after deleting data?

{% comment %}
The below is a lightly edited version of https://stackoverflow.com/questions/74481018/why-is-my-cockroachdb-disk-usage-not-decreasing
{% endcomment %}

There are several reasons why disk usage may not decrease right after deleting data:

- [The data could be preserved for MVCC history.](#the-data-could-be-preserved-for-mvcc-history)
- [The data could be in the process of being compacted.](#the-data-could-be-in-the-process-of-being-compacted)

{% include {{page.version.version}}/storage/free-up-disk-space.md %}

### The data could be preserved for MVCC history

CockroachDB implements [Multi-Version Concurrency Control (MVCC)]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc), which means that it maintains a history of all mutations to a row. This history is used for a wide range of functionality: [transaction isolation]({% link {{ page.version.version }}/transactions.md %}#isolation-levels), historical [`AS OF SYSTEM TIME`]({% link {{ page.version.version }}/as-of-system-time.md %}) queries, [incremental backups]({% link {{ page.version.version }}/take-full-and-incremental-backups.md %}), [changefeeds]({% link {{ page.version.version }}/create-and-configure-changefeeds.md %}), [cluster replication]({% link {{ page.version.version }}/architecture/replication-layer.md %}), and so on. The requirement to preserve history means that CockroachDB "soft deletes" data: The data is marked as deleted by a tombstone record so that CockroachDB will no longer surface the deleted rows to queries, but the old data is still present on disk.

The length of history preserved by MVCC is determined by two things: the [`gc.ttlseconds`]({% link {{ page.version.version }}/configure-replication-zones.md %}#gc-ttlseconds) of the zone that contains the data, and whether any [protected timestamps]({% link {{ page.version.version }}/architecture/storage-layer.md %}#protected-timestamps) exist. You can check the range's statistics to observe the `key_bytes`, `value_bytes`, and `live_bytes`. The `live_bytes` metric reflects data that's not garbage. The value of `(key_bytes + value_bytes) - live_bytes` will tell you how much MVCC garbage is resident within a range.

This information can be accessed in the following ways:

- Using the [`SHOW RANGES`]({% link {{ page.version.version }}/show-ranges.md %}) SQL statement, which lists the above values under the names `live_bytes`, `key_bytes`, and `val_bytes`.
- In the DB Console, under [**Advanced Debug Page > Even more Advanced Debugging**]({% link {{ page.version.version }}/ui-debug-pages.md %}#even-more-advanced-debugging), click the **Range Status** link, which takes you to a page where the values are displayed in a tabular format like the following: `MVCC Live Bytes/Count | 2.5 KiB / 62 count`.
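
The garbage arithmetic described above can be sketched in Python (the sample byte counts are illustrative, not from a real cluster):

```python
def mvcc_garbage_bytes(key_bytes, value_bytes, live_bytes):
    # live_bytes counts only non-garbage data, so the remainder of the
    # range's key/value payload is MVCC garbage awaiting collection.
    return (key_bytes + value_bytes) - live_bytes

print(mvcc_garbage_bytes(key_bytes=4096, value_bytes=16384, live_bytes=2560))  # 17920
```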

When data has been deleted for at least the duration specified by [`gc.ttlseconds`]({% link {{ page.version.version }}/configure-replication-zones.md %}#gc-ttlseconds), CockroachDB considers it eligible for garbage collection. Garbage collection then runs asynchronously on ranges that contain significant quantities of garbage. Note that backups or other processes that still require the data can block its garbage collection by setting a protected timestamp until they complete.
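
A simplified model of this eligibility rule, with timestamps as plain seconds (real protected-timestamp semantics are more involved than this sketch):

```python
def gc_eligible(deleted_at, now, gc_ttl, protected_ts=()):
    # Data must have been deleted for at least gc.ttlseconds...
    if now - deleted_at < gc_ttl:
        return False
    # ...and no protected timestamp may still need the pre-delete
    # version. Simplified: any protection at or before the deletion
    # time blocks collection.
    return all(p > deleted_at for p in protected_ts)

print(gc_eligible(deleted_at=0, now=90_000, gc_ttl=86_400))                    # True
print(gc_eligible(deleted_at=0, now=90_000, gc_ttl=86_400, protected_ts=[0]))  # False
```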

For more information about how MVCC works, see [MVCC]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc).

### The data could be in the process of being compacted

When MVCC garbage is deleted by garbage collection, the data is still not physically removed from the filesystem by the [Storage Layer]({% link {{ page.version.version }}/architecture/storage-layer.md %}). Removing data from the filesystem requires rewriting the files that contain it, a process known as [compaction]({% link {{ page.version.version }}/architecture/storage-layer.md %}#compaction), which can be expensive. The storage engine uses heuristics to compact data and remove deleted rows once enough garbage has accumulated to warrant a compaction. It strives to keep the overhead of obsolete data (known as space amplification) to at most 10%. If a lot of data was just deleted, the storage engine may need some time to compact the files and restore this property.
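
The ~10% space-amplification target can be illustrated with a toy check (the helper and threshold are illustrative; the engine's actual compaction heuristics are richer):

```python
def needs_compaction(live_bytes, on_disk_bytes, max_space_amp=1.10):
    # Compaction is warranted once obsolete data pushes the on-disk
    # footprint past ~110% of the live data it represents.
    return on_disk_bytes > live_bytes * max_space_amp

print(needs_compaction(live_bytes=100, on_disk_bytes=150))  # True: lots of garbage
print(needs_compaction(live_bytes=100, on_disk_bytes=105))  # False: within the 10% budget
```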

{% include {{page.version.version}}/storage/free-up-disk-space.md %}

## How can I free up disk space quickly?

If you've noticed that [your disk space is not freeing up quickly enough after deleting data](#why-is-my-disk-usage-not-decreasing-after-deleting-data), you can take the following steps to free up disk space more quickly. This example assumes a table `t`.

1. Lower the [`gc.ttlseconds` parameter]({% link {{ page.version.version }}/configure-replication-zones.md %}#gc-ttlseconds) to 10 minutes.

{% include_cached copy-clipboard.html %}
~~~ sql
ALTER TABLE t CONFIGURE ZONE USING gc.ttlseconds = 600;
~~~

1. Find the IDs of the [ranges]({% link {{ page.version.version }}/architecture/overview.md %}#architecture-range) storing the table data using [`SHOW RANGES`]({% link {{ page.version.version }}/show-ranges.md %}):

{% include_cached copy-clipboard.html %}
~~~ sql
SELECT range_id FROM [SHOW RANGES FROM TABLE t];
~~~

~~~
range_id
------------
68
69
70
...
~~~

1. Drop the table using [`DROP TABLE`]({% link {{ page.version.version }}/drop-table.md %}):

{% include_cached copy-clipboard.html %}
~~~ sql
DROP TABLE t;
~~~

1. Visit the [Advanced Debug page]({% link {{ page.version.version }}/ui-debug-pages.md %}) and click the link **Run a range through an internal queue** to visit the **Manually enqueue range in a replica queue** page. On this page, select **mvccGC** from the **Queue** dropdown and enter each range ID from the previous step. Check the **SkipShouldQueue** checkbox to speed up the MVCC [garbage collection]({% link {{ page.version.version }}/architecture/storage-layer.md %}#garbage-collection) process.

1. Monitor GC progress in the DB Console by watching the [MVCC GC Queue]({% link {{ page.version.version }}/ui-queues-dashboard.md %}#mvcc-gc-queue) and the overall disk space used as shown on the [Overview Dashboard]({% link {{ page.version.version }}/ui-overview-dashboard.md %}).
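
The SQL side of steps 1 through 3 can be scripted. A hypothetical helper that builds the statements for a given table (steps 4 and 5 remain manual DB Console actions):

```python
def quick_gc_statements(table, ttl_seconds=600):
    # Statements for lowering gc.ttlseconds, listing range IDs, and
    # dropping the table, in the order given in the steps above.
    return [
        f"ALTER TABLE {table} CONFIGURE ZONE USING gc.ttlseconds = {ttl_seconds};",
        f"SELECT range_id FROM [SHOW RANGES FROM TABLE {table}];",
        f"DROP TABLE {table};",
    ]

for stmt in quick_gc_statements("t"):
    print(stmt)
```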

## What is the `internal-delete-old-sql-stats` process and why is it consuming my resources?

When a query is executed, a process records query execution statistics on system tables. This is done by recording [SQL statement fingerprints]({% link {{ page.version.version }}/ui-statements-page.md %}).
@@ -148,6 +220,8 @@ For more information about troubleshooting disk usage issues, see [storage issue
In addition to using ballast files, it is important to actively [monitor remaining disk space]({% link {{ page.version.version }}/common-issues-to-monitor.md %}#storage-capacity).
{{site.data.alerts.end}}

{% include {{page.version.version}}/storage/free-up-disk-space.md %}

## Why would increasing the number of nodes not result in more operations per second?

If queries operate on different data, then increasing the number of nodes should improve the overall throughput (transactions/second or QPS).
2 changes: 2 additions & 0 deletions src/current/v23.1/query-replication-reports.md
@@ -513,6 +513,8 @@ SELECT DISTINCT * FROM report;

To give another example, let's say your cluster were similar to the one shown above, but configured with [tiered localities]({% link {{ page.version.version }}/cockroach-start.md %}#locality) such that you had split `us-east1` into `{region=us-east1,dc=dc1, region=us-east1,dc=dc2, region=us-east1,dc=dc3}`. In that case, you wouldn't expect any DC to be critical, because the cluster would "diversify" each range's location as much as possible across data centers. In such a situation, if you were to see a DC identified as a critical locality, you'd be surprised and you'd take some action. For example, perhaps the diversification process is failing because some localities are filled to capacity. If there is no disk space free in a locality, your cluster cannot move replicas there.

{% include {{page.version.version}}/storage/free-up-disk-space.md %}

## See also

- [Replication Controls]({% link {{ page.version.version }}/configure-replication-zones.md %})
4 changes: 4 additions & 0 deletions src/current/v23.1/recommended-production-settings.md
@@ -146,6 +146,10 @@ We recommend provisioning volumes with {% include {{ page.version.version }}/pro
Under-provisioning storage leads to node crashes when the disks fill up. Once this has happened, it is difficult to recover from. To prevent your disks from filling up, provision enough storage for your workload, monitor your disk usage, and use a [ballast file]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#automatic-ballast-files). For more information, see [capacity planning issues]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#capacity-planning-issues) and [storage issues]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#storage-issues).
{{site.data.alerts.end}}

{{site.data.alerts.callout_success}}
{% include {{page.version.version}}/storage/free-up-disk-space.md %}
{{site.data.alerts.end}}

##### Disk I/O

Disks must be able to achieve {% include {{ page.version.version }}/prod-deployment/provision-disk-io.md %}.
2 changes: 1 addition & 1 deletion src/current/v23.1/restore.md
@@ -254,7 +254,7 @@ CockroachDB does **not** support incremental-only restores.

- The `RESTORE` process minimizes its impact on the cluster's performance by distributing work to all nodes. Subsets of the restored data (known as ranges) are evenly distributed among randomly selected nodes, with each range initially restored to only one node. Once the range is restored, the node begins replicating it to others.
- When a `RESTORE` fails or is canceled, partially restored data is properly cleaned up. This can have a minor, temporary impact on cluster performance.
- A restore job will pause if a node in the cluster runs out of disk space. See [Viewing and controlling restore jobs](#viewing-and-controlling-restore-jobs) for information on resuming and showing the progress of restore jobs.
- A restore job will pause if a node in the cluster runs out of disk space. See [Viewing and controlling restore jobs](#viewing-and-controlling-restore-jobs) for information on resuming and showing the progress of restore jobs. {% include {{page.version.version}}/storage/free-up-disk-space.md %}
- A restore job will [pause]({% link {{ page.version.version }}/pause-job.md %}) instead of entering a `failed` state if it continues to encounter transient errors once it has retried a maximum number of times. Once the restore has paused, you can either [resume]({% link {{ page.version.version }}/resume-job.md %}) or [cancel]({% link {{ page.version.version }}/cancel-job.md %}) it.

## Restoring to multi-region databases
2 changes: 2 additions & 0 deletions src/current/v23.1/ui-cluster-overview-page.md
@@ -43,6 +43,8 @@ If a node is currently unavailable, the last-known capacity usage will be shown,
{% include {{ page.version.version }}/misc/available-capacity-metric.md %}
{{site.data.alerts.end}}

{% include {{page.version.version}}/storage/free-up-disk-space.md %}

## Node List

The **Node List** groups nodes by locality. The lowest-level locality tier is used to organize the Node List. Hover over a locality to see all localities for the group of nodes.
2 changes: 2 additions & 0 deletions src/current/v23.1/ui-storage-dashboard.md
@@ -27,6 +27,8 @@ Metric | Description

{% include {{ page.version.version }}/prod-deployment/healthy-storage-capacity.md %}

{% include {{page.version.version}}/storage/free-up-disk-space.md %}

### Capacity metrics

The **Capacity** graph displays disk usage by CockroachDB data in relation to the maximum [store]({% link {{ page.version.version }}/architecture/storage-layer.md %}) size, which is determined as follows:
2 changes: 1 addition & 1 deletion src/current/v23.2/cockroach-debug-ballast.md
@@ -11,7 +11,7 @@ The `cockroach debug ballast` [command]({% link {{ page.version.version }}/cockr

- Do not run `cockroach debug ballast` with a unix `root` user. Doing so brings the risk of mistakenly affecting system directories or files.
- `cockroach debug ballast` now refuses to overwrite the target ballast file if it already exists. This change is intended to prevent mistaken uses of the `ballast` command. Consider adding an `rm` command to scripts that integrate `cockroach debug ballast`, or provide a new file name every time and then remove the old file.
- In addition to placing a ballast file in each node's storage directory, it is important to actively [monitor remaining disk space]({% link {{ page.version.version }}/monitoring-and-alerting.md %}#events-to-alert-on).
- In addition to placing a ballast file in each node's storage directory, it is important to actively [monitor remaining disk space]({% link {{ page.version.version }}/monitoring-and-alerting.md %}#node-is-running-low-on-disk-space).
- Ballast files may be created in many ways, including the standard `dd` command. `cockroach debug ballast` uses the `fallocate` system call when available, so it will be faster than `dd`.

## Subcommands
2 changes: 2 additions & 0 deletions src/current/v23.2/common-issues-to-monitor.md
@@ -281,6 +281,8 @@ CockroachDB requires disk space in order to accept writes and report node livene
Ensure that you [provision sufficient storage]({% link {{ page.version.version }}/recommended-production-settings.md %}#storage). If storage is correctly provisioned and is running low, CockroachDB automatically creates an emergency ballast file that can free up space. For details, see [Disks filling up]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#disks-filling-up).
{{site.data.alerts.end}}

{% include {{page.version.version}}/storage/free-up-disk-space.md %}

#### Disk IOPS

Insufficient disk I/O can cause [poor SQL performance](#service-latency) and potentially [disk stalls]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#disk-stalls).
2 changes: 2 additions & 0 deletions src/current/v23.2/delete.md
@@ -58,6 +58,8 @@ the zone by setting `gc.ttlseconds` to a lower value, which will cause
garbage collection to clean up deleted objects (rows, tables) more
frequently.

{% include {{page.version.version}}/storage/free-up-disk-space.md %}

## Select performance on deleted rows

Queries that scan across tables that have lots of deleted rows will
