Adjust the definition of min_checkpoint_frequency cdc #19232

kathancox · 2024-12-13T20:29:46Z

Fixes DOC-11981

Small adjustment to the definition of the min_checkpoint_frequency option.
Also italicizes key terms in the technical overview page, which the edited definition links to.

Rendered preview:

https://deploy-preview-19232--cockroachdb-docs.netlify.app/docs/v24.3/create-changefeed.html#min-checkpoint-frequency

github-actions · 2024-12-13T20:30:14Z

Files changed:

netlify · 2024-12-13T20:30:18Z

✅ Deploy Preview for cockroachdb-api-docs canceled.

Name	Link
🔨 Latest commit	`20bd7ed`
🔍 Latest deploy log	https://app.netlify.com/sites/cockroachdb-api-docs/deploys/6764431ade2b420009fb1361

netlify · 2024-12-13T20:30:18Z

✅ Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name	Link
🔨 Latest commit	`20bd7ed`
🔍 Latest deploy log	https://app.netlify.com/sites/cockroachdb-interactivetutorials-docs/deploys/6764431aa9538c000839efd1

netlify · 2024-12-13T20:36:54Z

✅ Netlify Preview

Name	Link
🔨 Latest commit	`20bd7ed`
🔍 Latest deploy log	https://app.netlify.com/sites/cockroachdb-docs/deploys/6764431a68480b0008409a7d
😎 Deploy Preview	https://deploy-preview-19232--cockroachdb-docs.netlify.app

kathancox · 2024-12-13T20:39:49Z

src/current/v23.1/create-changefeed.md

@@ -170,7 +170,7 @@ Option | Value | Description
 <span class="version-tag">New in v23.1:</span> <a name="key-column"></a>`key_column` | `'column'` | Override the key used in [message metadata]({% link {{ page.version.version }}/changefeed-messages.md %}). This changes the key hashed to determine downstream partitions. In sinks that support partitioning by message, CockroachDB uses the [32-bit FNV-1a](https://wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function) hashing algorithm to determine which partition to send to.<br><br>**Note:** `key_column` does not preserve ordering of messages from CockroachDB to the downstream sink, therefore you must also include the [`unordered`](#unordered) option in your changefeed creation statement. It does not affect per-key [ordering guarantees]({% link {{ page.version.version }}/changefeed-messages.md %}#ordering-and-delivery-guarantees) or the output of [`key_in_value`](#key-in-value).<br><br>See the [Define a key to determine the changefeed sink partition](#define-a-key-to-determine-the-changefeed-sink-partition) example.
 <a name="key-in-value"></a>`key_in_value` | N/A | Add a primary key array to the emitted message. This makes the [primary key]({% link {{ page.version.version }}/primary-key.md %}) of a deleted row recoverable in sinks where each message has a value but not a key (most have a key and value in each message). `key_in_value` is automatically used for [cloud storage sinks]({% link {{ page.version.version }}/changefeed-sinks.md %}#cloud-storage-sink), [webhook sinks]({% link {{ page.version.version }}/changefeed-sinks.md %}#webhook-sink), and [GC Pub/Sub sinks]({% link {{ page.version.version }}/changefeed-sinks.md %}#google-cloud-pub-sub).
 `metrics_label` | [`STRING`]({% link {{ page.version.version }}/string.md %}) | Define a metrics label to which the metrics for one or multiple changefeeds increment. All changefeeds also have their metrics aggregated.<br><br>The maximum length of a label is 128 bytes. There is a limit of 1024 unique labels.<br><br>`WITH metrics_label=label_name` <br><br>For more detail on usage and considerations, see [Using changefeed metrics labels]({% link {{ page.version.version }}/monitor-and-debug-changefeeds.md %}#using-changefeed-metrics-labels).
-<a name="min-checkpoint-frequency"></a>`min_checkpoint_frequency` | [Duration string](https://pkg.go.dev/time#ParseDuration) | Controls how often nodes flush their progress to the [coordinating changefeed node]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}). Changefeeds will wait for at least the specified duration before a flush to the sink. This can help you control the flush frequency of higher latency sinks to achieve better throughput. However, more frequent checkpointing can increase CPU usage. If this is set to `0s`, a node will flush messages as long as the high-water mark has increased for the ranges that particular node is processing. If a changefeed is resumed, then `min_checkpoint_frequency` is the amount of time that changefeed will need to catch up. That is, it could emit [duplicate messages]({% link {{ page.version.version }}/changefeed-messages.md %}#duplicate-messages) during this time. <br><br>**Note:** [`resolved`](#resolved-option) messages will not be emitted more frequently than the configured `min_checkpoint_frequency` (but may be emitted less frequently). If you require `resolved` messages more frequently than `30s`, you must configure `min_checkpoint_frequency` to at least the desired `resolved` message frequency. For more details, refer to [Resolved message frequency]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-timestamp-frequency).<br><br>**Default:** `30s`
+<a name="min-checkpoint-frequency"></a>`min_checkpoint_frequency` | [Duration string](https://pkg.go.dev/time#ParseDuration) | Controls how often a node's changefeed [aggregator]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}) will flush their progress to the [coordinating changefeed node]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}). A node's changefeed aggregator will wait at least the specified duration between sending progress updates for the ranges it is watching to the coordinator. This can help you control the flush frequency of higher latency sinks to achieve better throughput. However, more frequent checkpointing can increase CPU usage. If this is set to `0s`, a node will flush messages as long as the high-water mark has increased for the ranges that particular node is processing. If a changefeed is resumed, then `min_checkpoint_frequency` is the amount of time that changefeed will need to catch up. That is, it could emit [duplicate messages]({% link {{ page.version.version }}/changefeed-messages.md %}#duplicate-messages) during this time. <br><br>**Note:** [`resolved`](#resolved-option) messages will not be emitted more frequently than the configured `min_checkpoint_frequency` (but may be emitted less frequently). If you require `resolved` messages more frequently than `30s`, you must configure `min_checkpoint_frequency` to at least the desired `resolved` message frequency. For more details, refer to [Resolved message frequency]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-timestamp-frequency).<br><br>**Default:** `30s`


A little hard to see, but the sentences that changed in each version go from:

Controls how often nodes flush their progress to the [coordinating changefeed node]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}). Changefeeds will wait for at least the specified duration before a flush to the sink.

To:

Controls how often a node's changefeed [aggregator]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}) will flush their progress to the [coordinating changefeed node]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}). A node's changefeed aggregator will wait at least the specified duration between sending progress updates for the ranges it is watching to the coordinator.
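
To make the revised wording concrete, here is a minimal sketch of the "wait at least the specified duration between progress updates" behavior. It is an illustrative model only, not CockroachDB's implementation: the `AggregatorThrottle` class, its fields, and the use of time offsets are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class AggregatorThrottle:
    """Hypothetical model of one aggregator's progress reporting (not CockroachDB code)."""

    min_checkpoint_frequency: timedelta
    # Time of the last progress update, as an offset from changefeed start.
    last_report: Optional[timedelta] = None

    def should_report(self, now: timedelta, progress_advanced: bool) -> bool:
        # Nothing to send if the watched ranges have not advanced.
        if not progress_advanced:
            return False
        # Hold back until at least min_checkpoint_frequency has elapsed.
        if (
            self.last_report is not None
            and now - self.last_report < self.min_checkpoint_frequency
        ):
            return False
        self.last_report = now
        return True


# With a 30s setting, progress observed every few seconds is reported
# at most once per 30s window.
throttle = AggregatorThrottle(timedelta(seconds=30))
ticks = [0, 10, 30, 45, 60]
reports = [t for t in ticks if throttle.should_report(timedelta(seconds=t), True)]

# With 0s, every tick where progress advanced is reported.
eager = AggregatorThrottle(timedelta(seconds=0))
eager_reports = [t for t in ticks if eager.should_report(timedelta(seconds=t), True)]
```

In this model the setting is a lower bound on the gap between updates, which is why `resolved` messages cannot arrive more often than `min_checkpoint_frequency` but may arrive less often.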

kathancox · 2024-12-13T20:40:37Z

src/current/v23.1/how-does-an-enterprise-changefeed-work.md

@@ -11,7 +11,7 @@ When an {{ site.data.products.enterprise }} changefeed is started on a node, tha
 {% include {{ page.version.version }}/cdc/work-distribution-setting.md %}
 {{site.data.alerts.end}}

-Each node uses its aggregator processors to send back checkpoint progress to the coordinator, which gathers this information to update the high-water mark timestamp. The high-water mark acts as a checkpoint for the changefeed’s job progress, and guarantees that all changes before (or at) the timestamp have been emitted. In the unlikely event that the changefeed’s coordinating node were to fail during the job, that role will move to a different node and the changefeed will restart from the last checkpoint. If restarted, the changefeed may [re-emit messages]({% link {{ page.version.version }}/changefeed-messages.md %}#duplicate-messages) starting at the high-water mark time to the current time. Refer to [Ordering Guarantees]({% link {{ page.version.version }}/changefeed-messages.md %}#ordering-and-delivery-guarantees) for detail on CockroachDB's at-least-once-delivery-guarantee and how per-key message ordering is applied.
+Each node uses its _aggregator processors_ to send back checkpoint progress to the coordinator, which gathers this information to update the _high-water mark timestamp_. The high-water mark acts as a checkpoint for the changefeed’s job progress, and guarantees that all changes before (or at) the timestamp have been emitted. In the unlikely event that the changefeed’s coordinating node were to fail during the job, that role will move to a different node and the changefeed will restart from the last checkpoint. If restarted, the changefeed may [re-emit messages]({% link {{ page.version.version }}/changefeed-messages.md %}#duplicate-messages) starting at the high-water mark time to the current time. Refer to [Ordering Guarantees]({% link {{ page.version.version }}/changefeed-messages.md %}#ordering-and-delivery-guarantees) for detail on CockroachDB's at-least-once-delivery-guarantee and how per-key message ordering is applied.


Just a small change to italicize key words on this technical overview page.
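
For readers new to the two italicized terms, the relationship can be sketched roughly as follows. This assumes the coordinator treats the high-water mark as the minimum of the progress reported by all aggregators, which is one way to satisfy the "all changes before (or at) the timestamp have been emitted" guarantee; the `Coordinator` class and integer timestamps are hypothetical and do not come from the CockroachDB codebase.

```python
from typing import Dict, Iterable


class Coordinator:
    """Hypothetical coordinator that tracks per-aggregator checkpoint progress."""

    def __init__(self, aggregators: Iterable[str]) -> None:
        # Every aggregator starts with no reported progress (timestamp 0).
        self.progress: Dict[str, int] = {name: 0 for name in aggregators}
        self.high_water = 0

    def report(self, aggregator: str, timestamp: int) -> int:
        # Record the aggregator's progress; never move it backwards.
        self.progress[aggregator] = max(self.progress[aggregator], timestamp)
        # Assumption: the high-water mark is the slowest aggregator's progress,
        # so everything at or before it has been emitted by every aggregator.
        self.high_water = max(self.high_water, min(self.progress.values()))
        return self.high_water


coordinator = Coordinator(["n1", "n2", "n3"])
marks = [
    coordinator.report("n1", 100),  # n2 and n3 have not reported yet
    coordinator.report("n2", 80),   # n3 has still not reported
    coordinator.report("n3", 90),   # slowest aggregator is now n2 at 80
    coordinator.report("n2", 120),  # slowest aggregator is now n3 at 90
]
```

Under this model, a restart from the last checkpoint resumes at `high_water`, so anything an aggregator emitted after that timestamp may be emitted again, consistent with the duplicate-message note in the paragraph above.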

rharding6373

LGTM! Thank you for clarifying this.

rmloveland

LGTM

kathancox force-pushed the min-checkpoint-redraft branch from f56e904 to cdc7ba1 Compare December 13, 2024 20:38

kathancox marked this pull request as ready for review December 13, 2024 20:43

kathancox requested a review from rharding6373 December 13, 2024 20:45

rharding6373 approved these changes Dec 17, 2024

kathancox requested a review from rmloveland December 17, 2024 17:58

rmloveland approved these changes Dec 18, 2024

kathancox force-pushed the min-checkpoint-redraft branch 2 times, most recently from 0642cbb to cdc7ba1 Compare December 19, 2024 15:50

Adjust the definition of min_checkpoint_frequency cdc

20bd7ed

kathancox force-pushed the min-checkpoint-redraft branch from cdc7ba1 to 20bd7ed Compare December 19, 2024 16:00

kathancox merged commit 6a20a8c into main Dec 19, 2024
6 checks passed

kathancox deleted the min-checkpoint-redraft branch December 19, 2024 16:07
