Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Technical Advisory 104309 #18309

Merged
merged 4 commits into from
Feb 21, 2024
Merged

Technical Advisory 104309 #18309

merged 4 commits into from
Feb 21, 2024

Conversation

mdlinville
Copy link
Contributor

@mdlinville mdlinville commented Feb 20, 2024

Technical Advisory 104309

Preview: src/current/advisories/a104309.md

Copy link

Files changed:

Copy link

netlify bot commented Feb 20, 2024

Deploy Preview for cockroachdb-api-docs canceled.

Name Link
🔨 Latest commit f1ae56d
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-api-docs/deploys/65d541ce4bdfe90008292d56

Copy link

netlify bot commented Feb 20, 2024

Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name Link
🔨 Latest commit f1ae56d
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-interactivetutorials-docs/deploys/65d541ce543f440008a529c8

Copy link

netlify bot commented Feb 20, 2024

Netlify Preview

Name Link
🔨 Latest commit f1ae56d
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-docs/deploys/65d541cef71b190008753ac1
😎 Deploy Preview https://deploy-preview-18309--cockroachdb-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Let me just doublecheck something about crdb_internal_mvcc_timestamp and logical timestamp components here.

src/current/advisories/a104309.md Outdated Show resolved Hide resolved
src/current/advisories/a104309.md Outdated Show resolved Hide resolved
src/current/advisories/a104309.md Outdated Show resolved Hide resolved
src/current/advisories/a104309.md Outdated Show resolved Hide resolved
src/current/advisories/a104309.md Outdated Show resolved Hide resolved
src/current/advisories/a104309.md Outdated Show resolved Hide resolved
src/current/advisories/a104309.md Show resolved Hide resolved
@mdlinville
Copy link
Contributor Author

Nick approved offline.

src/current/advisories/a104309.md Outdated Show resolved Hide resolved
Copy link

@nickelnickel nickelnickel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per discussion in slack, LGTM

@mdlinville mdlinville force-pushed the ta_104309 branch 2 times, most recently from bf5c75e to 090b287 Compare February 20, 2024 22:57

## Description

A [rangefeed](https://www.cockroachlabs.com/docs/stable/create-and-configure-changefeeds#enable-rangefeeds) bug may cause a [checkpoint](https://www.cockroachlabs.com/docs/stable/how-does-an-enterprise-changefeed-work) to be emitted prematurely in rare cases, before all writes below the checkpoint timestamp have been emitted. This could occur if nodes are overloaded or if long-running transactions (several seconds) are involved, such that intent resolution only replicates to a follower running a rangefeed more than 10 seconds after the transaction began. If a cluster using changefeeds experiences this bug, changefeeds will omit these write events, and the following error will be logged:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should link the expression, "write intent" to some reference reading. I think this is probably the best spot. https://www.cockroachlabs.com/docs/v23.2/architecture/transaction-layer#write-intents

Copy link
Contributor

@shannonbradshaw shannonbradshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've suggested a few changes. Feel free to push back if they don't make sense given your additional context.


The specific conditions to trigger the bug are:

1. An explicit or cross-range transaction commits, then asynchronously resolves all of its intents and removes its transaction record on all relevant leaseholders.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicit OR cross-range transactions is a strange subset of all transactions. Is it explicit AND cross-range?

Also, aren't all write intents removed async after the transaction commits during the cleanup stage? The language here just doesn't make sense to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is written reads (to me) as if this behavior is unusual.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be worth clarifying. The intention here is "All explicit transactions and cross-range implicit transactions are affected."

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is that single-range implicit transactions don't write intents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shralex Is the suggested edit accurate? https://github.com/cockroachdb/docs/pull/18309/files#r1496690828 (sorry about the HTML, it's a tooling edge case)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, LGTM!

src/current/advisories/a104309.md Outdated Show resolved Hide resolved

The specific conditions to trigger the bug are:

1. An explicit or cross-range transaction commits, then asynchronously resolves all of its intents and removes its transaction record on all relevant leaseholders.
Copy link
Contributor Author

@mdlinville mdlinville Feb 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. An explicit or cross-range transaction commits, then asynchronously resolves all of its intents and removes its transaction record on all relevant leaseholders.
1. A transaction commits, then asynchronously resolves all of its intents and removes its transaction record on all relevant leaseholders. To be affected by the bug, the transaction must be either an <a href="https://www.cockroachlabs.com/docs/stable/transactions#sql-statements">explicit transaction</a> with `BEGIN` and a `COMMIT` clauses, or an implicit transaction (single-statement write) that involves multiple ranges.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shannonbradshaw Is this more clear?

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM including suggestions, thanks!

@Amruta-Ranade Amruta-Ranade merged commit b74e8da into main Feb 21, 2024
6 checks passed
@Amruta-Ranade Amruta-Ranade deleted the ta_104309 branch February 21, 2024 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants