-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TableGC: speed up GC process via RequestChecks()
. Utilized by Online DDL for artifact cleanup
#14431
TableGC: speed up GC process via RequestChecks()
. Utilized by Online DDL for artifact cleanup
#14431
Conversation
… CLEANUP process Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
|
Signed-off-by: Shlomi Noach <[email protected]>
…d's validateAnyState() Signed-off-by: Shlomi Noach <[email protected]>
endtoend test fixed. |
Request for additional review 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only had minor comments so will let you resolve as you feel best.
I like this! I think it will make the behavior more intuitive for users based on related interactions I've had. Thanks!
// ctx, cancel := context.WithTimeout(ctx, tableTransitionExpiration+gc.CheckTablesReentryMinInterval) | ||
// defer cancel() | ||
|
||
// ticker := time.NewTicker(time.Second) | ||
// defer ticker.Stop() | ||
|
||
// for { | ||
// select { | ||
|
||
// case <-ctx.Done(): | ||
// case <-ticker.C: | ||
// } | ||
// } | ||
// time.Sleep(tableTransitionExpiration + gc.CheckTablesReentryMinInterval) | ||
// We're now both beyond table's timestamp as well as a tableGC interval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to keep this commented code around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed.
go/timer/suspendable_ticker_test.go
Outdated
@@ -0,0 +1,140 @@ | |||
/* | |||
Copyright 2020 The Vitess Authors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use 2023 here.
go/timer/suspendable_ticker_test.go
Outdated
func TestSuspendableTicker(t *testing.T) { | ||
ctx, cancel := context.WithCancel(context.Background()) | ||
defer cancel() | ||
ticker := NewSuspendableTicker(10*time.Millisecond, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it would be nicer to make this a variable and then we can use that as the base for later sleeps and asserts (*3 e.g.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic seems correct but I suspect we'll see some flakes in CI. Have we re-run this several times in a row to be sure it's at least mostly reliable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have.
collector.removePurgingTable(tableName) | ||
// Chances are, there's more tables waiting to be purged. Let's speed things by | ||
// requesting another purge, instead of waiting a full purgeReentranceInterval cycle | ||
purgeReentranceTicker.TickAfter(time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should time.Second be a variable? I'm unsure why we're using that specific value here. I think it's a minimum we may use in various places over time, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now a variable. And you're right, it's a minimum; first and foremost because it ensures the table name signature, which is partly composed of a timestamp, gets a unique value over time.
To be fair, the purge logic is going to eventually disappear. It's relevant to 5.7
, and to old 8.0
versions. Eventually we'll retire this.
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
…e DDL for artifact cleanup (vitessio#14431) Signed-off-by: Shlomi Noach <[email protected]>
Description
Normally, the TableGC mechanism checks for GC table statuses once per hour. It will transition tables to their next state, potentially dealing with that new state yet only one hour later. There's incentive to speed up the process. But instead of reducing the interval (causing more load on the
PRIMARY
), we choose to intelligently kick new check routines whenever there's a plausible reason -- or an explicit request.Any time a table changes state, calls for a new cycle that takes place within seconds. A public function,
RequestChecks()
, makes it possible to "inform" the GC mechanism of recent changes that can benefit from GC.To combat load, the check function is non-reentrant, and discards recurring or flooding calls made within a period of 10 seconds (ie. the function only agrees to run once per 10 seconds at most).
This is in turn used by Online DDL's state machine: right after a
ALTER VITESS_MIGRATION ... CLEANUP
(or similar command) is invoked, and assuming an actual artifact is affected, Online DDL calls GC'sRequestChecks()
to speed up the destruction and cleanup of said artifact tables.Related Issue(s)
Fixes #14427
Checklist
Deployment Notes