Allow for passing in the MySQL shutdown timeout #14568

dbussink
Contributor

Today we have a hardcoded timeout of 300 seconds when we call mysqladmin during a MySQL shutdown. This means that, for example, when fast shutdown is disabled for an upgrade, we still kill MySQL after 5 minutes and don't allow for a longer shutdown.

Large tablets with a lot of data can definitely take longer than 5 minutes to shut down, so this hardcoded timeout leads to failures. In that case we can't easily upgrade the tablet and need to do a full replacement from a backup.

By making the timeout configurable, we can give MySQL more time to shut down when we want to do an upgrade.

Related Issue(s)

Fixes #14567

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on the CI
  • Documentation was added or is not required

Contributor

vitess-bot bot commented Nov 21, 2023

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes). New features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test. Enhancements and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsIssue A linked issue is missing for this Pull Request NeedsWebsiteDocsUpdate What it says labels Nov 21, 2023
@@ -84,8 +85,9 @@ func init() {
 	Main.Flags().IntVar(&mysqlPort, "mysql_port", mysqlPort, "MySQL port")
 	Main.Flags().Uint32Var(&tabletUID, "tablet_uid", tabletUID, "Tablet UID")
 	Main.Flags().StringVar(&mysqlSocket, "mysql_socket", mysqlSocket, "Path to the mysqld socket file")
-	Main.Flags().DurationVar(&waitTime, "wait_time", waitTime, "How long to wait for mysqld startup or shutdown")
+	Main.Flags().DurationVar(&waitTime, "wait_time", waitTime, "How long to wait for mysqld startup")
Contributor Author

Fun fact: this config value was not used at all for shutdown, only for startup, so the help text now calls that out. We also really want these two to be separate.

Member

What do you think of fixing this to actually be used during both startup and shutdown? Then you could just pass the desired higher value. Are there downsides to increasing the startup wait time?

Contributor Author

@deepthi I think they are very different concepts? How long you want to wait for startup can be very different from how long you want to wait for shutdown.

Especially in the case that is relevant here for disabling fast shutdown on upgrades, you want to wait a lot longer (say 30 minutes) for shutdown, but waiting that long for startup is entirely unneeded and likely not desired since you want to know earlier if you have a problem starting up MySQL.

That's why I think they really must be separate values; assuming the same value for both doesn't make sense.

Contributor

But it's effectively a wait time for the mysqlctl client command / mysqlctld RPC, no? In that case I also think we should just have a single wait timeout flag. I'm OK with adding another flag, if it's truly necessary. Looks like this is also relevant for other programs though such as vttablet, so having an additional flag there is required and there's IMO less reason to avoid that for mysqlctl[d].

Contributor Author

@dbussink dbussink Nov 29, 2023

> But it's effectively a wait time for the mysqlctl client command / mysqlctld RPC, no?

Not really, I think. The only operation that uses it is the start command. I also would not classify it as an RPC timeout, because it covers the internal "starting MySQL" phase and is not used for RPCs to mysqlctld.

> I'm OK with adding another flag, if it's truly necessary.

I don't see a way without it. Startup is a fundamentally different thing from shutdown and therefore I think they need separate controls.
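
To make the argument for separate controls concrete, here is a minimal sketch with two independent duration flags. It uses the standard library `flag` package rather than the `Main.Flags()` call seen in the diff, so that it stays self-contained. The `shutdown_wait_time` flag name, the `waitConfig` type, and both default values are assumptions for illustration, not the names or defaults chosen in this PR.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// waitConfig keeps the two waits separate so that raising one does not
// silently raise the other. The type is hypothetical.
type waitConfig struct {
	startup  time.Duration
	shutdown time.Duration
}

// parseWaitFlags parses the two flags from args. The defaults below are
// illustrative assumptions only.
func parseWaitFlags(args []string) (waitConfig, error) {
	cfg := waitConfig{startup: 5 * time.Minute, shutdown: 5 * time.Minute}

	fs := flag.NewFlagSet("mysqlctl-sketch", flag.ContinueOnError)
	fs.DurationVar(&cfg.startup, "wait_time", cfg.startup,
		"How long to wait for mysqld startup")
	// Hypothetical flag name for the shutdown side.
	fs.DurationVar(&cfg.shutdown, "shutdown_wait_time", cfg.shutdown,
		"How long to wait for mysqld shutdown")

	if err := fs.Parse(args); err != nil {
		return waitConfig{}, err
	}
	return cfg, nil
}

func main() {
	// An upgrade with fast shutdown disabled: allow a long shutdown while
	// still failing fast if MySQL does not come back up.
	cfg, err := parseWaitFlags([]string{"--shutdown_wait_time=30m"})
	if err != nil {
		panic(err)
	}
	fmt.Println("startup:", cfg.startup, "shutdown:", cfg.shutdown)
}
```

With a single shared flag, the 30-minute value in this example would also apply to startup, which is the behaviour the comment above argues against.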

@github-actions github-actions bot added this to the v19.0.0 milestone Nov 21, 2023
@dbussink dbussink added Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: TabletManager and removed NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsWebsiteDocsUpdate What it says NeedsIssue A linked issue is missing for this Pull Request labels Nov 22, 2023
Contributor

@mattlord mattlord left a comment

Thanks, @dbussink ! I only had some minor comments/suggestions so approving and we can discuss if/as needed.

@@ -96,7 +97,7 @@ func NewMySQLWithMysqld(port int, hostname, dbName string, schemaSQL ...string)
 	}
 	return params, mysqld, func() {
 		ctx := context.Background()
-		_ = mysqld.Teardown(ctx, mycnf, true)
+		_ = mysqld.Teardown(ctx, mycnf, true, 30*time.Second)
Contributor

Worth adding a default const for e2e tests too, IMO.

If we hit this value, will the test and workflow fail? If so, might be worth doubling it to avoid failures when CI is just really slow.

Contributor Author

30 seconds seems long for our tests, but we can increase it if you think that's appropriate? I'll also move it to a constant.
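
The suggestion to move the value into a constant could look roughly like the sketch below. The names `defaultTestShutdownTimeout`, `teardowner`, `fakeMysqld`, and `cleanup` are hypothetical, and the `Teardown` signature is simplified (it drops the `mycnf` argument seen in the diff) so that the example compiles on its own.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// defaultTestShutdownTimeout is a hypothetical shared constant for tests, so
// the value can be raised in one place if CI turns out to be slow.
const defaultTestShutdownTimeout = 30 * time.Second

// teardowner is a simplified stand-in for the type that exposes Teardown.
// It is not the real mysqlctl interface.
type teardowner interface {
	Teardown(ctx context.Context, force bool, shutdownTimeout time.Duration) error
}

// fakeMysqld records the timeout it was asked to use.
type fakeMysqld struct {
	gotTimeout time.Duration
}

func (f *fakeMysqld) Teardown(_ context.Context, _ bool, shutdownTimeout time.Duration) error {
	f.gotTimeout = shutdownTimeout
	return nil
}

// cleanup returns the closure a test helper would hand back to its caller.
func cleanup(m teardowner) func() {
	return func() {
		// The error is ignored here, as in the diff above.
		_ = m.Teardown(context.Background(), true, defaultTestShutdownTimeout)
	}
}

func main() {
	f := &fakeMysqld{}
	cleanup(f)()
	fmt.Println("teardown timeout used:", f.gotTimeout)
}
```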

Signed-off-by: Dirkjan Bussink <[email protected]>
@dbussink dbussink force-pushed the dbussink/allow-configuring-mysql-shutdown-timeout branch from 7c50a4e to 1db6254 Compare November 29, 2023 16:23
@deepthi deepthi merged commit 5655236 into vitessio:main Nov 30, 2023
119 checks passed
@deepthi deepthi deleted the dbussink/allow-configuring-mysql-shutdown-timeout branch November 30, 2023 17:03
ejortegau pushed a commit to slackhq/vitess that referenced this pull request Dec 13, 2023
Labels
Component: TabletManager Type: Enhancement Logical improvement (somewhere between a bug and feature)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: MySQL shutdown needs a more flexible timeout
4 participants