SPIRE Server should try to auto-rotate on increased workload TTLs config #5530

Open
amoore877 opened this issue Oct 1, 2024 · 0 comments
Labels
triage/in-progress Issue triage is in progress

Steps:

  1. Start spire-server with some default_x509_svid_ttl X and a reasonable server CA TTL Y. spire-server should also be journaling its CAs under .data. For specific values, let's say X is 2 days and Y is 24 days.
  2. spire-server should come up healthy.
  3. Shut down spire-server, raise default_x509_svid_ttl and the server CA TTL to much higher values, and start it again. For specifics, let's say 14 days and 168 days, respectively.

Reality:

spire-server starts up fine, but prints warnings like "default_x509_svid_ttl is too high. SVIDs with shorter lifetimes may be issued. Please set default_x509_svid_ttl to ${internal_value} or less to guarantee the full default_x509_svid_ttl lifetime when CA rotations are scheduled."

spire-server is now in a state where it may truncate SVID TTLs until it naturally rotates its CA, which could be a while, especially if the next CA was already prepared under the shorter CA TTL.

Expectation:

When spire-server's CA TTL config is extended, this should be detected by spire-server. It should make an effort to:

  1. Prepare a next CA (even if one already exists) using the new TTL.
  2. On success of the above, rotate to that new CA.
  3. Until both steps above succeed, we are still issuing SVIDs under the old, shorter-lived CA. A further, but likely more difficult, enhancement would be to flag those SVIDs for re-signing within the ecosystem in some way, though this may not be strictly necessary.

Result:

With (1) and (2) at least, spire-server will more quickly bring its own CA in line with its configuration, and thus leaf SVIDs will be signed with their new, potentially longer TTLs sooner as well.

(3) would be significant work and may not be worthwhile. But I think the rest may be somewhat straightforward?

Workaround:

Purge spire-server's .data directory and restart.

rturner3 added the triage/in-progress label on Oct 3, 2024