pageserver: wait for lsn lease duration after transition into AttachedSingle #9024

Merged
merged 17 commits into main on Sep 19, 2024

Conversation

yliang412
Contributor

@yliang412 yliang412 commented Sep 17, 2024

Part of #7497, closes #8890.

Problem

Since leases are in-memory objects, they need special care after a pageserver restart and during a live migration. For pageserver restarts, the approach we took is to wait for at least the lease duration before running the first GC. We want to do the same for live migration. Since no GC runs while a tenant is in AttachedStale or AttachedMulti mode, only the transition from AttachedMulti to AttachedSingle requires this treatment.
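
To make the transition rule concrete, here is a minimal sketch of the check described above. The `AttachmentMode` enum and the `needs_lsn_lease_wait` function are hypothetical stand-ins written for illustration; they are not taken from the pageserver code, which may structure this differently.

```rust
/// Hypothetical, simplified stand-in for the attachment modes named above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AttachmentMode {
    Single,
    Multi,
    Stale,
}

/// Returns true when a mode change should delay the first GC by the lease
/// duration. Per the description, GC does not run in Stale or Multi mode,
/// so only Multi -> Single needs the wait. (Assumed helper, not a real API.)
fn needs_lsn_lease_wait(old: AttachmentMode, new: AttachmentMode) -> bool {
    matches!((old, new), (AttachmentMode::Multi, AttachmentMode::Single))
}
```

With this shape, a caller handling a location config update would compare the old and new modes and, if the function returns true, arm the GC delay.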

Summary of changes

  • Added lsn_lease_deadline field in GcBlock::reasons: the tenant is temporarily blocked from GC until the deadline is reached. This information is not persisted to S3.
  • In GcBlock::start, skip the GC iteration if we are blocked by the lsn lease deadline.
  • In TenantManager::upsert_location, set lsn_lease_deadline to Instant::now() + lsn_lease_length so the granted leases have a chance to be renewed before we run GC for the first time after transitioning from AttachedMulti to AttachedSingle.
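
The deadline mechanism in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: the `Reasons` struct, `set_lsn_lease_deadline`, and `may_start_gc` are hypothetical names, and the real `GcBlock` tracks other block reasons as well. The current time is passed in as a parameter only to keep the sketch easy to test.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical, simplified stand-in for the reasons that block GC.
#[derive(Default)]
struct Reasons {
    /// In-memory only (never persisted): GC is blocked until this instant.
    lsn_lease_deadline: Option<Instant>,
}

/// Hypothetical, simplified stand-in for the GC blocking state.
#[derive(Default)]
struct GcBlock {
    reasons: Mutex<Reasons>,
}

impl GcBlock {
    /// Block GC until `now + lsn_lease_length`, giving existing leases
    /// a chance to be renewed first.
    fn set_lsn_lease_deadline(&self, now: Instant, lsn_lease_length: Duration) {
        let mut reasons = self.reasons.lock().unwrap();
        reasons.lsn_lease_deadline = Some(now + lsn_lease_length);
    }

    /// Returns true if a GC iteration may start at `now`; a caller would
    /// skip the iteration when this returns false.
    fn may_start_gc(&self, now: Instant) -> bool {
        let reasons = self.reasons.lock().unwrap();
        match reasons.lsn_lease_deadline {
            Some(deadline) => now >= deadline,
            None => true,
        }
    }
}
```

Because the deadline lives only in memory, a restart loses it; per the description, the restart case is covered separately by waiting for the lease duration before the first GC.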

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@yliang412 yliang412 marked this pull request as ready for review September 17, 2024 14:31
@yliang412 yliang412 requested a review from a team as a code owner September 17, 2024 14:32
@yliang412 yliang412 added the c/storage/pageserver Component: storage: pageserver label Sep 17, 2024
@koivunej koivunej changed the title from "pageserver: wait for lsn lease duration after transition into AttachedSignle" to "pageserver: wait for lsn lease duration after transition into AttachedSingle" Sep 17, 2024

github-actions bot commented Sep 17, 2024

4968 tests run: 4804 passed, 0 failed, 164 skipped (full report)


Flaky tests (5)

Postgres 17

  • test_ondemand_wal_download_in_replication_slot_funcs: release-x86-64

Postgres 16

Postgres 15

Postgres 14

Code coverage* (full report)

  • functions: 31.9% (7425 of 23298 functions)
  • lines: 49.9% (59745 of 119838 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
d1b2bb1 at 2024-09-19T15:58:13.456Z :recycle:

Signed-off-by: Yuchen Liang <[email protected]>
@problame
Contributor

(Removing myself from this review)

@problame problame removed their request for review September 18, 2024 13:33
Member

@koivunej koivunej left a comment

Thanks, I think this is looking good.

I am surprised how many gc sensitive tests we have ... but which were not sensitive to lsn lease initial wait before? I'll try to understand these more and possibly chat with you...

@yliang412 yliang412 enabled auto-merge (squash) September 19, 2024 15:05
@yliang412 yliang412 merged commit 1708743 into main Sep 19, 2024
78 checks passed
@yliang412 yliang412 deleted the yuchen/lsn-lease-attached-multi-to-single-safety branch September 19, 2024 16:27
Labels
c/storage/pageserver Component: storage: pageserver
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wait for the lease duration after transitioning to AttachedSingle, before doing any GC
3 participants