Stop relying on ChannelMonitor persistence after manager read #3322
base: main
Conversation
A bunch of tests are still trying to call
In aa09c33 we added a new secret in `ChannelManager` with which to derive inbound `PaymentId`s. We added read support for the new field, but forgot to add write support for it. Here we fix that oversight.
When we started tracking which channels had MPP parts claimed durably on-disk in their `ChannelMonitor`, we did so with a tuple. This was fine in that it was only ever accessed in two places, but as we will start tracking it through to the `ChannelMonitor`s themselves in the coming commit(s), it is useful to have it in a struct instead.
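The tuple-to-struct refactor described above can be sketched as follows. All names here (`MppClaimedPart`, `ChannelId`, the field names) are illustrative stand-ins, not LDK's actual types:

```rust
// Hypothetical before/after for the refactor described above. The names are
// illustrative only; they are not rust-lightning's real API.

#[derive(Clone, Debug, PartialEq)]
struct ChannelId([u8; 32]);

// Before: an anonymous tuple, readable only by position.
type ClaimedPartTuple = (ChannelId, u64); // (source channel, amount msat)

// After: a named struct, self-documenting at every access site, and easy to
// extend as the data starts flowing through to the `ChannelMonitor`s.
#[derive(Clone, Debug, PartialEq)]
struct MppClaimedPart {
    source_channel: ChannelId,
    amount_msat: u64,
}

impl From<ClaimedPartTuple> for MppClaimedPart {
    fn from((source_channel, amount_msat): ClaimedPartTuple) -> Self {
        MppClaimedPart { source_channel, amount_msat }
    }
}

fn main() {
    let tuple: ClaimedPartTuple = (ChannelId([0u8; 32]), 10_000);
    let part: MppClaimedPart = tuple.into();
    assert_eq!(part.amount_msat, 10_000);
}
```

The payoff is exactly the one the commit message cites: once the value travels through more than two call sites, positional access stops being self-explanatory.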
When we claim an MPP payment, then crash before persisting all the relevant `ChannelMonitor`s, we rely on the payment data being available in the `ChannelManager` on restart to re-claim any parts that haven't yet been claimed. This is fine as long as the `ChannelManager` was persisted before the `PaymentClaimable` event was processed, which is generally the case in our `lightning-background-processor`, but may not be in other cases or in a somewhat rare race. In order to fix this, we need to track where all the MPP parts of a payment are in the `ChannelMonitor`, allowing us to re-claim any missing pieces without reference to any `ChannelManager` data. Further, in order to properly generate a `PaymentClaimed` event against the re-started claim, we have to store various payment metadata with the HTLC list as well. Here we take the first step, building a list of MPP parts and metadata in `ChannelManager` and passing it through to `ChannelMonitor` in the `ChannelMonitorUpdate`s.
When we claim an MPP payment, then crash before persisting all the relevant `ChannelMonitor`s, we rely on the payment data being available in the `ChannelManager` on restart to re-claim any parts that haven't yet been claimed. This is fine as long as the `ChannelManager` was persisted before the `PaymentClaimable` event was processed, which is generally the case in our `lightning-background-processor`, but may not be in other cases or in a somewhat rare race. In order to fix this, we need to track where all the MPP parts of a payment are in the `ChannelMonitor`, allowing us to re-claim any missing pieces without reference to any `ChannelManager` data. Further, in order to properly generate a `PaymentClaimed` event against the re-started claim, we have to store various payment metadata with the HTLC list as well. Here we store the required MPP parts and metadata in `ChannelMonitor`s and make them available to `ChannelManager` on load.
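A minimal sketch of the idea in the two commits above: each monitor durably records the full set of MPP parts plus the payment metadata, so that on reload the manager can compute which parts were never claimed, without consulting its own (possibly stale) state. Every name here is illustrative, not LDK's real API:

```rust
// Illustrative-only sketch: a monitor-side record of MPP claim parts that
// survives a crash. None of these types are rust-lightning's actual API.
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct PaymentHash([u8; 32]);

#[derive(Clone, Debug)]
struct MppPartInfo {
    channel_id: [u8; 32],
    htlc_id: u64,
}

#[derive(Clone, Debug)]
struct ClaimInfo {
    // Metadata needed to regenerate a `PaymentClaimed`-style event later.
    amount_msat: u64,
    parts: Vec<MppPartInfo>,
}

#[derive(Default)]
struct MonitorState {
    // Payments this monitor has seen claimed, keyed by payment hash.
    claims: HashMap<PaymentHash, ClaimInfo>,
}

impl MonitorState {
    fn record_claim(&mut self, hash: PaymentHash, info: ClaimInfo) {
        self.claims.insert(hash, info);
    }

    /// On reload: parts of `hash` whose channel is NOT in `persisted_channels`
    /// still need their claim replayed.
    fn unclaimed_parts(&self, hash: &PaymentHash, persisted_channels: &[[u8; 32]]) -> Vec<MppPartInfo> {
        self.claims
            .get(hash)
            .map(|info| {
                info.parts
                    .iter()
                    .filter(|p| !persisted_channels.contains(&p.channel_id))
                    .cloned()
                    .collect()
            })
            .unwrap_or_default()
    }
}

fn main() {
    let mut monitor = MonitorState::default();
    let hash = PaymentHash([7u8; 32]);
    monitor.record_claim(hash.clone(), ClaimInfo {
        amount_msat: 50_000,
        parts: vec![
            MppPartInfo { channel_id: [1u8; 32], htlc_id: 0 },
            MppPartInfo { channel_id: [2u8; 32], htlc_id: 5 },
        ],
    });
    // Only channel [1; 32] persisted its claim before the simulated crash,
    // so the part on channel [2; 32] must be re-claimed.
    let missing = monitor.unclaimed_parts(&hash, &[[1u8; 32]]);
    assert_eq!(missing.len(), 1);
    assert_eq!(missing[0].htlc_id, 5);
}
```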
In a coming commit we'll use the existing `ChannelManager` claim flow to claim HTLCs which we found partially claimed on startup, necessitating having a full `ChannelManager` when we go to do so. Here we move the re-claim logic down in the `ChannelManager`-read logic so that we have that.
Here we wrap the logic which moves claimable payments from `claimable_payments` to `pending_claiming_payments` to a new utility function on `ClaimablePayments`. This will allow us to call this new logic during `ChannelManager` deserialization in a few commits.
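The shape of that extraction can be sketched roughly like this, assuming simplified stand-in types (the struct fields and method name below are hypothetical, not the real `ClaimablePayments` API):

```rust
// Sketch of wrapping the "move from claimable to pending-claiming" step in a
// reusable helper. Field and method names are illustrative, not LDK's.
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct PaymentHash(u64);

#[derive(Clone, Debug, PartialEq)]
struct ClaimablePayment { amount_msat: u64 }

#[derive(Default)]
struct ClaimablePayments {
    claimable_payments: HashMap<PaymentHash, ClaimablePayment>,
    pending_claiming_payments: HashMap<PaymentHash, ClaimablePayment>,
}

impl ClaimablePayments {
    /// Move a payment into the pending-claiming set, returning it if present.
    /// Both the normal claim path and deserialization-time replay can now
    /// share this single code path.
    fn begin_claiming(&mut self, hash: &PaymentHash) -> Option<ClaimablePayment> {
        let payment = self.claimable_payments.remove(hash)?;
        self.pending_claiming_payments.insert(hash.clone(), payment.clone());
        Some(payment)
    }
}

fn main() {
    let mut cp = ClaimablePayments::default();
    cp.claimable_payments.insert(PaymentHash(1), ClaimablePayment { amount_msat: 42 });
    let moved = cp.begin_claiming(&PaymentHash(1));
    assert!(moved.is_some());
    assert!(cp.claimable_payments.is_empty());
    assert_eq!(cp.pending_claiming_payments.len(), 1);
    // A second attempt is a no-op, which keeps replay idempotent.
    assert!(cp.begin_claiming(&PaymentHash(1)).is_none());
}
```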
In the next commit we'll start using (much of) the normal HTLC claim pipeline to replay payment claims on startup. In order to do so, however, we have to properly handle cases where we get a `DuplicateClaim` back from the channel for an inbound-payment HTLC. Here we do so, handling the `MonitorUpdateCompletionAction` and allowing an already-completed RAA blocker.
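The core of "properly handle `DuplicateClaim`" is treating both outcomes as success, since on replay a duplicate simply means the channel already recorded the preimage before the restart. A toy sketch (the enum variants mirror the source's terminology; the handler itself is hypothetical):

```rust
// Toy sketch of duplicate-claim tolerance during replay. The handler below is
// illustrative; only the Claimed/DuplicateClaim distinction comes from the PR.
#[derive(Debug, PartialEq)]
enum ClaimResult {
    /// The preimage was newly applied to the channel.
    Claimed,
    /// The channel had already seen this preimage; nothing new to persist.
    DuplicateClaim,
}

/// Replay-time handler: neither outcome is an error. The boolean tells the
/// caller whether completion actions (and an RAA blocker) still need to run,
/// or whether they were already completed before the restart.
fn handle_claim_result(res: ClaimResult) -> Result<bool, ()> {
    match res {
        ClaimResult::Claimed => Ok(true),
        ClaimResult::DuplicateClaim => Ok(false),
    }
}

fn main() {
    assert_eq!(handle_claim_result(ClaimResult::Claimed), Ok(true));
    assert_eq!(handle_claim_result(ClaimResult::DuplicateClaim), Ok(false));
}
```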
When we claim an MPP payment, then crash before persisting all the relevant `ChannelMonitor`s, we rely on the payment data being available in the `ChannelManager` on restart to re-claim any parts that haven't yet been claimed. This is fine as long as the `ChannelManager` was persisted before the `PaymentClaimable` event was processed, which is generally the case in our `lightning-background-processor`, but may not be in other cases or in a somewhat rare race. In order to fix this, we need to track where all the MPP parts of a payment are in the `ChannelMonitor`, allowing us to re-claim any missing pieces without reference to any `ChannelManager` data. Further, in order to properly generate a `PaymentClaimed` event against the re-started claim, we have to store various payment metadata with the HTLC list as well. Here we finally implement claiming using the new MPP part list and metadata stored in `ChannelMonitor`s. In doing so, we use much more of the existing HTLC-claiming pipeline in `ChannelManager`, utilizing the on-startup background events flow as well as properly re-applying the RAA-blockers to ensure preimages cannot be lost.
Ugh, sorry, fixed. Also rebased and fixed an issue introduced in #3303
98ecb3d to 5437927
When we discover we've only partially claimed an MPP HTLC during `ChannelManager` reading, we need to add the payment preimage to all other `ChannelMonitor`s that were a part of the payment. We previously did this with a direct call on the `ChannelMonitor`, requiring users write the full `ChannelMonitor` to disk to ensure that updated information made it. This adds quite a bit of delay during initial startup - fully resilvering each `ChannelMonitor` just to handle this one case is incredibly excessive. Over the past few commits we dropped the need to pass HTLCs directly to the `ChannelMonitor`s, using the background events to provide `ChannelMonitorUpdate`s instead. Thus, here we finally drop the requirement to resilver `ChannelMonitor`s on startup.
Because the new startup `ChannelMonitor` persistence semantics rely on new information stored in `ChannelMonitor`s only for claims made by the upgraded code, users upgrading from previous versions of LDK must apply the old `ChannelMonitor` persistence semantics at least once (as the old code will have been used to handle any partial claims).
5437927 to b0fa756
Codecov Report
Attention: Patch coverage is

```
@@            Coverage Diff             @@
##             main    #3322      +/-   ##
==========================================
- Coverage   89.68%   89.66%   -0.02%
==========================================
  Files         126      126
  Lines      103168   103370     +202
==========================================
+ Hits        92522    92686     +164
- Misses       7934     7971      +37
- Partials     2712     2713       +1
==========================================
```
Tagging 0.1 because of the first commit.
When we discover we've only partially claimed an MPP HTLC during `ChannelManager` reading, we need to add the payment preimage to all other `ChannelMonitor`s that were a part of the payment. We previously did this with a direct call on the `ChannelMonitor`, requiring users write the full `ChannelMonitor` to disk to ensure that updated information made it. This adds quite a bit of delay during initial startup - fully resilvering each `ChannelMonitor` just to handle this one case is incredibly excessive.

Instead, we rewrite the MPP claim replay logic to use only (new) data included in `ChannelMonitor`s, which has a nice side-effect of teeing up future `ChannelManager`-non-persistence features as well as making our `PaymentClaimed` event generation much more robust.