-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Domains: Move trace root capture from runtime to client for efficiency #3338
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to check something that could impact performance (and correctness). Please feel free to ignore my review if it’s not ready yet!
Ok(Some(v)) | ||
} else { | ||
Ok(self.backend.get(key, prefix)?) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we missing a “tombstone” check here?
If a key is present in the backend, but deleted in the overlay, this method returns Some(old)
- the old value from the backend. But the correct value would be None
, because the key has been deleted in the overlay, and will be deleted in the backend once consolidated.
(If we never delete any keys, or we always “delete” by overwriting with another value, then there’s no bug here. But I just wanted to check.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question.
We are not missing any tombstone checks here. If the key is deleted, it will have a default value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the key is deleted, it will have a default value.
Just to be sure you are not talking about ValueQuery
here, right? because ValueQuery/OptionQuery
just define returning T::default()
or None
if the key-value does no exist in the state, it doesn't mean actually storing a T::default()
in the state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So just to confirm my understanding here: if a key is deleted in the overlay, the overlay will return Some(default)
, and we’ll never get to the backend returning Some(old)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. If the key is written(new/overwrite/delete) during the execution, then the memory DB will have the key present. If the value was not written to, then overlay memory db will not have it and we fallback to supporting backend
As a quick note: this is not new code and has been in the in the codebase for quite sometime powering our invalid state transaition Fraud proof :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I noticed the code had been moved 🙂
So if the key is deleted in the overlay, it will be present and set to what value?
(It can’t be None
, because then we’d ignore the deletion, and return the outdated value from the backend.)
Or should we be using contains()
in the if
condition, and get()
to retrieve the value (which can be None
for keys deleted in the overlay)?
That would reliably find deleted keys, if they hadn’t already been purged.
Either way, do we have a test for this behaviour?
I couldn’t find it documented clearly in the TrieBackendStorage
API reference, but they do say to use contains()
(and not get()
) to check for existence:
https://docs.rs/memory-db/0.32.0/memory_db/struct.MemoryDB.html#
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn’t find it documented clearly in the TrieBackendStorage API reference, but they do say to use contains() (and not get()) to check for existence:
Ideally does not matter since get returns an optional value.
we do have implicit tests for this through XDM and pallet domains(not sure yet.)
I did test locally sometime back creating a pallet specifically meant to check the overlay deletion and addition. These never made it. You can actually test this by creating a few storage types and update and check them during the block hooks, initialize and finalize.
This will clearly shows if the deletion works
52d6c2e
to
d2a9284
Compare
…dApi. These are essentially same roots as tested before the commit
d2a9284
to
fa4fce9
Compare
Benchmark results until this point
|
Benchmark results until this point
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me, but I just want to double-check a few things. It would be great to have another review, too!
@@ -28,9 +28,6 @@ sp_api::decl_runtime_apis! { | |||
tx_range: &U256, | |||
) -> bool; | |||
|
|||
/// Returns the intermediate storage roots in an encoded form. | |||
fn intermediate_roots() -> Vec<[u8; 32]>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to bump the API version when removing a method?
I think this means old client code will refuse to run with the new runtime due to a missing function, but that’s what we want, right? We want users to upgrade their nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we do client upgrades first before runtime releases. So all the clients will use client side intermediate roots and once runtime is upgraded the function completely disappears.
I still like to check running the client with new change on taurus to ensure both give same roots. I have verified these locally but verifying on taurus is never a bad thing :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great questions @teor2345 !
I trying to still optimise further and will spend more time today on this. In mean time feel free to review and run these changes locally if you prefer :)
@@ -28,9 +28,6 @@ sp_api::decl_runtime_apis! { | |||
tx_range: &U256, | |||
) -> bool; | |||
|
|||
/// Returns the intermediate storage roots in an encoded form. | |||
fn intermediate_roots() -> Vec<[u8; 32]>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we do client upgrades first before runtime releases. So all the clients will use client side intermediate roots and once runtime is upgraded the function completely disappears.
I still like to check running the client with new change on taurus to ensure both give same roots. I have verified these locally but verifying on taurus is never a bad thing :)
This is good a review now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to keep going on about deleted keys, but the only docs I could find suggested using a different method to check for existence.
Also it seems like the new timer calls could impact our benchmarks on some platforms and architectures.
Ok(Some(v)) | ||
} else { | ||
Ok(self.backend.get(key, prefix)?) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I noticed the code had been moved 🙂
So if the key is deleted in the overlay, it will be present and set to what value?
(It can’t be None
, because then we’d ignore the deletion, and return the outdated value from the backend.)
Or should we be using contains()
in the if
condition, and get()
to retrieve the value (which can be None
for keys deleted in the overlay)?
That would reliably find deleted keys, if they hadn’t already been purged.
Either way, do we have a test for this behaviour?
I couldn’t find it documented clearly in the TrieBackendStorage
API reference, but they do say to use contains()
(and not get()
) to check for existence:
https://docs.rs/memory-db/0.32.0/memory_db/struct.MemoryDB.html#
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look good to me, as long as we're sure about layers and deletion
Fair. Added a test pallet that ensures this check and fails on any incorrect changes going forward @teor2345 |
Thanks for the solution @NingLin-P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks good!
I would like to get this in sooner. Would prefer another set of eyes on this @NingLin-P |
@NingLin-P what is blocking a review on this PR? As for the events and other large storage items, they are, imo, unrelated to this and can be pulled from fork or upstream at a later point. Let's get this in asap.Thanks! |
Signed-off-by: linning <[email protected]>
Add test for the custom runtime api
We have been capturing the trace roots in the Domain runtime before executing the next trasaction. This became problematic during our benchmark tests where calculating storage root is essentially time taking process since it does the calculate the intermediate roots from full set of changes from beginning instead of from the previous root. More on this here
This PR essentially moves capturing the trace roots by creating triebackend over uncommiteed changes to ensure the diff set is minimal during the state root calculation.
I have also took the opportunity to clean up a TODO in the Fraud proof where post state root was returned from the Domain runtime. This can be calculated on verifier from the overlay changes.Update: unfortunately, I found an interesting deviation that needs more research. Will provide a new PR when that TODO can be fixed and it does not relates to this PR anyway
Code contributor checklist: