Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: full memtrie logic for range retain #12130

Merged
merged 4 commits into from
Sep 24, 2024

Conversation

Longarithm
Copy link
Member

@Longarithm Longarithm commented Sep 23, 2024

Next step on #12074.
Supporting all cases I came up with where nodes restructuring is required.
Surprisingly, squash_node is enough to call. I only needed to implement trivial cases which weren't possible for single key deletion.
I implemented the tests first, and majority of them failed before changing the logic.
Each test is comparing naive approach with retain_multi_range.

Notes

  • I'm a bit scared that I didn't realise the need to squash Extension before. Fortunately, squash_node handles that, but if you feel some cases are not covered here, feel free to post suggestions!
  • Reused + copypasted some Robin' tooling to generate interesting nodes conversions.
  • Note that test for reading "extra" child node is not required because we always read all children.

Next steps

  • A bit more testing
  • Similar logic for partial trie
  • Generating intervals needed for resharding
  • Use that to implement shard switch on chain

@Longarithm Longarithm requested a review from a team as a code owner September 23, 2024 20:29
@Longarithm Longarithm changed the title feat: support squashing nodes for range retain feat: full memtrie logic for range retain Sep 23, 2024
Copy link

codecov bot commented Sep 23, 2024

Codecov Report

Attention: Patch coverage is 99.37500% with 2 lines in your changes missing coverage. Please review.

Project coverage is 71.57%. Comparing base (a033ff6) to head (d02a339).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
core/store/src/trie/mem/resharding.rs 99.54% 1 Missing ⚠️
core/store/src/trie/mem/updating.rs 98.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12130      +/-   ##
==========================================
+ Coverage   71.52%   71.57%   +0.05%     
==========================================
  Files         820      821       +1     
  Lines      165091   165290     +199     
  Branches   165091   165290     +199     
==========================================
+ Hits       118080   118307     +227     
+ Misses      41870    41849      -21     
+ Partials     5141     5134       -7     
Flag Coverage Δ
backward-compatibility 0.17% <0.00%> (ø)
db-migration 0.17% <0.00%> (ø)
genesis-check 1.26% <0.00%> (-0.01%) ⬇️
integration-tests 38.70% <14.68%> (-0.02%) ⬇️
linux 71.35% <99.37%> (+0.03%) ⬆️
linux-nightly 71.16% <99.37%> (+0.06%) ⬆️
macos 54.03% <99.37%> (+0.31%) ⬆️
pytests 1.52% <0.00%> (-0.01%) ⬇️
sanity-checks 1.32% <0.00%> (-0.01%) ⬇️
unittests 65.30% <99.37%> (+0.05%) ⬆️
upgradability 0.21% <0.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@Trisfald Trisfald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@wacban wacban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

It's very elegant, I like it a lot that we can just squash at the end.

I didn't check all of the tests but the following transitions seem possible and test worthy:
empty -> empty
leaf -> empty / leaf
extension -> empty / leaf / extension / branch
branch -> empty / leaf / extension / branch

return vec![];
}
assert!(hex.len() % 2 == 0);
hex::decode(hex).unwrap()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that right? Hex decodes to bytes but nibbles are half-bytes. I do realize you only moved it and it's probably ok so I'm mostly curious where is the gap in my understanding.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Robin's assumption was that we can only write actual bytes and we can't write key with odd number of nibbles. Maybe trie internals shouldn't be aware of that; on the other hand, vectors of even lengths cover all cases well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting the bytes to be further split down into nibbles, so something like this:

hex::decode(hex).unwrap().iter().map(|byte| vec![byte & 0xf0 >> 4, byte & 0x0f]).collect_vec().concat()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, each string symbol represents a readable nibble (0..9, a..f). I'll add comment

hex::decode(hex).unwrap()
}

pub(crate) fn all_nibbles(hexes: &str) -> Vec<Vec<u8>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: how about something like multi_hex_to_nibbles?

hexes.split_whitespace().map(|x| hex_to_nibbles(x)).collect()
}

pub(crate) fn full_16ary_tree_keys() -> Vec<Vec<u8>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: How about all_two_nibble_nibbles or all_eight_bit_nibbles?

Comment on lines 141 to 147
self.place_node(
node_id,
UpdatedMemTrieNode::Extension {
extension,
child: OldOrUpdatedNodeId::Updated(new_child_id),
},
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think it would be more readable multilined

{
*child = None;
}
pub(crate) fn squash_node(&mut self, node_id: UpdatedMemTrieNodeId) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the comment please?

}
}

self.squash_node(node_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The correctness of squashing depends on it being called on nodes in the correct order.

sanity check: Is it called in the right order here?

nit: Can you add a comment about it for future us?

let (mut trie_changes, _) = update.retain_multi_range(&[retain_range]);
let memtrie_changes = trie_changes.mem_trie_changes.take().unwrap();
let new_state_root = memtries.apply_memtrie_changes(1, &memtrie_changes);
fn check_random(max_key_len: usize, max_keys_count: usize, test_count: usize) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

@Longarithm Longarithm added this pull request to the merge queue Sep 24, 2024
Merged via the queue into near:master with commit 93c5575 Sep 24, 2024
28 of 29 checks passed
@Longarithm Longarithm deleted the rr-squash-branch branch September 24, 2024 13:55
@Longarithm
Copy link
Member Author

Sanity check:

empty -> empty

checked everywhere if at least 1 node is removed

leaf -> empty / leaf

one of the simplest, test_retain_single_range. depends if leaf is retained or not

extension -> empty

test_descend_and_remove

extension -> leaf

test_extend_extensions

extension -> extension

actually was tricky.
squashing happens in test_prefixes
there is a sequence ext -> branch -> ext, branch is removed, two exts are merged.

branch -> empty

test_descend_and_remove

branch -> leaf

test_branch_to_leaf

branch -> extension

test_branch_to_extension

branch -> branch

many of them, e.g. test_full_16ary

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants