feat: full memtrie logic for range retain #12130

Merged 4 commits on Sep 24, 2024

64 changes: 17 additions & 47 deletions core/store/src/trie/mem/loading.rs
@@ -195,6 +195,7 @@ mod tests {
     };
     use crate::trie::mem::loading::load_trie_from_flat_state;
     use crate::trie::mem::lookup::memtrie_lookup;
+    use crate::trie::mem::nibbles_utils::{all_two_nibble_nibbles, multi_hex_to_nibbles};
     use crate::{DBCol, KeyLookupMode, NibbleSlice, ShardTries, Store, Trie, TrieUpdate};
     use near_primitives::congestion_info::CongestionInfo;
     use near_primitives::hash::CryptoHash;
@@ -300,80 +301,49 @@ mod tests {
         check_maybe_parallelize(keys, true);
     }
 
-    fn nibbles(hex: &str) -> Vec<u8> {
-        if hex == "_" {
-            return vec![];
-        }
-        assert!(hex.len() % 2 == 0);
-        hex::decode(hex).unwrap()
-    }
-
-    fn all_nibbles(hexes: &str) -> Vec<Vec<u8>> {
-        hexes.split_whitespace().map(|x| nibbles(x)).collect()
-    }
-
     #[test]
     fn test_memtrie_empty() {
         check(vec![]);
     }
 
     #[test]
     fn test_memtrie_root_is_leaf() {
-        check(all_nibbles("_"));
-        check(all_nibbles("00"));
-        check(all_nibbles("01"));
-        check(all_nibbles("ff"));
-        check(all_nibbles("0123456789abcdef"));
+        check(multi_hex_to_nibbles("_"));
+        check(multi_hex_to_nibbles("00"));
+        check(multi_hex_to_nibbles("01"));
+        check(multi_hex_to_nibbles("ff"));
+        check(multi_hex_to_nibbles("0123456789abcdef"));
     }
 
     #[test]
     fn test_memtrie_root_is_extension() {
-        check(all_nibbles("1234 13 14"));
-        check(all_nibbles("12345678 1234abcd"));
+        check(multi_hex_to_nibbles("1234 13 14"));
+        check(multi_hex_to_nibbles("12345678 1234abcd"));
     }
 
     #[test]
     fn test_memtrie_root_is_branch() {
-        check(all_nibbles("11 22"));
-        check(all_nibbles("12345678 22345678 32345678"));
-        check(all_nibbles("11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff"));
+        check(multi_hex_to_nibbles("11 22"));
+        check(multi_hex_to_nibbles("12345678 22345678 32345678"));
+        check(multi_hex_to_nibbles("11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff"));
     }
 
     #[test]
     fn test_memtrie_root_is_branch_with_value() {
-        check(all_nibbles("_ 11"));
+        check(multi_hex_to_nibbles("_ 11"));
     }
 
     #[test]
     fn test_memtrie_prefix_patterns() {
-        check(all_nibbles("10 21 2210 2221 222210 222221 22222210 22222221"));
-        check(all_nibbles("11111112 11111120 111112 111120 1112 1120 12 20"));
-        check(all_nibbles("11 1111 111111 11111111 1111111111 111111111111"));
-        check(all_nibbles("_ 11 1111 111111 11111111 1111111111 111111111111"));
+        check(multi_hex_to_nibbles("10 21 2210 2221 222210 222221 22222210 22222221"));
+        check(multi_hex_to_nibbles("11111112 11111120 111112 111120 1112 1120 12 20"));
+        check(multi_hex_to_nibbles("11 1111 111111 11111111 1111111111 111111111111"));
+        check(multi_hex_to_nibbles("_ 11 1111 111111 11111111 1111111111 111111111111"));
     }
 
     #[test]
     fn test_full_16ary_trees() {
-        check(all_nibbles(
-            "
-            00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
-            10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
-            20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f
-            30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f
-            40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f
-            50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f
-            60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f
-            70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f
-            80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f
-            90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f
-            a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af
-            b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf
-            c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf
-            d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df
-            e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef
-            f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe ff
-            ",
-        ))
+        check(all_two_nibble_nibbles())
     }
 
     #[test]
2 changes: 2 additions & 0 deletions core/store/src/trie/mem/mod.rs
@@ -7,6 +7,8 @@ pub mod loading;
 mod lookup;
 pub mod mem_tries;
 pub mod metrics;
+#[cfg(test)]
+pub(crate) mod nibbles_utils;
 pub mod node;
 mod parallel_loader;
 pub mod resharding;
46 changes: 46 additions & 0 deletions core/store/src/trie/mem/nibbles_utils.rs
@@ -0,0 +1,46 @@
+/// Utilities for generating vectors of nibbles from human-readable strings.
+///
+/// Input for a single vector is a hex string, e.g. 5da3593f.
+/// It has even length, as tries support only keys in bytes, thus keys of
+/// odd nibble length do not occur.
+/// Each symbol is interpreted as a nibble (half-byte).
+/// Result is a vector of decoded hexes as nibbles, e.g.
+/// [5, 13, 10, 3, 5, 9, 3, 15].
+
+pub(crate) fn hex_to_nibbles(hex: &str) -> Vec<u8> {
+    if hex == "_" {
+        return vec![];
+    }
+    assert!(hex.len() % 2 == 0);
+    hex::decode(hex).unwrap()
Contributor

Is that right? Hex decodes to bytes, but nibbles are half-bytes. I realize you only moved it and it's probably OK, so I'm mostly curious where the gap in my understanding is.

Member Author

I think Robin's assumption was that we can only write actual bytes, so we can't write a key with an odd number of nibbles. Maybe trie internals shouldn't be aware of that; on the other hand, vectors of even length cover all cases well.

Contributor

I was expecting the bytes to be further split down into nibbles, so something like this:

hex::decode(hex).unwrap().iter().map(|byte| vec![(byte & 0xf0) >> 4, byte & 0x0f]).collect_vec().concat()

Member Author

No, each string symbol represents a readable nibble (0..9, a..f). I'll add a comment.

+}
+
+/// Converts a string of hex strings separated by whitespaces into a vector of
+/// vectors of nibbles. For example, "01 02 10" is converted to
+/// [[0, 1], [0, 2], [1, 0]].
+pub(crate) fn multi_hex_to_nibbles(hexes: &str) -> Vec<Vec<u8>> {
+    hexes.split_whitespace().map(|x| hex_to_nibbles(x)).collect()
+}
+
+pub(crate) fn all_two_nibble_nibbles() -> Vec<Vec<u8>> {
+    multi_hex_to_nibbles(
+        "
+        00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
+        10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
+        20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f
+        30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f
+        40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f
+        50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f
+        60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f
+        70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f
+        80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f
+        90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f
+        a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af
+        b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf
+        c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf
+        d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df
+        e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef
+        f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe ff
+        ",
+    )
+}