Skip to content

Move freely on the entries of a grenad file

Latest
Compare
Choose a tag to compare
@Kerollmops Kerollmops released this 12 Oct 11:39
· 78 commits to main since this release
13b9c94

This version v0.4.0 brings a new era for the grenad crate: The ability to move on the keys of a grenad file.

Grenad is a library that provides tools to sort, merge, write, and read immutable key-value pairs. It is highly inspired by the SSTable format of RocksDB and by the MTBL c library.

let mut writer = Writer::memory();

// We insert our key-value pairs in lexicographic order.
writer.insert("first-counter", 119_u32.to_ne_bytes())?;
writer.insert("second-counter", 384_u32.to_ne_bytes())?;

// We create a reader from our writer.
let cursor = writer.into_inner().map(Cursor::new)?;
let mut cursor = Reader::new(cursor)?.into_cursor()?;

// We can see that the sum of u32s is valid here.
assert_eq!(cursor.move_on_next()?, Some((&b"first-counter"[..], &119_u32.to_ne_bytes()[..])));
assert_eq!(cursor.move_on_next()?, Some((&b"second-counter"[..], &384_u32.to_ne_bytes()[..])));
assert_eq!(cursor.move_on_next()?, None);

// We can also jump on any given entry.
assert_eq!(cursor.move_on_key_greater_than_or_equal_to("first")?, Some((&b"first-counter"[..], &119_u32.to_ne_bytes()[..])));
assert_eq!(cursor.move_on_key_equal_to("second-counter")?, Some((&b"second-counter"[..], &384_u32.to_ne_bytes()[..])));
assert_eq!(cursor.move_on_key_lower_than_or_equal_to("abracadabra")?, None);

It is now possible to use the ReaderCursor methods:

  • move_on_first: to move on the first entry of a grenad file.
  • move_on_last: to move on the last entry of a grenad file.
  • move_on_next: to move on the entry that is lying after the currently pointed one. Automatically moves on the first key when the cursor is uninitialized.
  • move_on_prev: to move on the entry that is lying before the currently pointed one. Automatically moves on the last key when the cursor is uninitialized.
  • move_on_key_lower_than_or_equal_to: to move on an entry that is either equal or lower than the given one. Returns None if no entry validates these predicates.
  • move_on_key_greater_than_or_equal_to: to move on an entry that is either equal or greater than the given one. Returns None if no entry validates these predicates.
  • move_on_key_equal_to: to move on an entry that is exactly equal to the given key. Returns None if no entry validates these predicates.

Along with these changes, we bring a small but, yet, useful method to get the exact number of entries in a grenad file.

let mut writer = Writer::memory();

// We insert our key-value pairs in lexicographic order.
writer.insert("aaa-cat-counter", 121_u32.to_ne_bytes())?;
writer.insert("aaa-cow-counter", 456_u32.to_ne_bytes())?;
writer.insert("aaa-dog-counter", 101_u32.to_ne_bytes())?;
writer.insert("aaa-horse-counter", 001_u32.to_ne_bytes())?;
writer.insert("bbb-turtoise-counter", 121_u32.to_ne_bytes())?;

// We create a reader from our writer.
let cursor = writer.into_inner().map(Cursor::new)?;
let mut prefix_iter = Reader::new(cursor)?.into_prefix_iter(b"aaa".to_vec()])?;

// We can see that the sum of u32s is valid here.
assert_eq!(cursor.next()?, Some((&b"aaa-cat-counter"[..], &121_u32.to_ne_bytes()[..])));
assert_eq!(cursor.next()?, Some((&b"aaa-cow-counter"[..], &456_u32.to_ne_bytes()[..])));
assert_eq!(cursor.next()?, Some((&b"aaa-dog-counter"[..], &101_u32.to_ne_bytes()[..])));
assert_eq!(cursor.next()?, Some((&b"aaa-horse-counter"[..], &001_u32.to_ne_bytes()[..])));
assert_eq!(cursor.next()?, None);

The ReaderCursor struct is very useful to move freely between the entries of a grenad file,
but we also bring a set of iterators to help us get the job done:

  • PrefixIter: let you iterate on entry where the keys start with a given prefix.
  • RevPrefixIter: same as the PrefixIter but moves from the last key that starts with the given prefix to the first one.
  • RangeIter: let you iterate on keys in a given range defined by the Rangebounds trait.
  • RevRangeIter: same as the RangeIter but in reverse order.

Non-idiomatic Iterators and Cursors

As grenad uses a compression and decompression system and we prefer keeping good performances we can't use the idiomatic Iterator trait for the same reason that we can't declare a Lines iterator that yields &str and that this kind of iterator must return owned values.

The grenad ReaderCursor returns &[u8] for the key and values that are views into the internal buffer of the cursor. When the GAT's are released we will be able to maybe declare our iterators by using something like a StreamingIterator trait or something similar.