Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor write buffer in lookupsIO #397

Merged
merged 1 commit into from
Sep 30, 2024
Merged

Conversation

dcoutts
Copy link
Collaborator

@dcoutts dcoutts commented Sep 24, 2024

See the individual patches for more detailed descriptions.

Copy link
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some small suggestions, the rest looks good

src/Database/LSMTree/Internal/WriteBufferBlobs.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Normal.hs Show resolved Hide resolved
test/Test/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
test/Test/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
test/Test/Database/LSMTree/Internal.hs Outdated Show resolved Hide resolved
Previously lookupsIO dealt only with lookups in runs, and combining
results from the write buffer was done in D.L.Internal. Originally that
was ok because combining the write buffer results was simple, but it has
grown more complex with the blobs feature being properly enabled.

So now we pass it down into lookupsIO, specifically into
intraPageLookups where all the results are combined. It works nicely
here, since per-key results are accumulated into a mutable vector, so
the only change is to initialise that array with the write buffer
lookups. All the lookup results from runs are then accumulated on top.
This should also mildly reduce allocations.

In addition, instead of D.L.Internal.lookups getting passed in a
function for converting into the final result, just return the
intermediate representation and have the final conversion done in the
API wrappers. This approach is more friendly for inlining and
specialisation (so the deserialisation can be specialised at the call
site).

The benchmarks are updated but use an empty write buffer. Plausibly we
might want some micro benchmarks for lookupIO et al that do make use of
the write buffer too.
@dcoutts dcoutts force-pushed the dcoutts/lookupsIO-writebuffer branch from e1b27e8 to 3e22f31 Compare September 30, 2024 10:44
@dcoutts dcoutts force-pushed the dcoutts/lookupsIO-writebuffer branch from 3e22f31 to af0beab Compare September 30, 2024 12:58
@dcoutts dcoutts changed the title Refactor write buffer in lookupsIO and enable blob comparison in lookup tests Refactor write buffer in lookupsIO Sep 30, 2024
@dcoutts
Copy link
Collaborator Author

dcoutts commented Sep 30, 2024

I've removed the patch for enabling blobs in various tests since it doesn't need to be in this PR at all, and some of them need fixing. I'll open a separate PR for those.

@dcoutts dcoutts added this pull request to the merge queue Sep 30, 2024
Merged via the queue into main with commit 25ced18 Sep 30, 2024
24 checks passed
@dcoutts dcoutts deleted the dcoutts/lookupsIO-writebuffer branch September 30, 2024 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants