Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Perf] Only use raw iterators with RocksDB and speed up ledger load #2561

Merged
merged 4 commits into from
Nov 4, 2024

Conversation

vicsn
Copy link
Contributor

@vicsn vicsn commented Oct 22, 2024

Motivation

Replaces #2518

Original message:

The signatures of iteration-related methods (with the rocksdb feature) have been haunting my heap profiles for a long time now, and while I initially believed this was something we could fine-tune with some configuration, all such attempts were unsatisfactory. It turns out that the "default" (higher-level) RocksDB iterators are just inherently inefficient, and only the DBRawIterator is able to avoid a massive number of allocations, some of which may remain lingering in the RSS and contribute to its inflation over time. The raw iterator is somewhat trickier to use correctly, but since it's what the regular iterator is built upon, there is nothing particularly novel about it, and we're already using it to count records.

With these changes I ran profiling with both an instrumented OS allocator and jemalloc and - contrary to the usual results - it was the latter that experienced larger relative differences when loading the ledger:

total allocs and temp allocs were both reduced by 98%
ledger load time was down 14%
max RSS was reduced by 13%

In addition, heaptrack ran 73% faster and produced a 99% smaller profile, which is very practical for future profiling needs.

Alongside these changes I realized that the current len_confirmed method - while working correctly and passing all tests - can be made more solid by including iterator validity checks; the only situation where this would truly be needed is if it was called on the very last map in the database and if the map was empty, but this commit guards against that edge case.

The final commit is a further improvement on one of the changes from #2515 (and made possible with the switch to raw iterators).

Test Plan

CI run link

Copy link
Collaborator

@zkxuerb zkxuerb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👏

Copy link
Collaborator

@evanmarshall evanmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

One potential concern about data consistency and unexpected mutations while the database is running. Should be we iterating over snapshots to ensure that concurrent writes don't potentially cause rare & non-deterministic errors that are difficult to reproduce?

Snapshots should be pretty lightweight overall and as long as we ensure it falls out of scope after it's used, it may be a good way to avoid potential errors from this performance optimization.

@alzger alzger merged commit 460f637 into AleoNet:staging Nov 4, 2024
77 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants