WIP Snapshot Load Batch Queue #434

Draft: wants to merge 10 commits into base: dev
Conversation

@to11mtm (Member) commented Aug 11, 2024

This will not fix #432, but it will probably help on some level.

Changes

This PR changes the default snapshot store's load operation for the 'newest' snapshot (i.e. a load with no upper bound on sequence number or timestamp) from being a multitude of individual snapshot requests into a queue, similar to how we handle write requests on journals.

Checklist

For significant changes, please ensure that the following have been completed (delete if not relevant):

Latest dev Benchmarks

We don't really have benchmarks against the snapshot store. If someone wants to add some, I'm happy to rebase/etc. and test.

This PR's Benchmarks

See above.

Stuff I still need to validate:

The main difference, of course, is that this is a -read- and more complex than a sequence-number lookup [0], so I did take the -small- step of splitting deserialization out of the main read logic as a separate stage.

This may have a minor performance impact in cases where only a small number of actors are recovering; however, it should greatly improve recovery performance under load. The overall intent is to have the numbers be configurable for a given case (maybe we have a fallback switch for those who want the old behavior?).

[0] - Hint hint ;)

@to11mtm (Member, Author) commented Aug 11, 2024

This doesn't pipeline writes, but it does pipeline reads which...

  1. Is very useful for performance in the real world. Alas, writing a benchmark for truly 'contentious' states is not fun... but similar to how write batching helps there, this -will- help via read batching; it's mostly a matter of right-sizing the stream params for your overall case (i.e. latency vs throughput depends on the size and number of snapshots). The big thing here is we may not be faster in all cases, but we should be more stable, and still tunable to be faster than before in almost all cases.

tl;dr - this is a change to reads that will most likely help overall perf in a real-world system.

  2. Is an example of how one -could- take the pattern further for other reads (e.g. expand this concept to other snapshot queries, or perhaps to journal sequence-number lookups; I would say 'other reads' in general, but general journal reads are tricky for lots of reasons...).

tl;dr - we can apply this pattern, or a variant of it, to other reads, and it will help with performance under load.

  3. Is admittedly not optimized for all SQL cases, i.e. depending on dialect there are better ways to 'do' this, and they aren't that ugly/hard to wire up. I didn't do them because I am trying to get better about not overdoing first revisions. >_< But I know PG, for instance, can do -way- better using DISTINCT ON, see https://stackoverflow.com/a/34715134. Also, the 'what do I need to Option.None' logic could use improvement.

tl;dr - this PR is a basic implementation and may require some optimization to do the 'best thing' for each database, or in general.

@to11mtm (Member, Author) left a comment

Added my self-comments, just in case I get bussed/whatever.

Comment on lines +26 to +33

    public static class SubFlowExtensions
    {
        public static Source<TOut, TMat> MergeSubStreamsAsSource<TOut, TMat, TClosed>(
            this SubFlow<TOut, TMat, TClosed> subFlow)
        {
            return (Source<TOut, TMat>)subFlow.MergeSubstreams();
        }
    }

Happy for feedback on this method, but TBH it would be nice if some form of it (even if it's not returning Source<TOut, TMat>) were put into Akka.Streams proper. It's really useful.
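For context, a rough usage sketch of why this is handy (illustrative only; this pipeline is not part of the PR):

    // Illustrative only; assumes: using System.Linq; using Akka.Streams.Dsl;
    // MergeSubStreamsAsSource hands back a typed Source after GroupBy, avoiding
    // the cast of MergeSubstreams() at every call site.
    var merged = Source.From(Enumerable.Range(1, 100))
        .GroupBy(8, i => i % 8)            // SubFlow<int, NotUsed, ...>
        .Select(i => i * 2)
        .MergeSubStreamsAsSource();        // Source<int, NotUsed>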

Comment on lines +47 to +64

    public readonly record struct SnapshotReadGroup
    {
        public SnapshotReadGroup(QueryLatestSnapSet a, List<LongSnapshotRow> b, Exception? err)
        {
            this.a = a;
            this.b = b;
            this.err = err;
        }

        public QueryLatestSnapSet a { get; }
        public List<LongSnapshotRow> b { get; }
        public Exception? err { get; }

        public void Deconstruct(out QueryLatestSnapSet a, out List<LongSnapshotRow> b, out Exception? err)
        {
            a = this.a;
            b = this.b;
            err = this.err;
        }
    }

Needs cleanup; also not sure whether a struct is actually a good idea here or not :)
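One possible cleaned-up shape, as a sketch only (a positional record generates the constructor and Deconstruct; the property names below are illustrative and not part of this PR):

    // Sketch: positional record struct with illustrative names; not the PR's code.
    public readonly record struct SnapshotReadGroup(
        QueryLatestSnapSet Requests,
        List<LongSnapshotRow> Rows,
        Exception? Error);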

Comment on lines 66 to 78

    public class QueryLatestSnapSet
    {
        public readonly Dictionary<string, List<TaskCompletionSource<Option<SelectedSnapshot>>>> Entries = new();

        public void Add(LatestSnapRequestEntry entry)
        {
            if (Entries.TryGetValue(entry.PersistenceId, out var item) == false)
            {
                item = Entries[entry.PersistenceId] = new List<TaskCompletionSource<Option<SelectedSnapshot>>>();
            }

            item.Add(entry.TCS);
        }
    }

May or may not be better to return this from .Add() for cosmetics/fluent chaining, etc.
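i.e. something like this sketch (not in the PR):

    // Sketch: returning the set from Add() would allow fluent chaining at call sites.
    public QueryLatestSnapSet Add(LatestSnapRequestEntry entry)
    {
        if (!Entries.TryGetValue(entry.PersistenceId, out var item))
            item = Entries[entry.PersistenceId] = new List<TaskCompletionSource<Option<SelectedSnapshot>>>();
        item.Add(entry.TCS);
        return this;
    }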

Comment on lines 111 to 112
int maxSubStreamsForReads = 8; // TODO: Configurable
int maxRequestsPerBatch = 50;

These should be configurable
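For example, something along these lines, assuming access to the snapshot store's Akka.Configuration Config instance (the config keys below are hypothetical and not defined by this PR):

    // Hypothetical config keys; illustrative only.
    var maxSubStreamsForReads = config.GetInt("read-batch-max-substreams", 8);
    var maxRequestsPerBatch = config.GetInt("read-batch-max-requests", 50);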

    .GroupBy(maxSubStreamsForReads, a => a.PersistenceId.GetHashCode() % maxSubStreamsForReads)
    .BatchWeighted(
        maxRequestsPerBatch,
        a => 1,

I know in ShotGlass I had/have a Batcher that could assign a 'zero weight' for cases where a persistence ID was already queued...

FBOW, in this first go I instead decided a flat weight of 1 would be a lot less 'change', but it also has -some- benefits for forward progress (really mostly because a custom stream stage felt like overkill for the first intro of this).
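Roughly, the flat-weight batching amounts to something like this (a sketch of the shape; the seed/aggregate lambdas are my illustration, not necessarily the PR's exact code):

    // Sketch: every request costs 1, and requests accumulate into a QueryLatestSnapSet
    // until maxRequestsPerBatch is reached or downstream is ready to pull.
    .GroupBy(maxSubStreamsForReads, a => a.PersistenceId.GetHashCode() % maxSubStreamsForReads)
    .BatchWeighted(
        maxRequestsPerBatch,
        _ => 1L,
        e => { var set = new QueryLatestSnapSet(); set.Add(e); return set; },
        (set, e) => { set.Add(e); return set; })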

{
    if (connection.UseDateTime)
    {
        //TODO: Consolidate/fixup different rowtype issues.

I am admittedly unsure how we got to having separate categories here; we -should- be able to set all of this up such that either L2DB is doing the 'right thing' for a given db config, and/or we sanely do our queries against whatever mapping is there. (well, actually it's a bit of both... still...)

}
}
}
}).SelectAsync(1,

IDK if this should -really- be SelectAsync; alas, I don't have a clean way to do .Async() on a SubFlow ;_;

In general the desire here is to ensure that we can fire off results without blocking the next read. There is probably a cleaner/better way to do this; open to suggestions. <3
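For illustration, the intent is roughly this shape (CompleteWaiters is a hypothetical helper, not in the PR; the point is that completing the TaskCompletionSources runs off the read loop so the next batch can start):

    // Sketch only: push TCS completion onto the thread pool so the stream stage
    // is free to pull the next batched read. CompleteWaiters is hypothetical.
    .SelectAsync(1, group => Task.Run(() =>
    {
        CompleteWaiters(group);
        return NotUsed.Instance;
    }))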

Comment on lines 235 to 280

    foreach (var result in b)
    {
        if (a.Entries.TryGetValue(result.PersistenceId, out var toSet))
        {
            try
            {
                var res = _longSerializer.Deserialize(result);
                if (res.IsSuccess)
                {
                    foreach (var taskCompletionSource in toSet)
                    {
                        taskCompletionSource.TrySetResult(res.Success);
                    }
                }
                else
                {
                    foreach (var taskCompletionSource in toSet)
                    {
                        taskCompletionSource.TrySetException(res.Failure.Value);
                    }
                }
            }
            catch (Exception e)
            {
                foreach (var taskCompletionSource in toSet)
                {
                    taskCompletionSource.TrySetException(e);
                }
            }
        }
        else
        {
            tempSet.Add(result.PersistenceId);
        }

        foreach (var se in tempSet)
        {
            if (a.Entries.TryGetValue(se, out var setNo))
            {
                foreach (var taskCompletionSource in setNo)
                {
                    taskCompletionSource.TrySetResult(Option<SelectedSnapshot>.None);
                }
            }
        }
    }

I hate this and would love cleaner suggestions. <3
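One possible cleaner shape, as a sketch only (assuming the intent is: resolve each match as it deserializes, then resolve any requested persistence ID that got no row back to Option.None; this is not the PR's code):

    // Sketch: complete matches first, then fall back to Option.None for anything
    // that was requested but not returned by the query.
    var found = new HashSet<string>();
    foreach (var result in b)
    {
        found.Add(result.PersistenceId);
        if (!a.Entries.TryGetValue(result.PersistenceId, out var waiters))
            continue;

        try
        {
            var res = _longSerializer.Deserialize(result);
            foreach (var tcs in waiters)
            {
                if (res.IsSuccess)
                    tcs.TrySetResult(res.Success);
                else
                    tcs.TrySetException(res.Failure.Value);
            }
        }
        catch (Exception e)
        {
            foreach (var tcs in waiters)
                tcs.TrySetException(e);
        }
    }

    foreach (var kvp in a.Entries)
    {
        if (found.Contains(kvp.Key))
            continue;
        foreach (var tcs in kvp.Value)
            tcs.TrySetResult(Option<SelectedSnapshot>.None);
    }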

        string persistenceId,
        CancellationToken cancellationToken = default)
    {
        var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, _shutdownCts.Token);

This should probably pass the CTS (or its token) into the request, for the sake of allowing the token to do the needful.
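i.e. roughly this (a sketch; a LatestSnapRequestEntry constructor that accepts a token is hypothetical and not part of this PR):

    // Sketch: hand the linked token to the request so cancellation can actually
    // complete the pending TaskCompletionSource. The constructor overload is hypothetical.
    var entry = new LatestSnapRequestEntry(persistenceId, cts.Token);
    cts.Token.Register(() => entry.TCS.TrySetCanceled(cts.Token));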

Development

Successfully merging this pull request may close these issues.

Bug: can't support multiple snapshots taken at same seqNr and persistentId