Cometindex performance improvements #4851
Merged
Describe your changes
This improves the performance of cometindex significantly, especially when many events need to be indexed.
Two accidental problems:
This alone caused quadratic query performance, which is really bad.
First, we only want to join the blocks and transactions after the attributes have already been grouped together per event. Joining earlier repeats the join once per attribute, which adds a constant-factor overhead, since some events have a handful of attributes.
Second, we shouldn't be sorting or hash merging at all.
The query should be linear and streaming: it should scan the events table in order, then selectively pluck columns from the other tables using the indices that reference the event id.
This PR amends the query to make Postgres actually do this, mainly by informing it that only a single block or transaction will get joined with each event.
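A minimal sketch of that query shape, for illustration only. It uses Python's `sqlite3` instead of Postgres so that it runs standalone, and the table and column names (`events`, `attributes`, `blocks`, `tx_results`, `fetch_events`) are simplified assumptions rather than the actual cometindex schema or the query in this PR. Scalar subqueries are one way to state that at most one block or transaction belongs to each event; the real query may express this differently.

```python
import sqlite3

# Hypothetical, simplified schema; not the real cometindex tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE blocks (id INTEGER PRIMARY KEY, height INTEGER NOT NULL);
    CREATE TABLE tx_results (id INTEGER PRIMARY KEY, tx_hash TEXT NOT NULL);
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        block_id INTEGER NOT NULL,
        tx_id INTEGER,              -- NULL for events emitted outside a transaction
        type TEXT NOT NULL
    );
    CREATE TABLE attributes (event_id INTEGER NOT NULL, key TEXT NOT NULL, value TEXT NOT NULL);
    CREATE INDEX attributes_event_id ON attributes (event_id);

    INSERT INTO blocks VALUES (1, 100), (2, 101);
    INSERT INTO tx_results VALUES (1, 'abc');
    INSERT INTO events VALUES (1, 1, 1, 'swap'), (2, 2, NULL, 'block_root');
    INSERT INTO attributes VALUES
        (1, 'amount', '5'), (1, 'asset', 'upenumbra'), (2, 'height', '101');
    """
)

# Scan events in primary-key order. Each scalar subquery returns at most one
# value per event, so the attributes are grouped per event and the block and
# transaction are looked up by key instead of being joined before grouping.
QUERY = """
SELECT
    e.id,
    e.type,
    (SELECT b.height FROM blocks b WHERE b.id = e.block_id) AS height,
    (SELECT t.tx_hash FROM tx_results t WHERE t.id = e.tx_id) AS tx_hash,
    (SELECT group_concat(a.key || '=' || a.value, ';')
       FROM attributes a WHERE a.event_id = e.id) AS attrs
FROM events e
WHERE e.id > ?
ORDER BY e.id
"""


def fetch_events(connection, after_id):
    """Return (id, type, height, tx_hash, attrs) for events with id > after_id."""
    rows = []
    for event_id, kind, height, tx_hash, attrs in connection.execute(QUERY, (after_id,)):
        # Attribute order inside group_concat is unspecified, so parse into a dict.
        parsed = dict(pair.split("=", 1) for pair in attrs.split(";")) if attrs else {}
        rows.append((event_id, kind, height, tx_hash, parsed))
    return rows


for row in fetch_events(conn, 0):
    print(row)
```

Passing the last processed event id as `after_id` lets an indexer resume from where it stopped without rescanning earlier events.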
Some performance evidence
Previously, when starting up pindexer from scratch, it would take 200 seconds before being able to start processing events.
Now it takes milliseconds.
old query:
new query:
Checklist before requesting a review
If this code contains consensus-breaking changes, I have added the "consensus-breaking" label. Otherwise, I declare my belief that there are not consensus-breaking changes, for the following reason: indexing only.