Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add id column as PRIMARY KEY for evm.logs & evm.log_poller_blocks #1441

Open
wants to merge 13 commits into
base: ccip-develop
Choose a base branch
from

Conversation

reductionista
Copy link
Collaborator

Problem:

The query for pruning expired logs with a max limit set was taking longer than it should. This was part due to needing to join on an awkward combination combination of columns due to their being no single primary key.

Solution:

  • Add id column as PRIMARY KEY for evm.logs & evm.log_poller_blocks
  • Join on id column instead of previous primary keys
  • Replace all SELECT *'s with helper functions for selecting all columns
  • Refactor nestedBlockQuery into withConfs, and make a bit more use of it## Motivation

While adding the id column, we can't just remove the old primary key because the index on it was helping to accelerate some queries.
Instead of just resurrecting it as-is, I took the opportunity to clean up several of the indices on the logs & blocks table. Some indexed columns (eg created_at) were never actually being used,
while others were not ordered in the most optimal way for accelerating the queries we have. Also, at least one of them was redundant with the primary key just in a different order.

@@ -1565,7 +1565,7 @@ func TestSelectLatestBlockNumberEventSigsAddrsWithConfs(t *testing.T) {
events: []common.Hash{event1, event2},
addrs: []common.Address{address1, address2},
confs: 0,
fromBlock: 3,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior looked wrong to me, hopefully there aren't any upstream clients depending on it:
everywhere else, passing a fromBlock implies you're searching for blocks >= fromBlock. This code has been updated in this PR to use that now instead of matching only on blocks > fromBlock

@reductionista reductionista requested a review from a team as a code owner September 18, 2024 02:13
  BCI-3492 [LogPoller]: Allow withObservedExecAndRowsAffected to report non-zero rows affected (#14057)

    * Fix withObservedExecAndRowsAffected

    Also:
    - Change behavior of DeleteExpiredLogs to delete logs which don't match any filter
    - Add a test case to ensure the dataset size is published properly during pruning

    * pnpm changeset

    * changeset #fix -> #bugfix
Also:
   - Add UNIQUE INDEXes to replace previous primary keys (still necessary, both for
     optimizing queries and for enforcing uniqueness constraints)
   - Replace all SELECT *'s with helper functions for selecting all columns
   - Refactor nestedBlockQuery into withConfs, and make a bit more use of it
Some of the columns in these indexes (such as created_at) are no longer used.
Others were not optimized for the queries we need.
Previously it was using fromBlock > :block_number which is inconsistent
with the other fromBlocks in queries
On a node with more than one chain, each LogPoller would have deleted all logs from chains it's not
running on! Because of the LEFT JOIN, ON evm_chain_id = $1 does not filter out any rows where evm_chain_id != $1;
only WHERE evm_chain_id = $1 can do that
@cl-sonarqube-production
Copy link

Quality Gate failed Quality Gate failed

Failed conditions
74.7% Coverage on New Code (required ≥ 75%)
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarQube

Catch issues before they fail your Quality Gate with our IDE extension SonarLint SonarLint

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant