Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Confluence Page multi-columns index on lastVisitedAt #3418

Merged
merged 1 commit into from
Jan 25, 2024

Conversation

flvndvd
Copy link
Contributor

@flvndvd flvndvd commented Jan 24, 2024

Description

This PR resolves #3415.

This PR introduces a new multicolumns index for the confluence_pages table, intended to support the data reporting API (see #3405) in retrieving the oldest updated time.

The query is SELECT "lastUpdatedAt" FROM confluence_pages WHERE "connectorId" = X ORDER BY "lastVisitedAt" DESC LIMIT 1. Based on PostgreSQL's documentation, a composite index is effective for optimizing both the WHERE clause and the ORDER BY directive.

This PR adds a multicolumn index on connectorId and lastVisitedAt. While it's challenging to assess the index's efficiency on a very small table, by turning off sequential scans using SET enable_seqscan = OFF; on my local setup, I managed to execute an explain query that confirms the index is being correctly utilized.

dust_connectors=> SET enable_seqscan = OFF;
SET
dust_connectors=> EXPLAIN SELECT * FROM confluence_pages WHERE "connectorId" = 32 ORDER BY "lastVisitedAt" DESC LIMIT 1;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.13..2.15 rows=1 width=603)
   ->  Index Scan Backward using confluence_pages_connector_id_last_visited_at on confluence_pages  (cost=0.13..12.24 rows=6 width=603)
         Index Cond: ("connectorId" = 32)
(3 rows)

Risk

Deploy Plan

This PR requires a migration, but pretty safe since it only create an index. It's safe to deploy in production before running the migration.

@flvndvd flvndvd self-assigned this Jan 24, 2024
@flvndvd flvndvd requested a review from spolu January 24, 2024 21:21
@flvndvd flvndvd marked this pull request as ready for review January 24, 2024 21:21
Copy link

github-actions bot commented Jan 24, 2024

Warnings
⚠️

Files in front/lib/models/ or connectors/src/lib/models/ have been modified and the PR has the migration-ack label. Don't forget to run the migration from the -edge infrastructure.

Generated by 🚫 dangerJS against 3483d35

@flvndvd flvndvd added the migration-ack 📁 Label to acknowledge that a migration is required. label Jan 25, 2024
@flvndvd flvndvd merged commit cbfc3a7 into main Jan 25, 2024
6 checks passed
@flvndvd flvndvd deleted the flav/add-confluence-pages-composite-index branch January 25, 2024 08:24
@flvndvd
Copy link
Contributor Author

flvndvd commented Jan 25, 2024

Migration applied ✅.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
migration-ack 📁 Label to acknowledge that a migration is required.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Caloris 2] [Confluence] Add index on lastVisitedAt
2 participants