
feat: add support for pg_stat_user_indexes, pg_index_properties, pg_index_size_bytes #1071

Status: Open. Wants to merge 5 commits into base: master.

Conversation

michael-todorovic

This PR adds support for pg_stat_user_indexes in order to expose scan, read, and fetch statistics for indexes. These can be used to find unused or rarely used indexes.

This relates to issue #1065

michael-todorovic force-pushed the feat/pg_stat_user_indexes branch from 2606e0a to 69223b3 on October 11, 2024
michael-todorovic changed the title from "feat: add support for pg_stat_user_indexes" to "feat: add support for pg_stat_user_indexes, pg_index_properties, pg_index_size_bytes" on October 11, 2024
Signed-off-by: Michael Todorovic <[email protected]>
@michael-todorovic (Author)

You can identify unused indexes with this PromQL query:

count((pg_stat_user_indexes_idx_scan_total{environment="production"} < 15)
* on (datname, environment, indexrelname, instance, job, relname, schemaname)
group_left(is_primary, is_unique, is_ready, is_valid)
(pg_index_properties{is_primary="0"} * on (datname, environment, indexrelname, instance, job, relname, schemaname)
  pg_index_size_bytes > 104857600)) by (instance, schemaname, relname, indexrelname)

It searches for production indexes that have been scanned fewer than 15 times, that are not primary keys (PostgreSQL needs those indexes itself), and that are larger than 100MB (104857600 bytes).

The metrics themselves look like:

pg_stat_user_indexes_idx_scan_total{datname="my_db", environment="production", indexrelname="idx_name", instance="my_server", job="pg_exporter_preview", relname="my_table", schemaname="my_schema"} 0
pg_stat_user_indexes_idx_tup_fetch{datname="my_db", environment="production", indexrelname="idx_name", instance="my_server", job="pg_exporter_preview", relname="my_table", schemaname="my_schema"} 0
pg_stat_user_indexes_idx_tup_read{datname="my_db", environment="production", indexrelname="idx_name", instance="my_server", job="pg_exporter_preview", relname="my_table", schemaname="my_schema"} 0
pg_index_properties{datname="my_db", environment="production", indexrelname="idx_name", instance="my_server", is_primary="0", is_ready="1", is_unique="0", is_valid="1", job="pg_exporter_preview", relname="my_table", schemaname="my_schema"} 1
pg_index_size_bytes{datname="my_db", environment="production", indexrelname="idx_name", instance="my_server", job="pg_exporter_preview", relname="my_table", schemaname="my_schema"} 9654042624

@chris-downs

To give this a bump: we are now running this branch in production with no issues, and it's very useful. It would be great to get it merged into mainline.

@sysadmind (Contributor) left a comment:


In addition to the comments on the code, the Copyright dates in the files you have created need to be updated to the current year. I think the queries are good here; my feedback is mostly about being able to maintain this code long term.

"fmt"
"strings"

"log/slog"

I'm not sure why this import is in its own section. It should go with the stdlib imports above.

)

func pgIndexQuery(columns []string) string {
return fmt.Sprintf("SELECT %s FROM pg_catalog.pg_stat_user_indexes s JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid WHERE i.indislive='1';", strings.Join(columns, ","))

We don't typically use a function to build the query unless we need dynamic columns. In this case it should just be a static query const.

var idxIsUnique, idxIsPrimary, idxIsValid, idxIsReady sql.NullBool
var idxSize sql.NullFloat64

r := []any{

I don't think that this slice adds anything. We can scan directly into the vars above.

if err := rows.Scan(r...); err != nil {
return err
}
datnameLabel := "unknown"

Do these metrics make sense when these values are not known? I feel that if too many are NULL/unknown, the metric is useless and should just be skipped.

propertiesLabels...,
)

sizeLabels := []string{datnameLabel, schemanameLabel, relnameLabel, indexrelnameLabel}

Again, what does the slice improve vs just using the values below?

indexSizeMetric = idxSize.Float64
}

propertiesLabels := []string{datnameLabel, schemanameLabel, relnameLabel, indexrelnameLabel, indexIsUniqueLabel, indexIsPrimaryLabel, indexIsValidLabel, indexIsReadyLabel}

Again, what does the slice improve vs just using the values below?
