Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indexer could use smaller hash indexes instead of btree indexes #890

Open
urtho opened this issue Feb 21, 2022 · 1 comment
Open

Indexer could use smaller hash indexes instead of btree indexes #890

urtho opened this issue Feb 21, 2022 · 1 comment
Labels
new-feature-request Feature request that needs triage performance Team Lamprey

Comments

@urtho
Copy link
Contributor

urtho commented Feb 21, 2022

Problem

All indexer indices are btree but not all have to be.

Solution

Replace btree indexes with hash indexes in places where only exact searches are used and the resulting hash index is significantly smaller.

One such example could be :

ledgerdb=# create index concurrently txn_by_tixid_hash on txn using hash (txid);
CREATE INDEX

ledgerdb=# \di+
                                         List of relations
 Schema |             Name              | Type  | Owner |       Table       |  Size   | Description
--------+-------------------------------+-------+-------+-------------------+---------+-------------
 public | txn_by_tixid                  | index | algo  | txn               | 43 GB   |
 public | txn_by_tixid_hash             | index | algo  | txn               | 16 GB   |

Urgency

Performance optimization.

@urtho urtho added the new-feature-request Feature request that needs triage label Feb 21, 2022
@oroechimaru
Copy link

Hash or nonclustered may be good for mixed data sets, if clustering/ordering lookups on common keys are not needed:
https://postgrespro.com/blog/pgsql/4161321

Do you have an example proc, table and index to look into?

I am not up to speed on postgres, but in general if page or row or columnstore compression can be leveraged, that would also reduce disk utilization depending on the index type.

If summarized results are needed clustered columnstore type indexing may offer more compression for summary performance but poor insert/update maintenance, although for large data warehouses they can also help compress wide complex data sets.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new-feature-request Feature request that needs triage performance Team Lamprey
Projects
None yet
Development

No branches or pull requests

3 participants