-
Notifications
You must be signed in to change notification settings - Fork 885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify compressed pathkey lookup #5960
Conversation
Codecov Report
@@ Coverage Diff @@
## main #5960 +/- ##
==========================================
- Coverage 87.03% 87.01% -0.03%
==========================================
Files 243 243
Lines 56023 55959 -64
Branches 12402 12382 -20
==========================================
- Hits 48760 48692 -68
+ Misses 4924 4905 -19
- Partials 2339 2362 +23
... and 34 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
/* | ||
* Note that we have to separately generate the parameterized path info | ||
* for decompressed chunk path. The compressed parameterized path only | ||
* checks the clauses on segmentby columns, not on the compressed | ||
* columns. | ||
*/ | ||
path->custom_path.path.param_info = | ||
get_baserel_parampathinfo(root, | ||
info->chunk_rel, | ||
compressed_path->param_info->ppi_req_outer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bug was hidden because in add_segmentby_to_equivalence_class
, besides the proper EquivalenceMembers for segmentby columns, we also generated nonsensical EMs for compressed columns. They translated to impossible join clauses and prevented the formation of parameterized paths where the decompressed and compressed parameterizations would be different.
@@ -1352,6 +1303,17 @@ add_segmentby_to_equivalence_class(EquivalenceClass *cur_ec, CompressionInfo *in | |||
if (context->current_col_info == NULL) | |||
continue; | |||
|
|||
if (!COMPRESSIONCOL_IS_SEGMENT_BY(context->current_col_info)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug, we didn't check if the column is a segmentby, and generated nonsensical EquivalenceMembers for compressed columns.
Filter: (("time" > 'Wed Jan 19 16:00:00 2000 PST'::timestamp with time zone) AND ("time" < 'Thu Jan 20 17:00:00 2000 PST'::timestamp with time zone)) | ||
Rows Removed by Filter: 1 | ||
-> Custom Scan (DecompressChunk) on _hyper_1_3_chunk met (actual rows=1 loops=2) | ||
Filter: (("time" > 'Wed Jan 19 16:00:00 2000 PST'::timestamp with time zone) AND ("time" < 'Thu Jan 20 17:00:00 2000 PST'::timestamp with time zone) AND ("*VALUES*".column1 = device_id) AND ("*VALUES*".column2 = v0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The redundant filter on device_id (duplicate of index filter) is another bug, will be fixed here: #5962
@mkindahl, @sb230132: please review this pull request.
|
f76295c
to
1435d14
Compare
|
||
if (compressed_em == NULL) | ||
{ | ||
/* Not a segmentby column, rest of pathkeys should be handled by compress_orderby */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to test this condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I double-checked and seems like it is redundant with the loop condition, so I changed it to an assertion.
* then other ec classes */ | ||
prepend_ec_for_seqnum(root, info, sort_info, var, sortop, nulls_first); | ||
/* | ||
* Create the EquivalenceClass for the sequence number, so that we can |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if sequence number is the right term here. Wouldn't compressed chunk be more appropriate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update it to say "the sequence number column". The rest of the pathkeys is fulfilled by sorting on this column, so here we create a EC for it.
Thank you for this PR. I assume this change fixes the planner regression we discussed. Do we have a tsbench run for this PR? |
I didn't run tsbench, but locally the result was good, about 2x improvement in planning speed for the queries like the one you added to tsbench. |
When looking up a pathkey for compressed scan, we used to do a lot of work, including a quadratic lookup through all the equivalence members, to always arrive at the same canonical pathkey we started from. Just remove this useless code for a significant planning speedup. This uncovers two bugs in parameterization of decompressed paths and generation of equivalence members for segmentby columns, fix them as well.
916348c
to
a02935e
Compare
When looking up a pathkey for compressed scan, we used to do a lot of work, including a quadratic lookup through all the equivalence members, to always arrive at the same canonical pathkey we started from. Just remove this useless code for a significant planning speedup.
This uncovers two bugs, fix them as well (see comments below).
Disable-check: force-changelog-file