RangeHeapJoin should consistently sort NULL values before non-NULL values while managing its heap. #2278

Merged (1 commit) on Jan 20, 2024

nicktobey
Contributor

Fixes dolthub/dolt#7260

This was ultimately caused by #1903. I didn't think it was possible for that issue to cause user-facing problems, but I was wrong. Because of that issue, RangeHeapJoin considered all NULL values in its child iterators to come after all non-NULL values. However, if the child node was an index, the child iterator would order its rows with the NULL values first. This mismatch caused the RangeHeapIterator to mismanage the heap and skip rows that should have been in the results.

I updated the range heap code to manually check for NULL values when manipulating the heap. I also updated the plan tests to include NULL values in the test tables, which should now catch this issue.
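As a rough sketch of what "manually check for NULL values" can look like, the comparison below handles NULL explicitly before falling back to an ordinary value comparison, so NULL always sorts first. The helper name `compareNullsFirst` and the use of `*int` with `nil` standing in for NULL are assumptions made for illustration; this is not the code from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// compareNullsFirst is a hypothetical helper. nil stands for SQL NULL.
// It returns -1, 0, or 1, ordering NULL before every non-NULL value,
// which matches the order in which an index yields its rows.
func compareNullsFirst(a, b *int) int {
	switch {
	case a == nil && b == nil:
		return 0
	case a == nil:
		return -1
	case b == nil:
		return 1
	case *a < *b:
		return -1
	case *a > *b:
		return 1
	default:
		return 0
	}
}

func main() {
	one, five := 1, 5
	vals := []*int{&five, nil, &one}
	// Sort using the NULL-aware comparison.
	sort.SliceStable(vals, func(i, j int) bool {
		return compareNullsFirst(vals[i], vals[j]) < 0
	})
	for _, v := range vals {
		if v == nil {
			fmt.Println("NULL")
		} else {
			fmt.Println(*v)
		}
	}
	// Prints NULL, then 1, then 5.
}
```

Using the same NULL-first rule for both the heap operations and the child iterators keeps the two orderings consistent, which is the property the title of this PR asks for.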

Contributor

@max-hoffman max-hoffman left a comment

LGTM. Our compare functions reversing NULL order seems wrong; I think I've had to make this same change in other places. If anything, it seems like auto-increment should be special-cased instead at some point.

@nicktobey
Contributor Author

Our compare function is absolutely wrong, but at the time the work to fix it (and update anything that was accidentally relying on the old behavior) didn't feel worth it. But if we're going to run into more issues like this, we should absolutely fix it.

@nicktobey nicktobey merged commit b80ed6f into main Jan 20, 2024
8 checks passed
@nicktobey nicktobey deleted the nicktobey/rangeheap branch January 20, 2024 00:00
Development

Successfully merging this pull request may close these issues.

Unexpected Results when Using BETWEEN AND after CREATE INDEX
2 participants