-
-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update RangeTree MaxUpperBound when removing nodes #2152
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up on these correctness bugs is great work, index ranges really needed the attention.
I don't fully understand how the RB tree works, it's kind of hard for me to comment on whether we're missing anything additional with the bound checks. @Hydrocharged might be better positioned to catch additional bugs. Another thing we could do is add more of the failing correctness tests as tests here if for some reason this bug isn't directly related to all of the regressions.
The thing we've already talked about and might be worth an RFC for improving the reliability of this whole module is switching from the range tree to a single interval merge algorithm. We only build one range collection for each filter expression after my refactors, and we can collect all of the filters before invoking merging. I.e. the mutability of ranges that the range tree was designed for is kind of moot. I think a linear collect, nlogn sort, linear merge would do the same thing as a range tree currently in a way that is easier to test. Adding tests for specific failure patterns is still useful regardless of whether it's before or after switching the algo.
sql/range_test.go
Outdated
"", | ||
rng: r(req(5)), | ||
exp: "RangeColumnExprTree\n" + | ||
"│ ┌── [11, 11] max: >11 color: 0\n" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the formatting is a bit confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Need more test variety for sure, but the tree logic looks good from some quick napkin diagrams.
}, | ||
{ | ||
name: "insert rebalance root from right", | ||
setupRngs: []sql.Range{r(req(7)), r(req(3)), r(req(11)), r(req(8)), r(req(9)), r(req(10)), r(req(12))}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like some tests that use more than equals. We can have tests of all less than, etc. and also combine them. Hypothetically, it could all work great for equals since it's the most basic case, and break down on the other types.
}, | ||
{ | ||
name: "remove root", | ||
setupRngs: []sql.Range{r(req(7)), r(req(3)), r(req(11)), r(req(9)), r(req(13))}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment goes for this test too, need more than equals.
sql/range_cut.go
Outdated
@@ -157,6 +159,11 @@ func (a Above) String() string { | |||
return fmt.Sprintf("Above[%v]", a.Key) | |||
} | |||
|
|||
// DebugString implements RangeCut. | |||
func (a Above) DebugString() string { | |||
return fmt.Sprintf(">%v", a.Key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get why these were added, but considering ranges have the concept of greater than and less than, these will cause more confusion than clarity. You'd have to look at the DebugString
functions to know that this isn't greater than key, but is instead above key. The original strings are much more clear with that, so I think these should be removed, or internally just call a.String()
.
Verification code from in the last RangeTree related PR caught some bad ranges in the sqllogictests.
The cause was not properly updating the MaxUpperBound in nodes for the rangetree when performing a remove operation, which led to missing connections when pruning ranges.
This also prints MaxUpperBound in the String/DebugString methods for easier debugging.