Incorrect results: Bloom filters on UInt8, Int8, UInt16 and Int16 columns always return false negatives
#9779
Comments
I just reproduced the bug in the [...]
It turns out that #9770 demonstrates that the unsigned variants are incorrect as well, so I updated the title of this ticket.
Not exactly: as long as #9770 is not merged, bloom filters are not used on [...]
Now that I say it, I realize that I should probably amend that PR (and the existing code) to disable bloom filters entirely on these types, so DataFusion is slow instead of incorrect.
This issue came up in the context of the 37.1.0 release (#9904), and I wanted to cross-post here.
Specifically, versions 34.0.0 through 37.0.0 have a bug where [...]
The int8/int16 bloom filter support was added in #7821 and shipped as part of https://github.com/apache/arrow-datafusion/blob/main/dev/changelog/33.0.0.md
We have disabled using bloom filters for int8/int16 columns as of DataFusion 38.0.0 (until we fix the underlying issue).
Describe the bug
Bloom filters on these columns always filter out every value.
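For readers unfamiliar with how a Bloom filter can produce false negatives at all: a correctly used filter never does, so a result like this usually means the probe is not hashed the same way as the inserted value. The sketch below illustrates one such mismatch. It is only an assumption that this is what happens here; the thread does not state the root cause. It relies on the fact that Parquet stores 8-bit and 16-bit integers as the physical type INT32, so a writer would hash 4 bytes per value, while a reader that hashes the narrow 1-byte value would look up different bits. `ToyBloom` and `count_misses` are hypothetical names, and the filter is a toy, not Parquet's split-block Bloom filter or DataFusion's code.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Toy Bloom filter over raw byte strings (illustration only).
struct ToyBloom {
    bits: Vec<bool>,
}

impl ToyBloom {
    const NUM_HASHES: u64 = 3;

    fn new(num_bits: usize) -> Self {
        ToyBloom { bits: vec![false; num_bits] }
    }

    // Derive one bit position by hashing a seed followed by the value bytes.
    fn position(&self, seed: u64, bytes: &[u8]) -> usize {
        let mut hasher = DefaultHasher::new();
        hasher.write_u64(seed);
        hasher.write(bytes);
        (hasher.finish() % self.bits.len() as u64) as usize
    }

    fn insert(&mut self, bytes: &[u8]) {
        for seed in 0..Self::NUM_HASHES {
            let pos = self.position(seed, bytes);
            self.bits[pos] = true;
        }
    }

    fn might_contain(&self, bytes: &[u8]) -> bool {
        (0..Self::NUM_HASHES).all(|seed| self.bits[self.position(seed, bytes)])
    }
}

/// Insert every i8 value widened to 4 bytes (the assumed writer-side
/// encoding), then count how many present values the filter fails to find.
fn count_misses(probe_as_i32: bool) -> usize {
    let mut filter = ToyBloom::new(1 << 16);
    for v in i8::MIN..=i8::MAX {
        filter.insert(&(v as i32).to_le_bytes());
    }
    (i8::MIN..=i8::MAX)
        .filter(|&v| {
            if probe_as_i32 {
                // Same encoding as on insert: a Bloom filter cannot miss.
                !filter.might_contain(&(v as i32).to_le_bytes())
            } else {
                // Narrow 1-byte encoding: different bytes, different bits.
                !filter.might_contain(&v.to_le_bytes())
            }
        })
        .count()
}

fn main() {
    println!("misses with 4-byte probes: {}", count_misses(true));
    println!("misses with 1-byte probes: {}", count_misses(false));
}
```

With matching encodings the miss count is zero by construction. With the narrow probe, present values are reported as absent, which is the kind of false negative that would cause a row group to be pruned incorrectly.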
To Reproduce
#9778 demonstrates this, through correct_bloom_filters: false as a macro "parameter".
Expected behavior
No response
Additional context
No response