-
Notifications
You must be signed in to change notification settings - Fork 794
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorrect values for is_null
and is_not_null
on UnionArray
#6017
Comments
@alamb I've realised we can significantly improve on this code now used in Datafusion: If children/columns of the union are marked as I think I can have a crack at this. |
Thanks @samuelcolvin -- I think this is another reason that putting kernsl into arrow-rs is often a good idea. In the arrow-rs code we can focus on implementing just that kernel and optimizing it really well, rather than the supporting machinery that is required in DataFusion |
@samuelcolvin are you porting your implementation to |
@gstvg please go for it. |
I'm not sure if implementing (Possibly that's just a separate but related bug?) |
Good point. I've always assumed logical_nulls should call logical_nulls on it's children too, but I don't known if it's explicit defined anywhere FWIW, RunArray calls logical_nulls for it's values |
I believe that is our interpretation of what the arrow spec says should be done, though as I recall it is somewhat vague on the point (e.g. should the validity bitmap in child dictionary arrays count as a |
Describe the bug
The
is_null
andis_not_null
kernels are incorrect forUnionArrays
I believe the core problem is that
UnionArray
does not implementArray::logical_nulls
arrow-rs/arrow-array/src/array/union_array.rs
Lines 445 to 525 in b9562b9
And instead falls back to the default implementation (which calls
Array::nulls
):arrow-rs/arrow-array/src/array/mod.rs
Lines 212 to 214 in b9562b9
To Reproduce
Run these tests in
is_null.rs
:Full diff
Expected behavior
The tests should pass
Instead they currently error as
is_null
always returns false andis_not_null
always returns true:Additional context
This was found by @samuelcolvin on apache/datafusion#11162
The reproducers are from his PR on apache/datafusion#11321
The text was updated successfully, but these errors were encountered: