Describe the bug
The Arrow C data interface spec states that the null_count field of an ArrowArray structure instance "MAY be -1 if not yet computed".
Currently the arrow-array crate's FFI code always treats this field as unsigned and initialized, though this assumption can be
false if the Arrow C data interface specification is strictly followed. Because of that, one can get all sorts of nasty bugs
when working with arrays coming from C.
To Reproduce
1. Construct an instance of an FFI_ArrowArray and set its null_count field to -1.
2. Convert the above instance into an ArrayData instance with the help of the from_ffi
   (or from_ffi_and_data_type) function.
3. Call the null_count() method on the resulting ArrayData instance (calls into NullBuffer's null_count()).
The result is usize::MAX due to the conversion from signed to unsigned.
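The signed-to-unsigned conversion can be illustrated in isolation. This is a minimal sketch, not the crate's actual code: the function name null_count_as_usize is hypothetical, and it only assumes that the C-side null_count is a 64-bit signed integer that gets cast to usize without a range check.

```rust
/// Hypothetical stand-in for an unchecked conversion of the C-side
/// `null_count` (a signed 64-bit integer) into a Rust `usize`.
/// An `as` cast from a negative value wraps instead of failing.
fn null_count_as_usize(raw_null_count: i64) -> usize {
    raw_null_count as usize
}

/// A checked alternative: `None` signals "not yet computed" (or any other
/// negative value), so the caller is forced to handle that case explicitly.
fn null_count_checked(raw_null_count: i64) -> Option<usize> {
    usize::try_from(raw_null_count).ok()
}
```

With these definitions, null_count_as_usize(-1) yields usize::MAX, while null_count_checked(-1) yields None and leaves the decision to the caller.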
Expected behavior
If the null_count field of an FFI_ArrowArray instance is -1 (uninitialized):
- Either initialize the null_count field of the NullBuffer instance during FFI conversion by inspecting the null buffer itself,
- or stick to the Arrow C data interface spec by making the null_count field of the NullBuffer an Option and adding
support for lazy initialization during the call to its null_count() method.
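The first option could look roughly like the sketch below. It does not use the arrow-array types; count_nulls and resolve_null_count are hypothetical helpers, and the only assumption taken from the Arrow format is that the validity bitmap stores one bit per element, least-significant bit first, with a set bit meaning "valid".

```rust
/// Counts the unset (null) bits among `len` elements of an Arrow-style
/// validity bitmap, starting at element index `offset`.
/// Bits are numbered least-significant-bit first; a set bit means "valid".
fn count_nulls(validity: &[u8], offset: usize, len: usize) -> usize {
    (offset..offset + len)
        .filter(|&i| (validity[i / 8] & (1u8 << (i % 8))) == 0)
        .count()
}

/// Resolves the null count reported over the C data interface.
/// A non-negative value is trusted as-is; a negative value (the spec's -1,
/// "not yet computed") is replaced by a count derived from the bitmap.
/// A missing bitmap is taken to mean that there are no nulls.
fn resolve_null_count(
    raw_null_count: i64,
    validity: Option<&[u8]>,
    offset: usize,
    len: usize,
) -> usize {
    match (raw_null_count, validity) {
        (n, _) if n >= 0 => n as usize,
        (_, Some(bits)) => count_nulls(bits, offset, len),
        (_, None) => 0,
    }
}
```

For a 4-element array with the bitmap byte 0b0000_0101 (elements 0 and 2 valid), resolve_null_count(-1, Some(&[0b0000_0101]), 0, 4) returns 2. Computing the count eagerly like this keeps null_count() a plain field read for downstream code, at the cost of one pass over the bitmap during conversion.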
Computing the null count if not provided makes sense to me. Various kernels rely on this being precomputed to efficiently perform kernel selection, so changing this would be highly disruptive.