Is your feature request related to a problem or challenge? Please describe what you are trying to do.
When reading a Parquet file with FIXED_LEN_BYTE_ARRAY columns that contain nulls, one necessary operation is moving the fixed-length data into the correct location within the output buffer to account for null slots. This is handled by the pad_nulls function in the ValuesBuffer trait. The inner loop of this function

    for i in 0..byte_length {
        self.buffer[level_pos_bytes + i] = self.buffer[value_pos_bytes + i]
    }

works well when the fixed width is low (<= 4), but for larger widths this loop is quite inefficient.
Describe the solution you'd like
Rewriting the inner loop for longer fixed-size arrays can speed this operation up considerably. In particular, by copying slices of the buffer to another location in the buffer, the compiler can vectorize the move, e.g.

    let split = self.buffer.split_at_mut(level_pos_bytes);
    let dst = &mut split.1[..byte_length];
    let src = &split.0[value_pos_bytes..value_pos_bytes + byte_length];
    for i in 0..byte_length {
        dst[i] = src[i]
    }

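To make the proposal easier to try in isolation, here is a minimal, self-contained sketch of the split-based move inside a simplified null-padding pass. The names `move_fixed_len` and `pad_nulls_sketch` and the `&[bool]` validity mask are assumptions made for illustration; they are not the actual `ValuesBuffer` API. The sketch relies on the source slot ending at or before the destination slot, which is what makes `split_at_mut` applicable.

```rust
/// Hypothetical helper (not part of the parquet crate): moves `byte_length`
/// bytes from `value_pos_bytes` to `level_pos_bytes` within `buffer`.
/// Assumes the source ends at or before the destination starts.
fn move_fixed_len(
    buffer: &mut [u8],
    value_pos_bytes: usize,
    level_pos_bytes: usize,
    byte_length: usize,
) {
    debug_assert!(value_pos_bytes + byte_length <= level_pos_bytes);
    // Splitting yields two disjoint slices, so source and destination
    // can be borrowed at the same time.
    let (head, tail) = buffer.split_at_mut(level_pos_bytes);
    let dst = &mut tail[..byte_length];
    let src = &head[value_pos_bytes..value_pos_bytes + byte_length];
    for i in 0..byte_length {
        dst[i] = src[i];
    }
}

/// Simplified stand-in for a null-padding pass. `buffer` holds the non-null
/// values packed at the front; `valid[i]` says whether output slot `i` is
/// non-null. Assumes the number of `true` entries equals the number of
/// packed values. Null slots keep whatever bytes they previously held.
fn pad_nulls_sketch(buffer: &mut Vec<u8>, byte_length: usize, valid: &[bool]) {
    assert!(byte_length > 0);
    let mut value_pos = buffer.len() / byte_length;
    buffer.resize(valid.len() * byte_length, 0);
    // Walk the slots from the back so a value is never overwritten
    // before it has been moved.
    for level_pos in (0..valid.len()).rev() {
        if !valid[level_pos] {
            continue;
        }
        value_pos -= 1;
        if level_pos == value_pos {
            // Every remaining value is already in its final slot.
            break;
        }
        move_fixed_len(
            buffer,
            value_pos * byte_length,
            level_pos * byte_length,
            byte_length,
        );
    }
}
```

Because `level_pos > value_pos` whenever a move happens, the destination starts at least `byte_length` bytes past the source, so the two ranges never overlap.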
Describe alternatives you've considered
I tried Vec::copy_within but it was slower than the vectorized copy.
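For comparison, the `copy_within` alternative could look roughly like the sketch below. `copy_within` is a slice method that is reachable on a `Vec` through deref; the wrapper name `move_fixed_len_copy_within` is hypothetical and only exists to make the call easy to benchmark against the loop-based version.

```rust
/// Hypothetical wrapper around the slice method `copy_within`: copies
/// `byte_length` bytes starting at `value_pos_bytes` to `level_pos_bytes`.
/// Unlike the split-based version, overlapping ranges are allowed here.
fn move_fixed_len_copy_within(
    buffer: &mut [u8],
    value_pos_bytes: usize,
    level_pos_bytes: usize,
    byte_length: usize,
) {
    buffer.copy_within(
        value_pos_bytes..value_pos_bytes + byte_length,
        level_pos_bytes,
    );
}
```

This sketch makes no performance claim of its own; which variant is faster depends on the width and the compiler, so it is worth measuring both on the target workload.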
Additional context