-
Notifications
You must be signed in to change notification settings - Fork 794
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ListView
& LargeListView
basic construction and validation
#5664
Conversation
ListView
& LargeListView
basic construction and validation
arrow-data/src/data.rs
Outdated
if offset > values_length { | ||
return Err(ArrowError::InvalidArgumentError(format!( | ||
"Size {} at index {} is offset {} is out of bounds for {}", | ||
size, i, offset, self.data_type | ||
))); | ||
} | ||
if size > values_length - offset { | ||
return Err(ArrowError::InvalidArgumentError(format!( | ||
"Size {} at index {} is larger than the remaining values for {}", | ||
size, i, self.data_type | ||
))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic does not appear to be correct
From https://arrow.apache.org/docs/format/Columnar.html#listview-layout the variants this should be validating are
0 <= offsets[i] <= length of the child array
0 <= offsets[i] + size[i] <= length of the child array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the offset is already greater than 0, there is no boundary check for 0 here. I have adjusted the order for easier understanding.
arrow-data/src/data.rs
Outdated
size, i, offset, self.data_type | ||
))); | ||
} | ||
if size + offset > values_length { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be checked addition to handle if size and offset were both i64::MAX
, perhaps we could get a test of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to check if both the offset and size are i64::max here? Are we concerned about exceeding the maximum limit of usize? I don't think either of them should exceed the maximum value of usize, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The addition could overflow
arrow-data/src/data.rs
Outdated
if offset > values_length { | ||
return Err(ArrowError::InvalidArgumentError(format!( | ||
"Size {} at index {} is offset {} is out of bounds for {}", | ||
size, i, offset, self.data_type | ||
))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can probably drop this check in favor of the one below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting close, thank you for sticking with it
Thank you |
Which issue does this PR close?
A part of #5501
Rationale for this change
What changes are included in this PR?
Add ListView & LargeListView basic ArrayData construction and validation
Are there any user-facing changes?