feat: Add buffer size, replication, blocksize as file open options (#135) #136
Conversation
As per issue #135.
Seems the check failed on the latest Rust. Would you like to start a new PR for that?
@Xuanwo yeah, sure.
@Xuanwo here you go.
src/open_options.rs
Outdated
/// Pass `0` if you want to use the default configured values.
///
/// `0` by default.
pub fn with_buffer_size(&mut self, buffer_size: c_int) -> &mut Self {
Hi, how about using Rust's own types and converting into `c_int` instead?
There is a split about what primitive Rust type `c_int` really means on different platforms: `avr` and `msp430` say it is `i16`, but the rest of the platforms mean `i32`. I'm not confident whether we should take the strictest route and use `i16`, or follow `i32`, which is the popular one.
My current idea is to use `usize` here and return an error during opening.
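A minimal sketch of that idea, assuming `c_int` is `i32` as on mainstream platforms (the function name is illustrative, not the crate's actual code):

```rust
use std::convert::TryFrom;
use std::io::{Error, ErrorKind};

// Illustrative only: the public setter would accept a plain `usize`, and the
// fallible narrowing to `c_int` (modeled as `i32` here) would happen when
// `open` is called, surfacing as a std::io::Error.
fn buffer_size_to_c_int(buffer_size: usize) -> Result<i32, Error> {
    i32::try_from(buffer_size)
        .map_err(|_| Error::new(ErrorKind::InvalidInput, "buffer size overflows c_int"))
}
```

This keeps `ffi` types out of the public signature while still rejecting out-of-range values before they reach libhdfs.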
Please keep in mind that Hadoop itself may not be able to run on `avr` or `msp430`.
Oh yeah, you're right. I'll start working on it.
src/open_options.rs
Outdated
/// Pass `0` if you want to use the default configured values.
///
/// `0` by default.
pub fn with_replication(&mut self, replication: c_short) -> &mut Self {
The same.
`c_short` currently points uniquely to `i16` on all platforms. But in case a new platform comes along in the future and it refers to another type, I think we should just stick with `c_short` for maximum compatibility.
As mentioned in #136 (comment), using `ffi` types in the public API of a high-level crate like `hdrs` doesn't sit well with me.
src/open_options.rs
Outdated
/// Pass `0` if you want to use the default configured values.
///
/// `0` by default.
pub fn with_blocksize(&mut self, blocksize: i32) -> &mut Self {
What will happen if `blocksize < 0`? What does `0` mean: the blocksize in the HDFS config?
Strictly speaking, the type should be `u31`, according to the valid values in the "Block Size" section. `0` is defined to be the default in the C API. Using the web API as a cross-reference, `0` means "specified in the configuration". Since there isn't a `u31` available, `u16` seems too restrictive while `u32` would straight up welcome invalid values, so I suggest we just do it the C API way and have users informed by `Error::last_os_error()` instead.
... if users pass in any invalid values.
We can verify those things while calling `open`.

I have revised the integer types and changed all three to …
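A minimal sketch of what that open-time verification could look like; the struct, field types, and `validate` name are assumptions for illustration, not the crate's actual implementation:

```rust
use std::convert::TryFrom;
use std::io::{Error, ErrorKind};

// Illustrative struct mirroring the three new options; `0` stands for
// "use the value from the HDFS configuration" and is always accepted.
struct OpenOptions {
    buffer_size: usize,
    replication: usize,
    blocksize: usize,
}

impl OpenOptions {
    // Hypothetical check run when `open` is called: each value must narrow
    // into the C type libhdfs expects (`c_int`/`c_short`, modeled as
    // `i32`/`i16` here).
    fn validate(&self) -> Result<(), Error> {
        i32::try_from(self.buffer_size)
            .map_err(|_| Error::new(ErrorKind::InvalidInput, "buffer_size overflows c_int"))?;
        i16::try_from(self.replication)
            .map_err(|_| Error::new(ErrorKind::InvalidInput, "replication overflows c_short"))?;
        i32::try_from(self.blocksize)
            .map_err(|_| Error::new(ErrorKind::InvalidInput, "blocksize outside u31 range"))?;
        Ok(())
    }
}
```

Validating once at `open` keeps every `with_*` setter infallible, which is the usual builder-pattern trade-off.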
Perfect! Thank you @Banyc for working on this.
Previously, `OpenOptions` did not provide a way for users to configure HDFS-specific options for its file-open API; instead, they were left with default values. According to a comment in the code, this was intended to be implemented in the future. After the change, users can specify those parameters in a `with`-prefixed builder-pattern fashion.
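A self-contained sketch of that builder shape. The `with_*` names and `&mut Self` signatures follow the diff excerpts above; everything else, including the `usize` parameter types, is illustrative:

```rust
// Illustrative stand-in for the crate's OpenOptions, showing only the
// three new `with_*` setters added by this PR.
#[derive(Default)]
struct OpenOptions {
    buffer_size: usize,
    replication: usize,
    blocksize: usize,
}

impl OpenOptions {
    /// Pass `0` to use the default configured value (`0` by default).
    fn with_buffer_size(&mut self, buffer_size: usize) -> &mut Self {
        self.buffer_size = buffer_size;
        self
    }

    /// Pass `0` to use the default configured value (`0` by default).
    fn with_replication(&mut self, replication: usize) -> &mut Self {
        self.replication = replication;
        self
    }

    /// Pass `0` to use the default configured value (`0` by default).
    fn with_blocksize(&mut self, blocksize: usize) -> &mut Self {
        self.blocksize = blocksize;
        self
    }
}
```

Calls then chain as `opts.with_buffer_size(4096).with_replication(3).with_blocksize(0)`, with `0` leaving each setting to the HDFS configuration.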