Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FasterPath.blank? panics if string has invalid encoding #67

Closed
dubek opened this issue Jun 20, 2016 · 10 comments
Closed

FasterPath.blank? panics if string has invalid encoding #67

dubek opened this issue Jun 20, 2016 · 10 comments
Milestone

Comments

@dubek
Copy link

dubek commented Jun 20, 2016

To reproduce the crash:

ruby -rfaster_path -e 'FasterPath.blank?("\xff")'

The issue is the unwrap() call in src/is_blank.rs.

The expected behaviour is that it'll raise an ArgumentError exception ("invalid byte sequence in UTF-8"), like Ruby's String#match in "\xff".match(/ /)

@danielpclark
Copy link
Owner

danielpclark commented Jun 20, 2016

Thanks for this. I believe we need to implement a way to raise Ruby exceptions directly from Rust. I am concerned about the performance hit wrapping the response in an Ok, Err thing and the implementation detail of Ruby expecting something specific back with FFI.

@dubek
Copy link
Author

dubek commented Jun 20, 2016

I have no experience with this, but you can have the rust code return:

  • 0 = input is not blank (--> Ruby false)
  • 1 = input is blank (--> Ruby true)
  • 2 = input is invalid (--> Ruby exception)

And have a small if in Ruby code to check for the result and raise exception. But this might have a performance penalty.

@danielpclark
Copy link
Owner

danielpclark commented Jun 20, 2016

Rust requires a specific type returned for any method. If you need more then one type you build a higher type which will encapsulate the kinds of things you want to return. But by doing this very thing you lose efficiency in packaging an unpacking the types.

Using if checks in Ruby will never outperform the C code. Thankfully though you're talking about the blank? method which has no counterpart in Ruby to compare against so if we make it slower in the process we don't have to be better than something else.

There are other ways is_blank can be implemented which may handle that input. I'll have to look into it.

@danielpclark
Copy link
Owner

I've noticed that Pathname.new accepts that as a parameter just fine. So I believe the issue is only to not panic in Rust.

@danielpclark
Copy link
Owner

@dubek Looking at Ruby's FFI spec custom_type_spec.rb it appears we can implement the proper returns like you've outlined.

@glebm
Copy link
Contributor

glebm commented Jun 21, 2016

Can we not call rb_raise directly from rust code?

@danielpclark
Copy link
Owner

danielpclark commented Jun 21, 2016

Rust allows only 1 type to be returned and FFI even more so expects one type to be returned. To model multiple types we have to create our own type, which will wrap both types, which we can do in both Rust and FFI with Structs.

But then thinking about how to handle the struct in Ruby we might be adding something like raise struct.error if struct.error otherwise return struct.value . That Ruby conditional will likely cost us lots in performance

I'm not sure how deeply coupled the FFI lib is within the code that is written in itself. For example we can do callbacks which lets us evaluate a block in another call to "Rust" but still has the original method returning a type... which should be of one type of return.

I think the issue really comes down to how much we can cheat the one return type design and how much FFI will let us conditionally return that value without higher level Ruby code evaluating that condition.

@danielpclark
Copy link
Owner

Found this in Ruby's own test suite

filename = "\xff".force_encoding("UTF-8") # invalid byte sequence as UTF-8

https://github.com/ruby/ruby/blob/b67ead14521fb74bcf8ec28f8c78245dfb536b70/test/ruby/test_dir_m17n.rb#L72

I'd like to try it on the parameter handed to Rust and see if it does anything.

@danielpclark
Copy link
Owner

Now that this code base has been rewritten to ruru the issue lies in the implementation of RString: d-unsed/ruru#73

@danielpclark
Copy link
Owner

danielpclark commented Sep 12, 2017

This information is useful still for any inputs for Rust implemented methods. I'm closing this issue because I've removed the Rust implementation of blank? and put Ruby code in as I was unable to write any Rust code that was faster than str.strip.empty?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants