Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange timeout bug. #265

Closed
jsaak opened this issue Aug 9, 2023 · 10 comments
Closed

Strange timeout bug. #265

jsaak opened this issue Aug 9, 2023 · 10 comments

Comments

@jsaak
Copy link

jsaak commented Aug 9, 2023

I have a class which tries to connect on TCP.
When it fails it tries again.
If i try again with TCPSocket.new() before it time outs, i get a strange error message, which does not make any sense to me:

require 'async'

Fiber.set_scheduler(Async::Scheduler.new)

class Mqtt3NormalExitException < Exception
end

class Mqtt3
  def initialize(host: 'localhost', port: 1883)
    @host = host
    @port = port
    @reconnect = true
  end

  def run
    Fiber.schedule do
      @fiber_main = Fiber.current
      puts "create new fiber #{@fiber_main.inspect}"
      counter = 0
      #counter = 1
      loop do
        begin
          puts "TCPSocket.new()"
          tcp_socket = TCPSocket.new(@host, @port, connect_timeout: 0.9)
          ret = tcp_socket
        rescue => e
          ret = e
        end

        if ret.is_a? (Exception)
          puts("tcp connect error: #{ret.inspect}")
        else
          puts("connected")
          counter = 0
        end

        if @reconnect
          if counter > 0
            begin
              puts "sleeping for #{counter} sec"
              sleep counter
            rescue Mqtt3NormalExitException => e
              @reconnect = false
              break
            end
          end

          if counter == 0
            counter = 1
          else
            counter *= 2
            counter = 300 if counter > 300
          end
        end
        break unless @reconnect
      end
    end
  end

  def stop
    if Fiber.current == @fiber_main
      raise Mqtt3NormalExitException
    else
      puts "sending Mqtt3NormalExitException to #{@fiber_main.inspect}"
      Fiber.scheduler.raise(@fiber_main,Mqtt3NormalExitException)
    end
  end
end

Fiber.schedule do
  loop do
    mqtt = nil

    Fiber.schedule do
      mqtt = Mqtt3.new(host: '1.1.1.1', port: 1111)
      mqtt.run
    end

    Fiber.schedule do
      sleep 3
      mqtt.stop
    end

    sleep 5
  end
end

produces:

$ ruby error.rb
create new fiber #<Fiber:0x00007f9b05aa55a8 /home/jsaak/.gem/ruby/3.2.2/gems/async-2.6.3/lib/async/task.rb:326 (resumed)>
TCPSocket.new()
tcp connect error: #<IO::TimeoutError: connect(2) for "1.1.1.1" port 1111>
TCPSocket.new()
tcp connect error: #<IO::TimeoutError: connect(2) for "1.1.1.1" port 1111>
sleeping for 1 sec
TCPSocket.new()
sending Mqtt3NormalExitException to #<Fiber:0x00007f9b05aa55a8 /home/jsaak/.gem/ruby/3.2.2/gems/async-2.6.3/lib/async/task.rb:326 (suspended)>
/home/jsaak/.gem/ruby/3.2.2/gems/async-2.6.3/lib/async/scheduler.rb:179:in `io_wait': Mqtt3NormalExitException (Mqtt3NormalExitException)
	from /home/jsaak/.gem/ruby/3.2.2/gems/async-2.6.3/lib/async/scheduler.rb:179:in `io_wait'
	from error.rb:23:in `initialize'
	from error.rb:23:in `new'
	from error.rb:23:in `block (2 levels) in run'
	from error.rb:20:in `loop'
	from error.rb:20:in `block in run'
	from /home/jsaak/.gem/ruby/3.2.2/gems/async-2.6.3/lib/async/task.rb:160:in `block in run'
	from /home/jsaak/.gem/ruby/3.2.2/gems/async-2.6.3/lib/async/task.rb:330:in `block in schedule'

$ ruby -v
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]

If i set counter to 1 (sleeping at least 1 sec before retrying, then all is good)

@ioquatix
Copy link
Member

ioquatix commented Aug 9, 2023

Thanks, if possible, are you able to make a MVP failing test case as a PR? This will help me greatly.

@jsaak
Copy link
Author

jsaak commented Aug 9, 2023

I am confused, here is a shorter version:

require 'async'

Fiber.set_scheduler(Async::Scheduler.new)

class MyException < Exception
end

@fiber = Fiber.schedule do
  begin
    sleep 1
  rescue => e
    puts e.inspect
  end
end

Fiber.schedule do
  sleep 0.1
  Fiber.scheduler.raise(@fiber, MyException)
end

Am I doing something wrong?

@kaorihinata
Copy link

The output of the above snippet is:

/Users/thomas/.sandbox/lib/ruby/gems/3.2.0/gems/async-2.6.3/lib/async/scheduler.rb:132:in `transfer': MyException (MyException)
	from /Users/thomas/.sandbox/lib/ruby/gems/3.2.0/gems/async-2.6.3/lib/async/scheduler.rb:132:in `block'
	from /Users/thomas/.sandbox/lib/ruby/gems/3.2.0/gems/async-2.6.3/lib/async/scheduler.rb:154:in `kernel_sleep'
	from -:10:in `sleep'
	from -:10:in `block in <main>'
	from /Users/thomas/.sandbox/lib/ruby/gems/3.2.0/gems/async-2.6.3/lib/async/task.rb:160:in `block in run'
	from /Users/thomas/.sandbox/lib/ruby/gems/3.2.0/gems/async-2.6.3/lib/async/task.rb:330:in `block in schedule'

I can create a PR for this and @ all involved if you'd like. It looks like Fiber.scheduler.raise(@fiber, MyException) passes down to Fiber#raise on @fiber, so you could also probably shorten that line to simply @fiber.raise(MyException) as I believe Fiber#raise is the intended way to raise a Fiber level exception. I've confirmed that the output is the same.

Regarding the issue itself, while the backtrace notes where the Fiber is currently blocking, I'm not sure if an externally raised exception is bound by that scope, so I'm not sure if you're supposed to be able to catch it this way.

@jsaak
Copy link
Author

jsaak commented Aug 9, 2023

To me it seems like a bug. If there is an exception, i need to be able to catch it.
But I do not mind using something different. Fiber.scheduler.raise is a hack anyway.

My goal is to be able to stop a Fiber from another Fiber.
(If possible using the ruby built in methods, to be compatible with other Fiber scheduler implementations)
And I still do not know how to do it. See #263

@kaorihinata
Copy link

kaorihinata commented Aug 9, 2023

It gets weirder, oddly enough. Your example works just fine the moment you remove the custom exception class. It turns out the fact that the caller is a Fiber might not even be relevant.

This cannot be caught:

require 'async'

Fiber.set_scheduler(Async::Scheduler.new)

class CustomException < Exception
end

fibera = Fiber.schedule do
  begin
    sleep 5
  rescue => e
    puts("exception caught")
  end
end

Fiber.scheduler.raise(fibera, CustomException)

This can, and outputs "exception caught":

require 'async'

Fiber.set_scheduler(Async::Scheduler.new)

fibera = Fiber.schedule do
  begin
    sleep 5
  rescue => e
    puts("exception caught")
  end
end

Fiber.scheduler.raise(fibera, RuntimeError)

I'm using Ruby 3.2. I can create the PR if you'd like. Asking because I'm not the OP.

@noteflakes
Copy link

Am I doing something wrong?

A rescue without an explicit exception class will only catch exceptions that are descendants of StandardError. In your example, MyException is a descendant of Exception, which is the lowest-level exception class. To be able to rescue your exception, you can do any of the following:

  • rescue MyException => e - which will rescue only MyException
  • rescue Exception => e - which will rescue all exceptions
  • define the custom exception as class MyException < StandardError

@ioquatix
Copy link
Member

Sorry my family has been sick I have not had time to review this issue.

@kaorihinata
Copy link

@noteflakes

Yep, that was it. Not a bug at all then, just rescue behaving differently than I expected. Tyvm.

@jsaak
Copy link
Author

jsaak commented Aug 15, 2023

Thanks for the solution!
Usually i do not find such surprises using ruby.
Do you know why ruby behaves this way?

@jsaak jsaak closed this as completed Aug 15, 2023
@noteflakes
Copy link

Do you know why ruby behaves this way?

This design allows lower-level exceptions, such as Interrupt and SystemExit (which are raised on SIGINT and SIGTERM signals respectively) to propagate to the top-level and be handled centrally by either custom app code or Ruby's default handlers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants