`system` hangs if file does not exist #351

cretz · 2024-11-04T17:37:23Z

If you run system inside Async with a non-existent executable on Linux with Ruby 3.3 it hangs. This behavior is confirmed in 2.17.0 and 2.18.0 (I didn't check any earlier than that).

This synchronous code works:

puts "Running on: #{RbConfig::CONFIG['host_os']}"
res = system 'echo', 'hello'
puts "Result 1: #{res}"
res = system 'does-not-exist', 'hello'
puts "Result 2: #{res}"

And outputs:

Running on: linux
hello
Result 1: true
Result 2:

But this code does not:

require 'async'

Async do
  puts "Running on: #{RbConfig::CONFIG['host_os']}"
  res = system 'echo', 'hello'
  puts "Result 1: #{res}"
  res = system 'does-not-exist', 'hello'
  puts "Result 2: #{res}"
end

This hangs on the second system call. This was discovered after a deep dive of why Minitest assert_equal was hanging. It turns out, Minitest uses diff on the system when reporting an assert_equal failure, but first checks for gdiff which we don't have, ref: https://github.com/minitest/minitest/blob/e4417c5b13ab917a60c218700041152136bc07c8/lib/minitest/assertions.rb#L33. So we have to Minitest::Assertions.diff = 'diff' to get Async-based tests to not break here.

(sorry if this is already reported, I could not find an issue)

The text was updated successfully, but these errors were encountered:

ioquatix · 2024-11-04T19:53:39Z

Nice find. I didn't know about this! haha.

I'll investigate the issue today.

ioquatix · 2024-11-05T00:12:17Z

I could not reproduce from the given example.

samuel@aiko ~/D/s/async (main)> ruby --version
ruby 3.3.4 (2024-07-09 revision be1089c8ec) [x86_64-linux]
samuel@aiko ~/D/s/async (main)> cat ./test.rb
#!/usr/bin/env ruby

require "async"

Async do
	puts "Running on: #{RbConfig::CONFIG['host_os']}"
	res = system 'echo', 'hello'
	puts "Result 1: #{res.inspect}"
	res = system 'does-not-exist', 'hello'
	puts "Result 2: #{res.inspect}"
end
samuel@aiko ~/D/s/async (main)> bundle exec ./test.rb
Running on: linux
hello
Result 1: true
Result 2: nil

Can you confirm I've got the correct reproduction script?

ioquatix · 2024-11-05T00:24:04Z

I tried to reproduce the original error but couldn't:

#!/usr/bin/env ruby

require "minitest/autorun"
require "async"

class Test < Minitest::Test
	BIG_TEXT_A = "The quick brown fox jumps over the lazy dog.\n" * 10
	BIG_TEXT_B = BIG_TEXT_A.dup.sub("quick", "slow")
	
	def test_diff
		Async do |task|
			assert_equal(BIG_TEXT_A, BIG_TEXT_B)
		end
	end
end

I added some instrumentation to the scheduler process_wait hook:

		# Wait for the specified process ID to exit.
		# @parameter pid [Integer] The process ID to wait for.
		# @parameter flags [Integer] A bit-mask of flags suitable for `Process::Status.wait`.
		# @returns [Process::Status] A process status instance.
		# @asynchronous May be non-blocking..
		def process_wait(pid, flags)
			return @selector.process_wait(Fiber.current, pid, flags).tap{|result|
				Fiber.blocking{$stderr.puts "process_wait(#{pid}, #{flags}) -> #{result.inspect}"}
			}
		end

It appears to work okay:

> bundle exec ./test.rb
Run options: --seed 20496

# Running:

process_wait(927854, 0) -> #<Process::Status: pid 927854 exit 127>
process_wait(927855, 0) -> #<Process::Status: pid 927855 exit 0>
process_wait(927856, 0) -> #<Process::Status: pid 927856 exit 1>
F

Finished in 0.002793s, 358.0683 runs/s, 358.0683 assertions/s.

  1) Failure:
Test#test_diff [test.rb:12]:
--- expected
+++ actual
@@ -1,4 +1,4 @@
-"The quick brown fox jumps over the lazy dog.
+"The slow brown fox jumps over the lazy dog.
 The quick brown fox jumps over the lazy dog.
 The quick brown fox jumps over the lazy dog.
 The quick brown fox jumps over the lazy dog.


1 runs, 1 assertions, 1 failures, 0 errors, 0 skips

ioquatix · 2024-11-05T00:32:13Z

Is it possible something in your test suite is monkey patching Kernel::system?

See <#351> for context.

cretz · 2024-11-05T13:39:16Z

Interesting, this is breaking for me:

cretz@DESKTOP:~/work/temp$ cat ./test.rb
require 'async'

Async do
  puts "Running on: #{RbConfig::CONFIG['host_os']}"
  res = system 'echo', 'hello'
  puts "Result 1: #{res}"
  res = system 'does-not-exist', 'hello'
  puts "Result 2: #{res}"
end
cretz@DESKTOP:~/work/temp$ bundle exec ruby --version
ruby 3.3.3 (2024-06-12 revision f1c7b6f435) [x86_64-linux]
cretz@DESKTOP:~/work/temp$ bundle exec ruby ./test.rb
Running on: linux
hello
Result 1: true

It hangs right there. I am not doing any monkey patching. This was also hanging for me inside of a GitHub action for Ubuntu and Ruby 3.3 (but not 3.1 I don't think). I will do some more investigation, possibly including a simple repository that replicates the issue with a simple GitHub action, and get back. (may be a bit, I am traveling)

olleolleolle · 2024-11-07T18:48:57Z

I was also able to make it hang on that.

# issue-351.rb
require 'bundler/inline'

gemfile do
  source 'https://rubygems.org'
  gem 'async'
end

Async do
  puts "Running on: #{RbConfig::CONFIG['host_os']}"
  res = system 'echo', 'hello'
  puts "Result 1: #{res}"
  res = system 'does-not-exist', 'hello'
  puts "Result 2: #{res}"
end

My output was: note where I hit Ctrl-C and it got output, and then the first word Result 2:

$ ruby issue-351.rb
Running on: linux
hello
Result 1: true
^CResult 2:
/home/olle/projects/slask/tmp/devbox/gems/gems/async-2.18.0/lib/async/scheduler.rb:317:in `select': Interrupt
        from /home/olle/projects/slask/tmp/devbox/gems/gems/async-2.18.0/lib/async/scheduler.rb:317:in `run_once!'
        from /home/olle/projects/slask/tmp/devbox/gems/gems/async-2.18.0/lib/async/scheduler.rb:353:in `run_once'
        from /home/olle/projects/slask/tmp/devbox/gems/gems/async-2.18.0/lib/async/scheduler.rb:408:in `block in run'
        from /home/olle/projects/slask/tmp/devbox/gems/gems/async-2.18.0/lib/async/scheduler.rb:386:in `block in run_loop'
        from /home/olle/projects/slask/tmp/devbox/gems/gems/async-2.18.0/lib/async/scheduler.rb:383:in `handle_interrupt'
        from /home/olle/projects/slask/tmp/devbox/gems/gems/async-2.18.0/lib/async/scheduler.rb:383:in `run_loop'
        from /home/olle/projects/slask/tmp/devbox/gems/gems/async-2.18.0/lib/async/scheduler.rb:407:in `run'
        from /home/olle/projects/slask/tmp/devbox/gems/gems/async-2.18.0/lib/kernel/async.rb:34:in `Async'
        from issue-351.rb:8:in `<main>'

See Bundler inline docs.

ioquatix · 2024-11-08T00:14:35Z

lol, I cannot reproduce, I tried IO_EVENT_SELECTOR=Select too.

Okay, can you please tell me:

uname -a
ruby -v
Your shell and version of shell.

ioquatix · 2024-11-08T00:16:32Z

For me it's:

samuel@aiko ~/D/tmp> uname -a
Linux aiko 6.11.6-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 01 Nov 2024 03:30:41 +0000 x86_64 GNU/Linux
samuel@aiko ~/D/tmp> ruby -v
ruby 3.3.3 (2024-06-12 revision f1c7b6f435) [x86_64-linux]
samuel@aiko ~/D/tmp> $SHELL -v
fish, version 3.7.1

ioquatix · 2024-11-08T01:29:47Z

In Async v2.19, I've introduced more debug output: https://socketry.github.io/async/releases/index#async::scheduler-debugging

Can one of you run the hanging example with CONSOLE_LEVEL=debug environment variable set, and show me the full output?

If it's not too verbose, it might also be useful to see the output generated by RUBYOPT=-d ....

Thanks.

ioquatix · 2024-11-11T07:48:09Z

This appears to be a problem in Ruby process.c which can fork with an active scheduler without cleaning up correctly. I think at best we can fix this in Ruby 3.4 and maybe backport the fix. I'll need to investigate further.

ioquatix · 2024-11-11T08:13:35Z

See ruby/ruby#12052 for a possible fix.

cretz added a commit to cretz/temporal-sdk-ruby that referenced this issue Nov 4, 2024

Fix issue with async and minitest, see socketry/async#351

5bca3ca

ioquatix added a commit that referenced this issue Nov 5, 2024

Add test for non-existant executable.

9639048

See <#351> for context.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`system` hangs if file does not exist #351

`system` hangs if file does not exist #351

cretz commented Nov 4, 2024 •

edited

Loading

ioquatix commented Nov 4, 2024

ioquatix commented Nov 5, 2024

ioquatix commented Nov 5, 2024

ioquatix commented Nov 5, 2024

cretz commented Nov 5, 2024 •

edited

Loading

olleolleolle commented Nov 7, 2024 •

edited

Loading

ioquatix commented Nov 8, 2024 •

edited

Loading

ioquatix commented Nov 8, 2024

ioquatix commented Nov 8, 2024 •

edited

Loading

ioquatix commented Nov 11, 2024

ioquatix commented Nov 11, 2024

system hangs if file does not exist #351

system hangs if file does not exist #351

Comments

cretz commented Nov 4, 2024 • edited Loading

ioquatix commented Nov 4, 2024

ioquatix commented Nov 5, 2024

ioquatix commented Nov 5, 2024

ioquatix commented Nov 5, 2024

cretz commented Nov 5, 2024 • edited Loading

olleolleolle commented Nov 7, 2024 • edited Loading

ioquatix commented Nov 8, 2024 • edited Loading

ioquatix commented Nov 8, 2024

ioquatix commented Nov 8, 2024 • edited Loading

ioquatix commented Nov 11, 2024

ioquatix commented Nov 11, 2024

`system` hangs if file does not exist #351

`system` hangs if file does not exist #351

cretz commented Nov 4, 2024 •

edited

Loading

cretz commented Nov 5, 2024 •

edited

Loading

olleolleolle commented Nov 7, 2024 •

edited

Loading

ioquatix commented Nov 8, 2024 •

edited

Loading

ioquatix commented Nov 8, 2024 •

edited

Loading