Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeout while running timeout tests #3652

Open
eregon opened this issue Aug 28, 2024 · 2 comments
Open

Timeout while running timeout tests #3652

eregon opened this issue Aug 28, 2024 · 2 comments

Comments

@eregon
Copy link
Member

eregon commented Aug 28, 2024

👋 Hi! I recently looked at the CI run on the upstream repo. The job truffleruby-head on ubuntu-latest took a while (>26 minutes) to do anything in the step that executes the tests. https://github.com/ruby/timeout/actions/runs/10595331126/job/29360878189

The step is...
bundle exec rake test

Got ideas on what's going on? The macos-latest seem to be done in 5s.

(If you wish to shunt this Issue comment to its own something, Discussion, Issue, comment thread in the ruby/timeout repo, please shunt away.)

Originally posted by @olleolleolle in #3426 (comment)
Ref: ruby/timeout#50 (comment)

@eregon
Copy link
Member Author

eregon commented Aug 28, 2024

Not sure, no, there is definitely something hanging there.
The only output we see is:

2024-08-28T10:58:52.7217665Z ##[group]Run bundle exec rake test
2024-08-28T10:58:52.7218348Z �[36;1mbundle exec rake test�[0m
2024-08-28T10:58:52.7393963Z shell: /usr/bin/bash -e {0}
2024-08-28T10:58:52.7394516Z ##[endgroup]
2024-08-28T12:47:11.7829453Z ##[error]The operation was canceled.
2024-08-28T12:47:11.7912328Z Post job cleanup.

i.e. no output at all.
Unfortunately GitHub Actions when they are cancelled or timeout seem to use SIGKILL so there not even an opportunity to flush any IO buffers (though on TruffleRuby stdout & stderr are unbuffered so at least that shouldn't be a problem), print a stacktrace, or anything like that.

It seems somewhat transient yet frequent, looking at https://github.com/ruby/timeout/actions
BTW it also hapenned on the 24.0.2 release in https://github.com/ruby/timeout/actions/runs/10594875253/job/29359440681

@eregon
Copy link
Member Author

eregon commented Aug 28, 2024

I ran it 10 times locally on Linux and it succeeded each time and took between 5 and 10 seconds.
So it seems hard to reproduce.

I think the next step would be to find or create a "soft timeout utility" for GitHub Actions.
I.e. some program that runs a given shell command with a given timeout and when the timeout is reached sends SIGQUIT (which shows Java stacktraces on TruffleRuby), waits a couple seconds, then send SIGALRM (which shows Ruby stacktraces on TruffleRuby), wait a couple seconds and then send SIGKILL.
It could be just a small Ruby script or in any other language.
FWIW mspec already does something like that in https://github.com/ruby/mspec/blob/5354a405e327bb144f00157b913aa544c3b77e4b/lib/mspec/runner/actions/timeout.rb#L110-L144

(Maybe this could also be available as a command-line option of TruffleRuby for convenience, although that would very likely need to create an extra Ruby Thread internally to handle this which doesn't seem ideal.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant