-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(sequencer-relayer): provide a shutdown controller #889
feat(sequencer-relayer): provide a shutdown controller #889
Conversation
timeout(Duration::from_secs(30), sequencer_relayer) | ||
.await | ||
.unwrap_or_else(|_| { | ||
panic!("timed out waiting for sequencer relayer to shut down"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally avoid putting panicking code in drop impls, but I think it's probably ok here since this is a test-only object, and having a test process abort seems like a better option than just logging here since the log message could easily be missed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this panic here makes sense as that means the shutdown logic failed, which we'd otherwise not catch.
There is also std::thread::panicking
to check if the thread is ... panicking. You could issue a debug!
in that case so it doesn't re-panick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, that's what I was meaning in my comment. I think it's better to not check for std::thread::panicking
here since all we'd win is avoiding the test process aborting. But if the thread is already panicking and all we do here is log an error, then I think it'll be easy to miss that error.
@@ -305,6 +326,12 @@ pub struct TestSequencerRelayerConfig { | |||
|
|||
impl TestSequencerRelayerConfig { | |||
pub async fn spawn_relayer(self) -> TestSequencerRelayer { | |||
assert_ne!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need the tokio runtime to not be of the "current thread" flavour so as to allow the TestSequencerRelayer::drop
to not hang at the futures::executor::block_on
call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting. The wiremock mock servers are fine running on current_thread
. Do you know where this might be deadlocking? I wonder if we should try and address this problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's not a problem with code currently in the main
branch. It's the addition in this PR of the call to futures::executor::block_on
in the TestSequencerRelayer::drop
.
The async function passed to block_on
can't be cancelled or paused, and I believe that's stopping the relayer making progress inside the main select!
macro as the single tokio thread is blocked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like explicitly handling the shutdowns very much so that we can test this on every test.
I think this is almost ready to merge, but I would like to explore whether we can both simplify the ShutdownHandle
(by moving the signal trap into main.rs
and relying on a oneshot
channel or a CancellationToken
/DropGuard
) and making it more explicit (by providing an ShutdownHandle::shutdown
method).
timeout(Duration::from_secs(30), sequencer_relayer) | ||
.await | ||
.unwrap_or_else(|_| { | ||
panic!("timed out waiting for sequencer relayer to shut down"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this panic here makes sense as that means the shutdown logic failed, which we'd otherwise not catch.
There is also std::thread::panicking
to check if the thread is ... panicking. You could issue a debug!
in that case so it doesn't re-panick.
@@ -305,6 +326,12 @@ pub struct TestSequencerRelayerConfig { | |||
|
|||
impl TestSequencerRelayerConfig { | |||
pub async fn spawn_relayer(self) -> TestSequencerRelayer { | |||
assert_ne!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting. The wiremock mock servers are fine running on current_thread
. Do you know where this might be deadlocking? I wonder if we should try and address this problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is neat, thank you!
…#919) ## Summary Added the `--all-targets` flag to `cargo hack check` to ensure tests can also be built outside the workspace. ## Background While implementing #889 it was found that sequencer-relayer tests can only be run from the repository root but not from the crate directory itself. The CI job that is supposed to ensure this failed because it was not running with the right flags. ## Changes - Added `--all-targets` flag to `cargo hack check` in the rust test github workflow. - ## Related Issues Closes #893
## Summary Conductor now respects shutdown signals it receives during init. ## Background Conductor's task ignored shutdowns while still initializing. This meant that Conductor would hang for up to 30 seconds. ## Changes - refactor conductor's constituent long-running tasks to separate initialization and running - listen for the shutdown signal in all of conductor's tasks ## Testing Run conductor with endpoints that hang indefinitely and sending it SIGTERM. Observe that conductor shuts down quickly. The main operation of conductor is unaffected on the happy path: all blackbox tests run to completion. A proper test for the shutdown logic will be implemented in a follow-up refactor similar to #889
Summary
Adds a shutdown handle for the sequencer-relayer.
Background
We want the ability to invoke the shutdown sequence from
main
and from tests.Changes
A new RAII object
ShutdownHandle
has been added. This cancels the wrappedCancellationToken
when theShutdownHandle
is dropped or when itsshutdown
method is called.SIGTERM
handling has been moved to main.rs; on receiving aSIGTERM
,shutdown
is called on the shutdown handle associated with the relayer.The relayer's main
select!
loop was simplified to only contain the two tasks.Testing
The
TestSequencerRelayer
was updated to make use of the new shutdown controller. Also manually tested by sending aSIGTERM
to a running sequencer-relayer.Related Issues
Closes #882