[rb] Handle graceful webdriver shutdown #14430

josegomezr · 2024-08-23T14:58:38Z

User description

Improve the shutdown sequence by separating a dead child pid (not running anymore) from a running child pid.

Description

When shutting down Selenium::WebDriver::ServiceManager#stop_server issues a /shutdown request against the webdriver server and the server exits cleanly, however the mechanism to assert if the child process is up or not cannot distinguish between busy or not found.

Selenium::WebDriver::ChildProcess#exited? returns false when the process is still alive because
Selenium::WebDriver::ChildProcess#waitpid2 returns nil.

However, by catching Errno::ECHILD and Errno::ESRCH and returning nil #waitpid2 masks a not running pid as "running" delaying the shutdown procedure when the server was already gone.

This patch moves the exception handling away from the Process wrappers so its closer to the decision points in
Selenium::WebDriver::ChildProcess#exited? and
Selenium::WebDriver::ChildProcess#stop.

Addresses a similar problem as in #14032. With this patch the shutdown sequence looks like:

Send a /shutdown signal to the server [if supported]
- Check for the server pid
  - If it cannot be found (Errno::ECHILD, Errno::ESRCH), then assume the server has shutdown.
  - else: wait and retry in ~100 ms.
- repeat for 20 seconds (Selenium::WebDriver::ServiceManager::STOP_TIMEOUT).
Try to send a TERM signal to the server.
- Check for the server pid
  - If it cannot be found (Errno::ECHILD, Errno::ESRCH), then assume the server has shutdown.
  - else: wait and retry in ~100 ms.
- repeat for 3 seconds (default value of timeout inSelenium::WebDriver::ChildProcess#stop ).
Try to send a KILL signal to the server.
- Check for the server pid
  - If it cannot be found (Errno::ECHILD, Errno::ESRCH), then assume the server has shutdown.
  - else: wait and retry in ~100 ms.

Motivation and Context

Noticed that the tests will end, but it'll be hung for 10~20 seconds doing nothing. This patch reduces my test run time from ~32s to ~13s consistently.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist

I have read the contributing document.
My change requires a change to the documentation.
I have updated the documentation accordingly.
I have added tests to cover my changes.
All new and existing tests passed.

Logs of a production application

with selenium-webdrivers 4.23.0

time RAILS_ENV=test bex teaspoon
# [...]
2024-08-23 16:16:10 DEBUG Selenium [:process] Starting process: ["/usr/bin/chromedriver", "--port=9515"] with {[:out, :err]=>#<IO:<STDERR>>, :pgroup=>true} 
2024-08-23 16:16:10 DEBUG Selenium [:process]   -> pid: 27757 
# [...]
........................................................................................................................................

Finished in 1.78500 seconds
136 examples, 0 failures
2024-08-23 16:16:18 DEBUG Selenium [:command] -> DELETE session/60ed9bb863325bb8238f2cc86bee81c6 
2024-08-23 16:16:19 DEBUG Selenium [:header]    <<<  {"content-length"=>["14"], "content-type"=>["application/json; charset=utf-8"], "cache-control"=>["no-cache"]} 
2024-08-23 16:16:19 DEBUG Selenium [:command] <- {"value":null} 
2024-08-23 16:16:19 DEBUG Selenium [:process] Checking if 27757 is exited: 
2024-08-23 16:16:19 DEBUG Selenium [:process] Polling 20 seconds for exit of 27757 
2024-08-23 16:16:19 DEBUG Selenium [:process] Checking if 27757 is exited: 
2024-08-23 16:16:19 DEBUG Selenium [:process] Checking if 27757 is exited: 


real    0m33.907s
user    0m6.563s
sys     0m1.045s

with this patch:

2024-08-23 16:17:55 DEBUG Selenium [:process] Starting process: ["/usr/bin/chromedriver", "--port=9515"] with {[:out, :err]=>#<IO:<STDERR>>, :pgroup=>true} 
2024-08-23 16:17:55 DEBUG Selenium [:process]   -> pid: 28825 

........................................................................................................................................

Finished in 1.54100 seconds
136 examples, 0 failures
2024-08-23 16:18:03 DEBUG Selenium [:command] -> DELETE session/93d5a7d278d25585438a62a60a0625ca 
2024-08-23 16:18:03 DEBUG Selenium [:header]    <<<  {"content-length"=>["14"], "content-type"=>["application/json; charset=utf-8"], "cache-control"=>["no-cache"]} 
2024-08-23 16:18:03 DEBUG Selenium [:command] <- {"value":null} 
2024-08-23 16:18:03 DEBUG Selenium [:process] Checking if 28825 is exited: 
2024-08-23 16:18:03 DEBUG Selenium [:driver_service] Sending shutdown request to server 
2024-08-23 16:18:03 DEBUG Selenium [:process] Polling 20 seconds for exit of 28825 
2024-08-23 16:18:03 DEBUG Selenium [:process] Checking if 28825 is exited: 
2024-08-23 16:18:03 DEBUG Selenium [:process] Checking if 28825 is exited: 
2024-08-23 16:18:03 DEBUG Selenium [:process]   -> process: 28825 already finished 
2024-08-23 16:18:03 DEBUG Selenium [:process] Checking if 28825 is exited: 
2024-08-23 16:18:03 DEBUG Selenium [:process]   -> process: 28825 already finished 
2024-08-23 16:18:03 DEBUG Selenium [:process] Checking if 28825 is exited: 
2024-08-23 16:18:03 DEBUG Selenium [:process]   -> process: 28825 already finished 

2024-08-23 16:18:03 DEBUG Selenium [:process] Checking if 28825 is exited: 
2024-08-23 16:18:03 DEBUG Selenium [:process]   -> process: 28825 already finished 
2024-08-23 16:18:03 DEBUG Selenium [:process] Checking if 28825 is exited: 
2024-08-23 16:18:03 DEBUG Selenium [:process]   -> process: 28825 already finished 

real    0m12.858s
user    0m6.254s
sys     0m1.049s

PR Type

enhancement, tests

Description

Refactored the process termination logic in ChildProcess to improve clarity and error handling.
Introduced a new method terminate_and_wait_else_kill to encapsulate termination logic.
Enhanced logging for process termination and server shutdown requests to aid in debugging.
Improved handling of Errno::ECHILD and Errno::ESRCH exceptions to correctly identify exited processes.

Changes walkthrough 📝

Relevant files

Enhancement

child_process.rb `Refactor and enhance process termination and exit handling` rb/lib/selenium/webdriver/common/child_process.rb Refactored process termination logic into a new method `terminate_and_wait_else_kill`. Added debug logging for process termination and exit checks. Improved error handling for process exit checks with specific exceptions.	+20/-14
service_manager.rb `Add debug logging for server shutdown requests` rb/lib/selenium/webdriver/common/service_manager.rb Added debug logging for server shutdown requests.	+1/-0

💡 PR-Agent usage:
Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

When shutting down `Selenium::WebDriver::ServiceManager#stop_server` issues a `/shutdown` request against the webdriver server and the server exits cleanly, however the mechanism to assert if the child process is up or not cannot distinguish between busy or not found. `Selenium::WebDriver::ChildProcess#exited?` returns `false` when the process is still alive because `Selenium::WebDriver::ChildProcess#waitpid2` returns `nil`. However, by catching `Errno::ECHILD` and `Errno::ESRCH` and returning `nil` `#waitpid2` masks a not running pid as "running" delaying the shutdown procedure when the server was already gone. This patch moves the exception handling away from the `Process` wrappers so its closer to the decision points in `Selenium::WebDriver::ChildProcess#exited?` and `Selenium::WebDriver::ChildProcess#stop`. Addresses a similar inconsistency in SeleniumHQ#14032

CLAassistant · 2024-08-23T14:58:45Z

All committers have signed the CLA.

codiumai-pr-agent-pro · 2024-08-23T14:59:03Z

PR Reviewer Guide 🔍

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Key issues to review Error Handling The new error handling in the `stop` method might mask other potential errors. Consider adding a more specific rescue clause or logging the error details. Code Duplication The `terminate_and_wait_else_kill` method contains logic that was previously in the `stop` method. Consider if this abstraction is necessary or if it adds unnecessary complexity.

codiumai-pr-agent-pro · 2024-08-23T14:59:25Z

PR Code Suggestions ✨

Category	Suggestion	Score
Error handling	Add error handling for the server shutdown HTTP request In the `stop_server` method, consider adding error handling for the HTTP request. This can help manage potential network issues or server errors during shutdown. rb/lib/selenium/webdriver/common/service_manager.rb [109-115] def stop_server connect_to_server do \|http\| headers = WebDriver::Remote::Http::Common::DEFAULT_HEADERS.dup WebDriver.logger.debug("Sending shutdown request to server", id: :driver_service) - http.get('/shutdown', headers) + begin + http.get('/shutdown', headers) + rescue StandardError => e + WebDriver.logger.warn("Error during server shutdown: #{e.message}", id: :driver_service) + end end end Apply this suggestion Suggestion importance[1-10]: 8 Why: Implementing error handling for the HTTP request in `stop_server` is a significant improvement as it addresses potential network issues or server errors, enhancing the robustness of the code.	8
Enhancement	Enhance error logging by including the specific exception type In the `exited?` method, consider logging the specific exception type when catching `Errno::ECHILD` or `Errno::ESRCH`. This can provide more detailed debugging information. rb/lib/selenium/webdriver/common/child_process.rb [86-88] -rescue Errno::ECHILD, Errno::ESRCH - WebDriver.logger.debug(" -> process: #{@pid} already finished", id: :process) +rescue Errno::ECHILD, Errno::ESRCH => e + WebDriver.logger.debug(" -> process: #{@pid} already finished (#{e.class.name})", id: :process) return true Apply this suggestion Suggestion importance[1-10]: 7 Why: Adding the specific exception type to the log message enhances debugging and provides more context, which is beneficial for maintenance and troubleshooting.	7
Enhancement	Improve method name to better reflect its functionality Consider using a more descriptive method name for `terminate_and_wait_else_kill`. A name like `graceful_shutdown_with_fallback` might better convey the method's behavior. rb/lib/selenium/webdriver/common/child_process.rb [105-116] -def terminate_and_wait_else_kill(timeout) +def graceful_shutdown_with_fallback(timeout) WebDriver.logger.debug("Sending TERM to process: #{@pid}", id: :process) terminate(@pid) poll_for_exit(timeout) WebDriver.logger.debug(" -> stopped #{@pid}", id: :process) rescue TimeoutError, Errno::EINVAL WebDriver.logger.debug(" -> sending KILL to process: #{@pid}", id: :process) kill(@pid) wait WebDriver.logger.debug(" -> killed #{@pid}", id: :process) end Apply this suggestion Suggestion importance[1-10]: 6 Why: The suggestion to rename the method to `graceful_shutdown_with_fallback` improves code readability and better describes the method's functionality, but it is not crucial for functionality.	6

josegomezr added 2 commits August 23, 2024 16:56

[rb] Show the shutdown request in the debug logs

3236c31

codiumai-pr-agent-pro bot added enhancement tests Review effort [1-5]: 3 labels Aug 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[rb] Handle graceful webdriver shutdown #14430

[rb] Handle graceful webdriver shutdown #14430

josegomezr commented Aug 23, 2024 •

edited by codiumai-pr-agent-pro bot

Loading

CLAassistant commented Aug 23, 2024 •

edited

Loading

codiumai-pr-agent-pro bot commented Aug 23, 2024

codiumai-pr-agent-pro bot commented Aug 23, 2024

[rb] Handle graceful webdriver shutdown #14430

Are you sure you want to change the base?

[rb] Handle graceful webdriver shutdown #14430

Conversation

josegomezr commented Aug 23, 2024 • edited by codiumai-pr-agent-pro bot Loading

User description

Description

Motivation and Context

Types of changes

Checklist

Logs of a production application

PR Type

Description

Changes walkthrough 📝

CLAassistant commented Aug 23, 2024 • edited Loading

codiumai-pr-agent-pro bot commented Aug 23, 2024

PR Reviewer Guide 🔍

codiumai-pr-agent-pro bot commented Aug 23, 2024

PR Code Suggestions ✨

josegomezr commented Aug 23, 2024 •

edited by codiumai-pr-agent-pro bot

Loading

CLAassistant commented Aug 23, 2024 •

edited

Loading