add sleep period before retrying the ssh connection #10776

Open
rgl opened this issue Apr 4, 2019 · 2 comments · May be fixed by #13047

Comments

@rgl
Contributor

rgl commented Apr 4, 2019

While using Vagrant 2.2.4, I'm trying to reboot an Ubuntu VM with:

config.vm.provision :shell, path: 'reboot.sh'

with reboot.sh being:

nohup bash -c "ps -eo pid,comm | awk '/sshd/{print \$1}' | xargs kill; sync; reboot"

While the VM is rebooting, the SSH communicator keeps (re)trying to connect, but each attempt fails because the VM refuses the connection while it's booting and not yet ready. The communicator then gives up too quickly.

Following the code, this turned out to be because the retry logic of the SSH communicator at

connection = retryable(tries: opts[:retries], on: SSH_RETRY_EXCEPTIONS) do
does not sleep between retries. This needs to be changed to pass the sleep argument, e.g.:

connection = retryable(tries: opts[:retries], on: SSH_RETRY_EXCEPTIONS, sleep: timeout) do
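To illustrate why the pause matters, here is a minimal, self-contained sketch of a retry loop that sleeps between attempts. It is a stand-in written for this thread, not Vagrant's actual retryable implementation: the helper name retry_with_sleep, its sleep_seconds parameter, and the simulated guest that refuses the first two attempts are all assumptions made for the example.

```ruby
# Illustrative stand-in only; NOT Vagrant's retryable helper.
# Retries the block up to `tries` times when one of the `on` exceptions
# is raised, pausing `sleep_seconds` between attempts.
def retry_with_sleep(tries:, on:, sleep_seconds: 0)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *on
    # Out of attempts: re-raise the last error to the caller.
    raise if attempts >= tries
    sleep(sleep_seconds) if sleep_seconds > 0
    retry
  end
end

# Simulated guest (assumption): sshd refuses the first two connections.
result = retry_with_sleep(tries: 5, on: [Errno::ECONNREFUSED], sleep_seconds: 0.01) do |attempt|
  raise Errno::ECONNREFUSED if attempt < 3
  "connected on attempt #{attempt}"
end

puts result
```

Without the sleep, all attempts can be used up within moments of the first refusal, which matches the "gives up too quickly" behavior described above.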

Maybe that timeout should trickle down from the Vagrantfile provision line (like opts[:retries]), e.g. with a sleep argument:

config.vm.provision :shell, path: 'reboot.sh', sleep: 120
@nmaludy

nmaludy commented Jan 5, 2023

I ran into this problem as well. In my case, the guests are RHEL VMs that become pingable before the SSH daemon starts. The behavior I see with vagrant up --debug is:

  • The guest becomes pingable
  • An SSH connection is initiated
  • SSH returns CONNECTIONREFUSED because the SSH daemon on the guest isn't up yet (taking a little while to boot)
  • Vagrant does NOT retry and simply exits with an error

I found the same section of code that @rgl did, manually set opts[:retries] to 5 (it seems to be set to 1 when the function is called), and added a sleep. With that change the SSH connection is retried and communication with the guest works great.

I'm thinking a good solution would be to expose something like:

config.ssh.retries = 5
config.ssh.retry_sleep_interval = 10

These options would allow the user to control the number of SSH retries and the sleep time between retries.
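As a rough sketch of how such settings might drive the connection loop: the snippet below is hypothetical and does not use Vagrant's config system. The SshRetrySettings struct and connect_with_settings helper are invented for illustration, retries is assumed to mean the total number of attempts, and the sleeper is injected so the example runs without actually waiting.

```ruby
# Hypothetical settings object. `retries` and `retry_sleep_interval` are the
# option names proposed in this thread, not existing Vagrant options.
SshRetrySettings = Struct.new(:retries, :retry_sleep_interval, keyword_init: true)

# Hypothetical helper: attempts the block up to `settings.retries` times,
# calling `sleeper` with the configured interval after each refused attempt.
def connect_with_settings(settings, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue Errno::ECONNREFUSED
    raise if attempt >= settings.retries
    sleeper.call(settings.retry_sleep_interval)
    retry
  end
end

settings = SshRetrySettings.new(retries: 5, retry_sleep_interval: 10)

# Record the requested waits instead of sleeping, to keep the demo fast.
waits = []
outcome = connect_with_settings(settings, sleeper: ->(seconds) { waits << seconds }) do |attempt|
  raise Errno::ECONNREFUSED if attempt < 3 # simulated: sshd not listening yet
  :connected
end

p outcome
p waits
```

With retries set to 1, as observed above, the first refusal would be re-raised immediately, which corresponds to the "does NOT retry and simply exits" step in the list.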

Thoughts?

@avoidik

avoidik commented Jan 28, 2024

Hello all, the very same issue, which I had also reported, is still relevant. If anyone else is affected, here is a workaround:

cd vagrant/embedded/gems/*/gems/vagrant-*/plugins/communicators/ssh
curl -fsSL https://github.com/hashicorp/vagrant/commit/424808388956c0d6acf0e91ca751fe9345f6e7f8.patch -o ssh.patch
patch -p4 -i ssh.patch
rm -f ssh.patch

The diff is associated with #12292.
