Delay the crash system command
One of the HanaSR tests crashes a cluster node running HANA.
The crash command is executed through an ssh channel. The problem is that, as soon as
the system crashes, the ssh connection is interrupted, leaving the ssh client blocked.
The idea is to compose the remotely executed command from a sleep followed by the crash,
and to run both in the background. This gives the ssh client time to close the
session before the crash happens.
Remove the timeout=0 behavior and stop forwarding the whole content of args
to run_ssh_command.
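
The "sleep, then act, all in the background" idea can be tried in isolation. The following is a minimal sketch, not the command used by the test: a harmless stand-in (creating a marker file) replaces `echo b > /proc/sysrq-trigger`, and a local background job replaces `sudo -b` run over ssh. The marker path and the delay values are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical marker file; it stands in for the effect of the crash.
marker="${TMPDIR:-/tmp}/delayed_action_$$"
rm -f "$marker"

# Compose "sleep, then act" as one command and run it in the background.
# Output is redirected so the caller does not stay attached to the job.
sh -c "sleep 2; touch '$marker'" >/dev/null 2>&1 &

# The caller gets control back at once: the action has not happened yet.
if [ -e "$marker" ]; then before=present; else before=absent; fi

# Wait longer than the delay, then check again.
sleep 4
if [ -e "$marker" ]; then after=present; else after=absent; fi

echo "before=$before after=$after"
rm -f "$marker"
```

The window between the two checks is what lets the ssh client close its session before the node goes down.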
mpagot committed Sep 13, 2024
1 parent 16fcc66 commit 2286c57
Showing 1 changed file with 40 additions and 12 deletions.
52 changes: 40 additions & 12 deletions lib/sles4sap_publiccloud.pm
@@ -354,9 +354,14 @@ sub stop_hana {
     $args{method} //= 'stop';
     my $timeout = bmwqemu::scale_timeout($args{timeout} // 300);
     my %commands = (
-        stop => "HDB stop",
-        kill => "HDB kill -x",
-        crash => "echo b > /proc/sysrq-trigger &"
+        stop => 'HDB stop',
+        kill => 'HDB kill -x',
+        # sudo -b runs the command in the background
+        # echo b > /proc/sysrq-trigger crashes the remote node
+        # sleep 5 gives sudo time to put the command in the background and
+        # gives ssh time to return, both before the crash is triggered
+        # This also works in conjunction with the ssh -fn arguments
+        crash => 'sudo -b sh -c "sleep 5; echo b > /proc/sysrq-trigger"'
     );
     croak("HANA stop method '$args{method}' unknown.") unless $commands{$args{method}};

@@ -368,17 +373,40 @@ sub stop_hana {
     record_info("Stopping HANA", "CMD:$cmd");
     if ($args{method} eq "crash") {
         # Crash needs to be executed as root and wait for host reboot
+
+        # Ensure the remote node is in a normal state before triggering the crash
         $self->{my_instance}->wait_for_ssh(timeout => $timeout);
-        $self->{my_instance}->run_ssh_command(cmd => "sudo su -c sync", timeout => "0", %args);
-        # Try only extending ssh_opts
-        my $ssh_opts = $self->{my_instance}->ssh_opts . ' -o ServerAliveInterval=2';
-        $self->{my_instance}->run_ssh_command(cmd => 'sudo su -c "' . $cmd . '"',
-            timeout => "0",
-            ssh_opts => $ssh_opts,
-            %args);
-        # Send a Ctrl-C to unblock the terminal session if no prompt is seen in 30 seconds
+
+        $self->{my_instance}->run_ssh_command(cmd => "sudo su -c sync", timeout => $timeout);
+
+        # Create a local copy of ssh_opts used only for the crash command.
+        # It extends the options defined in sles4sap_publiccloud_basetest::set_cli_ssh_opts
+        # -f requests ssh to go to background just before command execution
+        # -n redirects stdin from /dev/null; it is needed for -f to work
+        my $crash_ssh_opts = $self->{my_instance}->ssh_opts . ' -fn -o ServerAliveInterval=2';
+
+        $self->{my_instance}->run_ssh_command(
+            cmd => $cmd,
+            # This timeout ensures that run_ssh_command completes in a reasonable amount of time.
+            # It is not about how long the crash is expected to take on the remote side,
+            # as the crash is configured to run in the background.
+            # So, in theory, run_ssh_command returns immediately, and 10 has nothing to do with the value of sleep 5 executed remotely.
+            # Also consider that run_ssh_command internally uses this value for two different guard mechanisms.
+            timeout => 10,
+            ssh_opts => $crash_ssh_opts);
+
+
+        # The crash trigger command:
+        # - is executed in the background
+        # - has a sleep 5 executed remotely.
+        # run_ssh_command returns immediately, before the remote system executes the crash command.
+        # So the test has to sleep now, giving the remote system time to execute the crash procedure.
+        sleep 10;
+
+        # Send a Ctrl-C to unblock the terminal session if no prompt is seen here
         type_string('', terminate_with => 'ETX') unless (wait_serial(serial_term_prompt()));
-        # It is better to wait till ssh disappear
+
+        # Wait until ssh disappears
         record_info("Wait ssh disappear start");
         my $out = $self->{my_instance}->wait_for_ssh(timeout => 60, wait_stop => 1);
         record_info("Wait ssh disappear end", "out:" . ($out // 'undefined'));
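
The last step of the hunk waits for the ssh service to stop answering. The implementation of `wait_for_ssh` is not part of this diff, so the following is only a rough illustration of a poll-until-gone loop with a timeout. The function name `wait_until_gone`, its arguments, and the marker-file probe are hypothetical stand-ins, not the library's API.

```shell
#!/bin/sh
# wait_until_gone TIMEOUT CMD...: poll CMD once per second until it fails.
# Returns 0 as soon as CMD fails, 1 if CMD still succeeds after TIMEOUT seconds.
wait_until_gone() {
    timeout=$1
    shift
    elapsed=0
    while "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Demo: a file that disappears after 2 seconds plays the role of the ssh service.
probe_file="${TMPDIR:-/tmp}/fake_service_$$"
touch "$probe_file"
( sleep 2; rm -f "$probe_file" ) &

if wait_until_gone 10 test -e "$probe_file"; then
    result=gone
else
    result=timeout
fi
echo "result=$result"
```

Against a real host, the probe would be a connection attempt to the ssh port rather than `test -e`.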