PC: Fixup ssh host key validation #20274

Merged (1 commit) on Oct 9, 2024

@pdostal pdostal commented Sep 25, 2024

This continues the SSH host key validation journey:

I propose a PUBLIC_CLOUD_SSH_CONFIG variable so that different test suites can use different SSH configs. For QE-C we always validate the SSH host key; QE-SAP does not do so yet. The SSH host key needs to be fetched in these situations:

  • When the instance is created (obviously)
  • When the instance reboots and its public IP address changes
  • When cloud-init clean is performed

  • Related ticket: poo#166799
  • Verification runs: In the discussion
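
The three situations listed above can be sketched as a small decision helper. This is only an illustration in Python, not the Perl code changed by this PR; the event names and the function name are hypothetical, and the cloud-init branch assumes that the host keys are regenerated on the next cloud-init run.

```python
from enum import Enum, auto
from typing import Optional


class InstanceEvent(Enum):
    """Hypothetical event names, used only for this illustration."""
    CREATED = auto()
    REBOOTED = auto()
    CLOUD_INIT_CLEAN = auto()
    OTHER = auto()


def needs_host_key_scan(event: InstanceEvent,
                        old_ip: Optional[str],
                        new_ip: str) -> bool:
    """Return True when the SSH host key for new_ip should be fetched again."""
    if event is InstanceEvent.CREATED:
        # A fresh instance has no recorded host key yet.
        return True
    if event is InstanceEvent.CLOUD_INIT_CLEAN:
        # Assumption: the next cloud-init run regenerates the host keys.
        return True
    if event is InstanceEvent.REBOOTED:
        # The recorded entry is tied to the old address, so only an
        # address change requires a new scan.
        return old_ip != new_ip
    return False
```

A reboot that keeps the same public IP address does not trigger a scan in this sketch, which matches the second bullet: only the address change makes the recorded entry unusable.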

github-actions bot commented Sep 25, 2024

Great PR! Please pay attention to the following items before merging:

Files matching lib/**.pm:

  • Consider adding or extending unit tests in t/

Files matching lib/publiccloud/**.pm:

  • Provide VRs for both QE-C as well as QE-SAP (check Confluence for more info)

This is an automatically generated QA checklist based on modified files.

@pdostal pdostal added the WIP Work in progress label Sep 25, 2024
Contributor

@alvarocarvajald alvarocarvajald left a comment

These changes could impact the HanaSR tests, which add -F none to avoid using .ssh/config. Overriding ssh_opts inside the function could break those tests.
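
One way to avoid the problem described in this comment is to add a default config file only when the caller has not already chosen one. The following is a minimal Python sketch of that idea, not the Perl implementation in lib/publiccloud/instance.pm; the function name and the default path are hypothetical.

```python
from typing import List


def merge_ssh_opts(caller_opts: List[str], default_config: str) -> List[str]:
    """Return ssh options, adding '-F <default_config>' only if the caller
    did not already pick a config file (for example with '-F none')."""
    opts = list(caller_opts)
    for opt in opts:
        # ssh accepts both '-F none' and the attached form '-Fnone'.
        if opt == "-F" or (opt.startswith("-F") and len(opt) > 2):
            return opts
    return ["-F", default_config] + opts


# A HanaSR-style caller keeps its '-F none' untouched.
print(merge_ssh_opts(["-F", "none", "-o", "LogLevel=ERROR"], "/tmp/ssh_config"))
# A caller without '-F' gets the default config prepended.
print(merge_ssh_opts(["-o", "LogLevel=ERROR"], "/tmp/ssh_config"))
```

The sketch only inspects option lists, not a full command line, so it assumes the remote command is passed separately.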

@alvarocarvajald
Contributor

@lilyeyes
Contributor

Please help with a VR for mr_test, such as https://openqa.suse.de/tests/15506454 (it invokes wait_for_ssh() and softreboot()).

@alvarocarvajald
Contributor

Added some HanaSR VRs to confirm:

https://openqaworker15.qa.suse.cz/tests/overview?build=VR4PR20274&distri=sle&version=15-SP5&groupid=41

HanaSR VRs passed.

Cloned the saptune test (the one mentioned by @lilyeyes) in https://openqaworker15.qa.suse.cz/tests/298412#live

@pdostal pdostal force-pushed the ssh_key_checking branch 2 times, most recently from 78732db to 5854994 Compare September 27, 2024 08:36
@pdostal
Member Author

pdostal commented Sep 27, 2024

@alvarocarvajald @lilyeyes I'm missing these two variables:

#       _SECRET_AZURE_SPN_APPLICATION_ID - application ID for fencing agent
#       _SECRET_AZURE_SPN_APP_PASSWORD - application password used by fencing agent

@alvarocarvajald
Contributor

> @alvarocarvajald @lilyeyes I'm missing these two variables:
>
> #       _SECRET_AZURE_SPN_APPLICATION_ID - application ID for fencing agent
> #       _SECRET_AZURE_SPN_APP_PASSWORD - application password used by fencing agent

They are in openqa/workerconf.sls in salt-pillars-openqa internal repository.

Contributor

@alvarocarvajald alvarocarvajald left a comment

LGTM. Please give us time to add the new setting (PUBLIC_CLOUD_SSH_CONFIG) into the jobf before merging.

lib/publiccloud/provider.pm (outdated)
@@ -115,7 +115,7 @@ sub run_cmd {
     delete($args{timeout});
     delete($args{runas});

-    $self->{my_instance}->wait_for_ssh(timeout => $timeout);
+    $self->{my_instance}->wait_for_ssh(timeout => $timeout, scan_ssh_host_key => 1);
Contributor

Is the idea of adding scan_ssh_host_key => 1 here to avoid disturbing the SAP tests, so that they keep working exactly as before?

Could you explain what we should check to decide whether we can omit scan_ssh_host_key here too?

 '-o', 'LogLevel=DEBUG3',
-'-o', 'PasswordAuthentication=no',
+#'-o', 'PasswordAuthentication=no',
Contributor

I don't understand why only these two are commented out. Why not all of them, or none? Is this PR specifically about PasswordAuthentication? The ssh man page describes -F as:

       -F configfile
               Specifies an alternative per-user configuration file. If a configuration file is given on the command line, the system-wide configuration file (/etc/ssh/ssh_config) will be ignored. The default for the per-user configuration file is ~/.ssh/config. If set to “none”, no configuration files will be read.

Is the idea here to force the SAP tests to use an SSH config file? Why is that? What do we lose by not doing that?

Member Author

You use Ansible, and Ansible honors the SSH config. Without an SSH config file it is hard to make Ansible, for example, skip SSH host key validation.

Member Author

I got rid of this as it's now in the config file.
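
To illustrate why moving options from the command line into a config file helps here, the following Python sketch renders a small ssh_config fragment. The actual content of publiccloud/ssh_config_sap is not shown in this thread, so the option values below are illustrative assumptions only; the keywords themselves (StrictHostKeyChecking, UserKnownHostsFile, PasswordAuthentication) are standard ssh_config keywords, and render_ssh_config is a hypothetical helper.

```python
from typing import Dict


def render_ssh_config(options: Dict[str, str], host_pattern: str = "*") -> str:
    """Render a single 'Host' block in ssh_config syntax."""
    lines = [f"Host {host_pattern}"]
    for keyword, value in options.items():
        lines.append(f"    {keyword} {value}")
    return "\n".join(lines) + "\n"


# Illustrative values only: a config that does not validate host keys.
relaxed_config = render_ssh_config({
    "StrictHostKeyChecking": "no",
    "UserKnownHostsFile": "/dev/null",
    "PasswordAuthentication": "no",
})
print(relaxed_config)
```

Any tool that reads the SSH config file picks up these settings without each caller having to repeat the -o options.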

@pdostal pdostal force-pushed the ssh_key_checking branch 6 times, most recently from 6b88be9 to 8addd30 Compare October 4, 2024 09:14
@alvarocarvajald
Contributor

> LGTM. Please give us time to add the new setting (PUBLIC_CLOUD_SSH_CONFIG) into the jobf before merging.

@pdostal we were discussing in QE-SAP where to add the extra settings (PUBLIC_CLOUD_SSH_CONFIG: 'publiccloud/ssh_config_sap'), and we think the best place would be in schedule/sles4sap/publiccloud_hanasr.yaml here in os-autoinst-distri-opensuse.

Do you think you can add it as part of this PR? And schedule a VR without setting PUBLIC_CLOUD_SSH_CONFIG to confirm it works? With that, I think we can merge this from QE-SAP's side.
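
The request to run a VR without PUBLIC_CLOUD_SSH_CONFIG implies that the setting needs a fallback. A minimal Python sketch of such a lookup follows; it is not the code from this PR, and the default file name and the data directory are hypothetical placeholders.

```python
from pathlib import PurePosixPath
from typing import Mapping

# Hypothetical default; the real default file name is not stated in this thread.
DEFAULT_SSH_CONFIG = "publiccloud/ssh_config"


def resolve_ssh_config(settings: Mapping[str, str], data_dir: str = "data") -> str:
    """Return the SSH config path for a job, falling back to the default
    when PUBLIC_CLOUD_SSH_CONFIG is unset or empty."""
    relative = settings.get("PUBLIC_CLOUD_SSH_CONFIG") or DEFAULT_SSH_CONFIG
    return str(PurePosixPath(data_dir) / relative)


# A HanaSR schedule that sets the variable gets the SAP-specific file.
print(resolve_ssh_config({"PUBLIC_CLOUD_SSH_CONFIG": "publiccloud/ssh_config_sap"}))
# A job without the variable falls back to the default.
print(resolve_ssh_config({}))
```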

@pdostal
Member Author

pdostal commented Oct 7, 2024

@alvarocarvajald I've modified schedule/sles4sap/publiccloud_hanasr.yml but the sles4sap_gnome_saptune_notes is failing.

@pdostal
Member Author

pdostal commented Oct 7, 2024

I'm running into this issue: https://pdostal-server.suse.cz/tests/7680#step/Verify_azure_fence_agent_MSI/91
Not sure if it is my fault, or the test is unstable or what is going on.

@alvarocarvajald
Contributor

alvarocarvajald commented Oct 7, 2024

> @alvarocarvajald I've modified schedule/sles4sap/publiccloud_hanasr.yml but the sles4sap_gnome_saptune_notes is failing.

saptune jobs should not use schedule/sles4sap/publiccloud_hanasr.yml but different schedules (for example schedule/sles4sap/sles4sap_gnome_saptune.yaml or schedule/sles4sap/sles4sap_gnome_saptune_maintenance.yaml).

> I'm running into this issue: https://pdostal-server.suse.cz/tests/7680#step/Verify_azure_fence_agent_MSI/91
> Not sure if it is my fault, or the test is unstable or what is going on.

Hmm. This is a HanaSR test, so schedule is OK. Failure seems to come from this output: https://pdostal-server.suse.cz/tests/7680#step/Verify_azure_fence_agent_MSI/89

It's actually expecting only the list of nodes, as in: https://pdostal-server.suse.cz/tests/7616#step/Verify_azure_fence_agent_MSI/89

So something changed in between these 2 runs and now the ssh command must be missing something like -E /var/tmp/ssh_sut.log.
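
The ssh option -E log_file appends debug logs to the given file instead of standard error, which keeps DEBUG3 output out of the captured command output. A minimal Python sketch of guarding against a missing -E follows; the helper name is hypothetical, and the log path is the one mentioned in the comment above.

```python
from typing import List


def add_debug_log(ssh_opts: List[str],
                  log_file: str = "/var/tmp/ssh_sut.log") -> List[str]:
    """Return ssh options with '-E <log_file>' prepended unless the caller
    already set '-E' (either as '-E file' or in the attached form '-Efile')."""
    for opt in ssh_opts:
        if opt == "-E" or (opt.startswith("-E") and len(opt) > 2):
            return list(ssh_opts)
    return ["-E", log_file] + list(ssh_opts)


# Debug logging stays enabled, but its output is redirected to the log file.
print(add_debug_log(["-o", "LogLevel=DEBUG3"]))
```

The function expects only the ssh options, not the remote command, so an argument such as grep -E in the remote command is not mistaken for the ssh flag.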

@pdostal pdostal force-pushed the ssh_key_checking branch 3 times, most recently from 56d1dde to 5330ce3 Compare October 8, 2024 13:32
Member

@asmorodskyi asmorodskyi left a comment

LGTM to latest version

@asmorodskyi
Member

But I think we also need LGTMs from Michele and/or Alvaro.

@pdostal
Member Author

pdostal commented Oct 8, 2024

Thank you Alvaro. Now all verification runs on pdostal-server.suse.cz are green.
You may merge it or wait for me. See you in two weeks after my vacation 🌎

Contributor

@alvarocarvajald alvarocarvajald left a comment

LGTM

@@ -8,6 +8,7 @@ vars:
   BOOT_HDD_IMAGE: '1'
   NODE_COUNT: '1'
   TEST_CONTEXT: 'OpenQA::Test::RunArgs'
+  PUBLIC_CLOUD_SSH_CONFIG: 'publiccloud/ssh_config_sap'
Contributor

Heh. I thought since saptune tests are using the SSH terminal connection/tunnel thingy, that this was not necessary here, but if it works, awesome!

@asmorodskyi asmorodskyi merged commit b141881 into os-autoinst:master Oct 9, 2024
11 checks passed
6 participants