CRAYSAT-956: Replace CAPMC calls with PCS calls #154

Merged Mar 14, 2024 (2 commits)

jack-stanek-hpe
Contributor

Summary and Scope

Replace calls to the CAPMC API with equivalent calls to the PCS API.

I tried to keep the software API of PCSClient mostly compatible with CAPMCClient, though there may be differences that aren't covered in the tests.

This hasn't been tested on a real system with PCS installed yet.
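
As a rough illustration of what a CAPMC-style wrapper over PCS could look like, here is a minimal sketch. It is not the code in sat/apiclient/pcs.py: the class name `SketchPCSClient`, the method `set_power_state`, the injected `post` callable, the `transitions` path, and the operation names in `_OPERATIONS` are all assumptions made for illustration and should be checked against the PCS API specification.

```python
from typing import Any, Callable, Dict, Iterable

# Assumed mapping from a CAPMC-style (power_state, force) pair to a PCS
# operation name. These operation names are assumptions, not confirmed
# by this PR; verify them against the PCS API specification.
_OPERATIONS = {
    ("on", False): "on",
    ("on", True): "on",
    ("off", False): "soft-off",
    ("off", True): "force-off",
}


def build_transition_request(
    xnames: Iterable[str], power_state: str, force: bool = False
) -> Dict[str, Any]:
    """Build a hypothetical PCS transition payload for the given xnames."""
    try:
        operation = _OPERATIONS[(power_state, bool(force))]
    except KeyError:
        raise ValueError(f"Unsupported power state: {power_state!r}") from None
    return {
        "operation": operation,
        "location": [{"xname": xname} for xname in xnames],
    }


class SketchPCSClient:
    """Hypothetical stand-in for a PCS client with a CAPMC-like method."""

    def __init__(self, post: Callable[[str, Dict[str, Any]], Any]):
        # `post` is injected so the sketch can run without a real API gateway.
        self._post = post

    def set_power_state(
        self, xnames: Iterable[str], power_state: str, force: bool = False
    ) -> Any:
        """Request a power transition and return whatever `post` returns."""
        payload = build_transition_request(xnames, power_state, force)
        return self._post("transitions", payload)


if __name__ == "__main__":
    # Example value only: a node xname chosen for illustration.
    client = SketchPCSClient(lambda path, body: print(path, body))
    client.set_power_state(["x8000c0s0b0n0"], "off", force=True)
```

Injecting the `post` callable keeps the sketch testable without a system that has PCS installed, which matches the unit-test-only testing described below.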

Issues and Related PRs

Testing

Tested on:

  • Local development environment

Test description:

Run unit tests.

Pull Request Checklist

  • Version number(s) incremented, if applicable
  • Copyrights updated
  • License file intact
  • Target branch correct
  • CHANGELOG.md updated
  • Testing is appropriate and complete, if applicable
  • HPC Product Announcement prepared, if applicable

jack-stanek-hpe force-pushed the CRAYSAT-956-replace-capmc-with-pcs branch from 4f709d7 to 07a2bec on July 21, 2023 17:34
haasken-hpe force-pushed the CRAYSAT-956-replace-capmc-with-pcs branch 2 times, most recently from 1dcda9f to c5dca24, on August 7, 2023 22:54
haasken-hpe changed the base branch from integration to main on August 7, 2023 22:54
gitguardian bot commented Aug 7, 2023

✅ There are no secrets present in this pull request anymore.

Contributor

haasken-hpe left a comment

I'm just leaving some comments based on a code review with @shivaprasad-metimath and team that we did this morning.

sat/apiclient/pcs.py: 3 review threads (outdated, resolved)
shivaprasad-metimath force-pushed the CRAYSAT-956-replace-capmc-with-pcs branch 2 times, most recently from 1c85a47 to 54a9b48, on September 26, 2023 09:59
shivaprasad-metimath marked this pull request as ready for review on September 26, 2023 10:21
shivaprasad-metimath force-pushed the CRAYSAT-956-replace-capmc-with-pcs branch 2 times, most recently from 618f4a3 to c4e0466, on October 24, 2023 15:14
Contributor

haasken-hpe left a comment

I haven't reviewed the whole PR, but I've found some issues in the new PCSClient class that need to be addressed.

sat/apiclient/pcs.py: 4 review threads (outdated, resolved)
shivaprasad-metimath force-pushed the CRAYSAT-956-replace-capmc-with-pcs branch 6 times, most recently from b700533 to 27c39de, on February 6, 2024 12:03
Contributor

haasken-hpe left a comment

There are some functional issues that need to be fixed before you try to test this with sat swap blade.

sat/apiclient/pcs.py: 7 review threads (all resolved, 6 outdated)
shivaprasad-metimath force-pushed the CRAYSAT-956-replace-capmc-with-pcs branch 2 times, most recently from e4af5af to 6a424c5, on March 7, 2024 18:39
shivaprasad-metimath force-pushed the CRAYSAT-956-replace-capmc-with-pcs branch 2 times, most recently from 11340c3 to 9be08a5, on March 11, 2024 04:42
@shivaprasad-metimath
Contributor

Test output:
sat_swap_validation.txt

  • sat swap performed on Baldar
  • tested on xname=x8000c0s0

Observations:

While enabling the blade, a timeout was observed:
Waiting for condition "HMS Discovery Scheduled" timed out after 109 seconds
WARNING: Waiting for "cronjob hms-discovery in namespace services" timed out, recreating cronjob.
Waiting for condition "HMS Discovery Scheduled" timed out after 109 seconds
ERROR: Could not add blade: Timed out waiting for hms-discovery cron job to resume

While powering on the nodes, a timeout was observed:
INFO: Waiting for BOS session 68fb2303-1fa6-4aa6-9b06-0426140824d8 to reach target state complete. Session template: compute-23.11.0-beta.6.x86_64shs-b2.2.0-6
INFO: Waiting for BOS session 68fb2303-1fa6-4aa6-9b06-0426140824d8 to reach target state complete. Session template: compute-23.11.0-beta.6.x86_64shs-b2.2.0-6
ERROR: BOS boot timed out after 900 seconds for session template: compute-23.11.0-beta.6.x86_64shs-b2.2.0-6.
INFO: Session 68fb2303-1fa6-4aa6-9b06-0426140824d8: 0% components succeeded, 0% components failed
ERROR: Boot failed or timed out for session template: compute-23.11.0-beta.6.x86_64shs-b2.2.0-6

Contributor

haasken-hpe left a comment

One minor issue was discovered with the PCSError class. This issue has been present since the original PR was raised. Shiva and I just noticed it while doing a code review in a meeting. It should be easy to address.

sat/apiclient/pcs.py: 2 review threads (outdated, resolved)
@haasken-hpe
Contributor

Please do file a CRAYSAT Jira for us to look at this problem you noted as well:

While enabling the blade, a timeout was observed:

Waiting for condition "HMS Discovery Scheduled" timed out after 109 seconds
WARNING: Waiting for "cronjob hms-discovery in namespace services" timed out, recreating cronjob.
Waiting for condition "HMS Discovery Scheduled" timed out after 109 seconds
ERROR: Could not add blade: Timed out waiting for hms-discovery cron job to resume

And please mark it related to CRAYSAT-1623. I was expecting that Jira to have resolved this issue.

@shivaprasad-metimath
Contributor

Sure Ryan, I will have a new one created and add a reference to the old ticket.

@prasanthkurian

CRAYSAT-1818 was filed by Harold recently. Can we use that Jira instead of filing a new one?

@shivaprasad-metimath
Contributor

Let me take a look at it, Prasanth. It is the same error reported, and it seems like documentation changes were added.

shivaprasad-metimath force-pushed the CRAYSAT-956-replace-capmc-with-pcs branch 11 times, most recently from 55f5984 to 6abc097, on March 14, 2024 09:52
This commit implements a basic PCSClient class which is mostly
compatible with the CAPMCClient class. Usage of the CAPMCClient class
was replaced with the new PCSClient class.
Prereq, xname_errs removal
removing the unused func: do_nodes_power_off
and its testcases
haasken-hpe merged commit d39c336 into main on Mar 14, 2024
3 checks passed
haasken-hpe deleted the CRAYSAT-956-replace-capmc-with-pcs branch on March 14, 2024 13:55
4 participants