Support and test "re-imageable" compute nodes via compute node metadata #518

Open
wants to merge 24 commits into
base: main


@bertiethorpe (Member) commented Jan 6, 2025

Support and test "re-imageable" compute nodes:

  • Add functionality to skeleton TF to allow defining enable_ flags in compute node metadata, as a list of roles/functionalities to enable.
  • In the stackhpc environment only, set the enable_* metadata to turn on compute, etc_hosts, nfs, basic_users and eessi. This covers all currently supported functionality except resolv_conf (not used in stackhpc) and manila (not available at either leafcloud or sms).
  • Add steps to the stackhpc workflow, after a successful reimage of the control/login nodes, to reimage the compute nodes (with the same image) via an Ansible ad-hoc command, and to check that the nodes come back successfully.
  • Document the supported metadata flags
  • Document the above reimaging workflow
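
To illustrate the first two points, here is a minimal sketch of how a list of features could map onto enable_* metadata entries. It is not the Terraform in this PR: the function name build_enable_metadata is hypothetical, and the "true" string value is an assumption (OpenStack instance metadata values are strings, but the exact value the compute_init role expects is not stated here).

```python
def build_enable_metadata(features):
    """Map feature names to enable_<name> metadata entries.

    Assumption: each flag is stored as the string "true". The value
    actually expected by the compute_init role is not stated in this PR.
    """
    return {f"enable_{name}": "true" for name in features}


# Features turned on in the stackhpc environment, per the description above.
stackhpc_features = ["compute", "etc_hosts", "nfs", "basic_users", "eessi"]
metadata = build_enable_metadata(stackhpc_features)

# Print the resulting metadata keys and values for inspection.
for key in sorted(metadata):
    print(key, "=", metadata[key])
```

Features left out of the list (resolv_conf and manila in the stackhpc case) simply get no metadata entry in this sketch.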

@bertiethorpe bertiethorpe requested a review from a team as a code owner January 6, 2025 10:08
@bertiethorpe bertiethorpe changed the title Cookiecutter TF config to allow defining enable_ flags in compute node metadata Cookiecutter TF config defines enable_ flags in compute node metadata Jan 6, 2025
@bertiethorpe bertiethorpe changed the title Cookiecutter TF config defines enable_ flags in compute node metadata Cookiecutter TF config enable flags in compute node metadata Jan 6, 2025
@bertiethorpe bertiethorpe requested a review from sjpb January 7, 2025 15:04
@sjpb sjpb changed the title Cookiecutter TF config enable flags in compute node metadata Support and test "re-imageable" compute nodes via compute node metadata Jan 8, 2025
Collaborator

@sjpb sjpb left a comment

compute_init_enable needs to be settable per compute group, so that different compute groups can use different values.
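
As a rough sketch of what per-group settings could look like (the group names "standard" and "gpu", the function name, and the "true" string value are all hypothetical, not taken from this PR):

```python
def metadata_for_groups(group_features):
    """Build enable_* metadata separately for each compute group.

    group_features maps a compute group name to the list of features
    that group should enable. The "true" value is an assumption.
    """
    return {
        group: {f"enable_{name}": "true" for name in names}
        for group, names in group_features.items()
    }


# Hypothetical groups, each with its own list of enabled features.
per_group = metadata_for_groups({
    "standard": ["compute", "etc_hosts", "nfs"],
    "gpu": ["compute", "etc_hosts"],
})

for group, flags in per_group.items():
    print(group, sorted(flags))
```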

Running git diff HEAD...main on this branch shows that the images are out of date with current main, so the branch needs to be updated from main.

.github/workflows/stackhpc.yml (outdated, resolved)
environments/.stackhpc/terraform/main.tf (outdated, resolved)
environments/.stackhpc/terraform/compute_init.auto.tfvars (outdated, resolved)
ansible/roles/compute_init/README.md (outdated, resolved)
ansible/roles/compute_init/README.md (outdated, resolved)
ansible/roles/compute_init/README.md (resolved)
.github/workflows/stackhpc.yml (outdated, resolved)
@bertiethorpe

This comment was marked as outdated.

@bertiethorpe bertiethorpe requested a review from sjpb January 10, 2025 09:22
Collaborator

@sjpb sjpb left a comment

One query

2 participants