Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CFE-4322: Changed cf-remote spawn with AWS to query for AMI from known owners #85

Merged
merged 1 commit into from
Apr 2, 2024

Conversation

craigcomstock
Copy link
Contributor

Before we hard-coded AMI and had to update them fairly often and add them when we supported a new platform.

Here we change to querying for AMI based on known trusted owner IDs.

This should make it so that new platforms are automatically available and any updates will be in-place automatically as well.

Ticket: CFE-4322
Changelog: title

@craigcomstock craigcomstock marked this pull request as draft April 1, 2024 17:28
print("Available platforms:")
for key in sorted(cloud_data.aws_platforms.keys()):
print("Available platforms (image criteria):")
for key in sorted(cloud_data.aws_image_criteria.keys()):
Copy link
Contributor Author

@craigcomstock craigcomstock Apr 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list platforms isn't quite descriptive anymore as we don't have an explicit list for AWS (I don't know anything (yet) about GCP).

new output of cf-remote spawn --list-platforms

Available platforms (image criteria):
centos
debian
debian-9
macos
rhel
suse
ubuntu
ubuntu-16
windows
windows-2008
windows-2012
windows-2016
windows-2019

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old output:

$ cf-remote spawn --list-platforms
Available platforms:
centos-5-x32
centos-5-x64
centos-6-x64
centos-7-x64
debian-10-x64
debian-11-arm64
debian-11-x64
debian-12-x64
debian-4-x32
debian-4-x64
debian-5-x32
debian-5-x64
debian-6-x32
debian-6-x64
debian-7-x32
debian-7-x64
debian-8-x64
debian-9-x64
rhel-5-x64
rhel-6-x64
rhel-7-x64
rhel-8-x64
rhel-9-x64
ubuntu-12-04-x32
ubuntu-12-04-x64
ubuntu-14-04-x32
ubuntu-14-04-x64
ubuntu-16-04-x64
ubuntu-18-04-x64
ubuntu-20-04-x64
ubuntu-22-04-arm64
ubuntu-22-04-x64
windows-2012-x64
windows-2016-x64
windows-2019-x64

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I tested several of the older platforms and pretty sure their AMIs are dead. But I will double-check those today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adjusted slightly for more information since we don't explicitly have a list anymore to describe the format of the "platform" argument.

Platform images are queried based on the platform name, version and architecture.
The form of platform specified is: <platform_name>[-<version>][-<architecture>]. e.g. debian, debian-12 or debian-12-x64
Ubuntu version can be just major (20) or major+minor (20-04)
Architecture can either be x64 or arm64

Available platforms:
centos
debian
debian-9
rhel
suse
ubuntu
ubuntu-16
windows
windows-2008
windows-2012
windows-2016
windows-2019

@craigcomstock
Copy link
Contributor Author

craigcomstock commented Apr 1, 2024

working on some nice debug logging from my development print statements:

[INFO] Set the log level to DEBUG
Spawning VMs...[DEBUG] Looking for AWS AMI for platform_name 'debian'
[DEBUG] Parsed platform_version '' from platform_name 'debian'
[DEBUG] Determined architecture to be 'x86_64'
[DEBUG] Determined image criteria: {'owner_id': '136693071363', 'name_pattern': 'debian-{version}*', 'user': 'admin', 'architecture': 'x86_64', 'version': ''}
[DEBUG] Selected image <NodeImage: id=ami-0fa1b1734dd573e80, name=debian-11-backports-amd64-20240211-1654, driver=Amazon EC2  ...>
Spawning new platform 'debian' VM in AWS (AMI: ami-0fa1b1734dd573e80, size=t2.xlarge) False
.DONE
[DEBUG] Directory already exists: '/Users/craig/.cfengine/cf-remote'
Details about the spawned VMs can be found in /Users/craig/.cfengine/cf-remote/cloud_state.json
Destroying all hosts

@craigcomstock craigcomstock marked this pull request as ready for review April 1, 2024 19:21
@craigcomstock
Copy link
Contributor Author

Need to clean up commits and add some more docs/changes notices like:

  • with this change you can configure a different region other than eu-west-1 and cf-remote will most likely "just work", only tested in us-east-2
  • some platforms, like windows, are locked down to eu-west-1 currently due to our custom images we use with SSH server installed/configured

@craigcomstock
Copy link
Contributor Author

An example of "non default region" just working with windows:

nothing: cf-remote --log-level=debug spawn --count 1 --platform windows --role client --name test
[INFO] Set the log level to DEBUG
Spawning VMs...[DEBUG] Looking for AWS AMI for platform_name 'windows'
[DEBUG] Parsed platform_version '*' from platform_name 'windows'
[DEBUG] Determined architecture to be 'x86_64'
[DEBUG] Determined image criteria: {'meta': 'Note that typically we rely on custom pre-configured windows imimages with ssh installed and pre-populated public keys so an image spawned from this criteria will not come with ssh built-in and ready to go.', 'owner_id': '801119661308', 'name_pattern': 'Windows_Server-{version}-English-Core-Base*', 'user': 'Administrator', 'architecture': 'x86_64', 'version': '*'}
[DEBUG] Selected image <NodeImage: id=ami-0a43a3af91e0c8d41, name=Windows_Server-2022-English-Core-Base-2024.03.13, driver=Amazon EC2  ...>
Spawning new platform 'windows' VM in AWS (AMI: ami-0a43a3af91e0c8d41, size=t2.micro) False
.DONE

Copy link
Member

@nickanderson nickanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking forward to it. I probably shouldn't be the one to judge the code though.

@craigcomstock
Copy link
Contributor Author

commits squashed, fully ready for review :)

@craigcomstock
Copy link
Contributor Author

Another thing this change is nice for while we still use explicit AMIs elsewhere is to find the latest AMI for a platform:

cf-remote spawn --platform ubuntu-20-04-x64 --count 1 --role client --name test
Spawning VMs...Spawning new platform 'ubuntu-20-04-x64' VM in AWS (AMI: ami-005c1d2266900dc31, size=t2.micro)
.DONE
Waiting for VMs to get IP addresses...DONE
Details about the spawned VMs can be found in /Users/craig/.cfengine/cf-remote/cloud_state.json

Copy link
Contributor

@larsewi larsewi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be a nice improvement 🚀

cf_remote/spawn.py Outdated Show resolved Hide resolved
cf_remote/spawn.py Outdated Show resolved Hide resolved
cf_remote/spawn.py Outdated Show resolved Hide resolved
cf_remote/spawn.py Outdated Show resolved Hide resolved
cf_remote/spawn.py Outdated Show resolved Hide resolved
cf_remote/spawn.py Outdated Show resolved Hide resolved
tests/aws-spawn-test.sh Show resolved Hide resolved
@nickanderson
Copy link
Member

@craigcomstock probably demo.sh in core should be updated with the release of this change: https://github.com/cfengine/core/blob/f47517a53ec5e84bfabb28761233453798b93b1b/docs/cf_remote_demo.sh

tests/test_spawn.py Outdated Show resolved Hide resolved
Before we hard-coded AMI and had to update them fairly often and add them when we supported a new platform.

Here we change to querying for AMI based on known trusted owner IDs.

This should make it so that new platforms are automatically available and any updates will be in-place automatically as well.

This change should also enable users to specify other regions and have most tplatforms "just work".

One note: specific Windows AMI are only available in eu-west-1

Ticket: CFE-4322
Changelog: title

Co-authored-by: Lars Erik Wik <[email protected]>
@craigcomstock
Copy link
Contributor Author

@craigcomstock probably demo.sh in core should be updated with the release of this change: https://github.com/cfengine/core/blob/f47517a53ec5e84bfabb28761233453798b93b1b/docs/cf_remote_demo.sh

spawning worked fine for me but... centos-7 ssh didn't work so will check that out separately, shouldn't be an issue with cf-remote I don't think.

@craigcomstock
Copy link
Contributor Author

@craigcomstock probably demo.sh in core should be updated with the release of this change: https://github.com/cfengine/core/blob/f47517a53ec5e84bfabb28761233453798b93b1b/docs/cf_remote_demo.sh

spawning worked fine for me but... centos-7 ssh didn't work so will check that out separately, shouldn't be an issue with cf-remote I don't think.

Yep, a problem on my end, probably keys. I rotated my keys in AWS and all was fine. centos-7 did take some TIME to be ready. 🤔

@craigcomstock craigcomstock merged commit 734973d into cfengine:master Apr 2, 2024
5 checks passed
@craigcomstock craigcomstock deleted the CFE-4322 branch April 2, 2024 16:27
Copy link
Contributor

@vpodzime vpodzime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Late to the party, but still a couple code-quality comments.

platform_version = ""
else:
platform_version = (
platform_name.count("-") > 0 and platform_name.split("-")[1] or "*"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer using value1 if condition else value2 here instead of relying on logical operators.

# image for that platform and version.
def _get_image_criteria(platform_name):
log.debug("Looking for AWS AMI for platform_name '%s'" % (platform_name))
platform = platform_name.split("-")[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an example platform_name string here. Otherwise the code below is quite confusing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, looks like platform_name should be .split("-") and then the individual components should be "named" (assigned to properly named variables).

platform_with_major_version = "-".join(platform_name.split("-")[0:2])
architecture = platform_name.split("-")[-1]
# architecture should be either x64 or arm64
if not (architecture == "x64" or architecture == "arm64"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

architecture not in ("x64", "arm64")

ex_filters={
"name": criteria["name_pattern"].format(version=criteria["version"]),
"architecture": criteria["architecture"],
"virtualization-type": "hvm",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the virtualization-type really needed? Why should we care?

raise ValueError("No images found for criteria: %s" % (criteria))
selected = sorted(candidates, key=lambda x: x.extra["creation_date"], reverse=True)[
0
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WTH is this formatting about?

},
criteria = _get_image_criteria(platform)
architecture = criteria["architecture"] or aws_defaults["architecture"]
sizes = criteria.get("sizes") or aws_defaults["sizes"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.get("sizes", aws_defaults["sizes"]). Or could the former be and empty string/list/...?

sizes = criteria.get("sizes") or aws_defaults["sizes"]
small = sizes[architecture]["size"]
large = sizes[architecture]["xlsize"]
if size == None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size is None

criteria = _get_image_criteria(platform)
architecture = criteria["architecture"] or aws_defaults["architecture"]
sizes = criteria.get("sizes") or aws_defaults["sizes"]
small = sizes[architecture]["size"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.get("size")?

small = sizes[architecture]["size"]
large = sizes[architecture]["xlsize"]
if size == None:
size = (large or small) if (role == "hub") else (small or large)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! 😁 👍

trap cleanup EXIT

# this is a fairly exhaustive test and will take some time
# spawn all reasonable "platform" specifications
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it's also not very cheap. 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants