-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
api: allow use of UUIDs as component IDs in instance specs #816
Conversation
@leftwo You may want to take a look at the Crucible bits of this one (in server.rs, initializer.rs, and state_driver.rs). Instead of calling
I reckon Omicron will just use the disk ID as the Crucible backend ID to make this as smooth as possible in Nexus and sled agent. As far as I can tell, the disk ID is the ID that we were getting from LMK whether this seems like a reasonable approach. |
This seems reasonable to me overall, but I'll defer to the Crucible folks |
// If metrics are enabled and this Crucible backend was | ||
// identified with a UUID-format spec key, register this disk | ||
// for metrics collection, using the key as the disk ID. | ||
if let (Some(registry), SpecKey::Uuid(disk_id)) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nexus will never tell propolis-server
about a Crucible disk with non-UUID ID, right? if so, the SpecKey::Name
path here, where there just won't be metrics collected for the disk, shouldn't be reachable except if we've crafted a weird instance request by hand? if this shouldn't ever be reachable, it seems worth rejecting the spec /more precisely typing IDs we know need to be UUID.
i do remember disk names ending up in here somewhere, rather than IDs.. gonna go reread how that plumbing works unless you remember more immediately.
my real concern here is that this is where Nexus had been giving what are now SpecKey::Name
, and that we'll lose Crucible metrics here. do you know if we have a test that exercises that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nexus will never tell propolis-server about a Crucible disk with non-UUID ID, right? if so, the SpecKey::Name path here, where there just won't be metrics collected for the disk, shouldn't be reachable except if we've crafted a weird instance request by hand? if this shouldn't ever be reachable, it seems worth rejecting the spec /more precisely typing IDs we know need to be UUID.
This makes sense. I'm a little leery of it because I think it makes the API a little weird: the ID for a given component can be either a UUID a Name, unless it's this one specific kind of component, in which case it has to be a UUID. This isn't horribly broken or anything, it's just a slightly sharp edge (and we'd also have to account for it in the non-Omicron tools/processes that let users attach ad-hoc Crucible volumes to Propolis servers).
We could address this in a few other ways:
warn!
if the spec key isn't a UUID but there's a producer registry available- put a
#[cfg(feature = "omicron-build")]
guard in that enforces the "Crucible keys must be UUIDs" requirement - go back to calling
Volume::get_uuid
in this path to get the UUID to use for stats reporting, andwarn!
if it differs from the spec key
I kind of like option 3 now that I've written it up: it limits our regression risk (at least where metrics are concerned) but keeps the API flexible. WDYT?
i do remember disk names ending up in here somewhere, rather than IDs.. gonna go reread how that plumbing works unless you remember more immediately.
IIRC the disk name wasn't used in this specific path (EDIT: this is incorrect, see below; the disk name was used as a spec key previously, which then used it to construct serial numbers for NVMe devices), but it was used in VCR replacement, which worked like this:
- Take the VM objects lock
- Look up the
CrucibleStorageBackend
component in the instance spec- the key for this was derived from the disk name
- Look up the Crucible backend object in the VM's
crucible_backends
map- the key for this turns out to be the disk ID; the initialization code path used to read it from the Crucible backend's
get_uuid
function, which would read it out of the disk's volume construction request; Omicron guaranteed that this ID matched the disk ID (by creating the VCR that way)
- the key for this turns out to be the disk ID; the initialization code path used to read it from the Crucible backend's
- send the replace operation to Crucible
- on success, write the updated VCR back to the instance spec
my real concern here is that this is where Nexus had been giving what are now SpecKey::Name, and that we'll lose Crucible metrics here. do you know if we have a test that exercises that?
We don't have an end-to-end test for disk metrics, at least AFAIK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, after re-reading a bit and taking a second look, i'm also a fan of Volume::get_uuid
here. i generally like the change to make crucible
just Option<Arc<...>>
though, so to double-check: you're thinking literally calling crucible.get_uuid()
somewhere around this if
, yeah? i think that should be pretty clean.
FWIW the get_uuid()
call before was context'd and ?
, so it would bubble an error up rather than warn. that still seems like an appropriate error behavior here too.
more UUID tracing i haven't done: could backend_id
be different from crucible.get_uuid()
? i assume this is "technically possible but not really desired"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fixed in d6602c1.
more UUID tracing i haven't done: could backend_id be different from crucible.get_uuid()? i assume this is "technically possible but not really desired"
That's correct, at least as I understand it:
- the serialized
VolumeConstructionRequest
that gets passed to the API is expected to be aVolumeConstructionRequest::Volume
(which will have subvolumes of typeRegion
) Volume::get_uuid
returns the ID field in that top-levelVolume
- when Nexus constructs the VCR for a new disk, it sets that ID to the disk ID
When I update Omicron to pick up this change, I expect that Nexus/sled-agent will use disk IDs as the spec keys for any CrucibleStorageBackend
s they specify. That will preserve this relationship so long as Nexus keeps inserting the disk ID into its VolumeConstructionRequest::Volume
s the way it does today.
91023e4
to
d8967c7
Compare
de183a9
to
bafad9f
Compare
// TODO(#790): Propolis's NVMe controller used to pad | ||
// serial numbers with 0 bytes by default. This causes | ||
// disk serial numbers to appear in the guest to be | ||
// null-terminated strings. This, in turn, affects (or | ||
// at least may affect) how guest firmware images | ||
// identify disks when determining a VM's boot order: a | ||
// disk whose serial number is padded to 20 bytes with | ||
// spaces may not match a disk whose serial number was | ||
// padded with \0 bytes. | ||
// | ||
// Unfortunately for us, guest firmware may persist disk | ||
// identifiers to a VM's nonvolatile EFI variable store. | ||
// This means that changing from zero-padded to | ||
// space-padded serial numbers may break an existing | ||
// VM's saved boot order. | ||
// | ||
// Until we decide whether and how to preserve | ||
// compatibility for existing VMs that may have | ||
// preserved a zero-padded disk ID, continue to zero-pad | ||
// disk IDs in PHD to match previous behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i initially wondered if you expected to copy this comment to Omicron later, but the corresponding Omicron change will probably be a migration to persist the null-padded serials and switch to space-padded for new disks huh? then at least any old wrong data will be made static, rather than "fixed" without us expecting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleaned up in d6dd7c7. The main thing I wanted to emphasize in this comment is that PHD behaves this way because Omicron will when I create the initial changes to sled-agent to get it to use the spec-based ensure API), and I don't want their behaviors to drift too far apart.
the corresponding Omicron change will probably be a migration to persist the null-padded serials and switch to space-padded for new disks huh?
This is definitely the most straightforward approach, but I don't really like it, because it entails having Nexus have to keep track of the (hopefully small-and-getting-smaller) set of existing disks that need the special "pad with zeroes not spaces" compat quirk. I would be willing to consider some possibly-outlandish options (e.g. trying to do something clever in the boot order management logic in our EDK2 fork) to avoid adding a quirk column/table just for this.
It probably goes without saying, but I intend not to try to solve that problem in the Omicron change that will ingest these Propolis changes--the first order of business is to convert to the new API without actually changing or breaking anything :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for working through this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any obvious problems with crucible and this change. It would be good to take it for a test drive and replace a downstairs just to be sure we connected the dots correctly.
Thanks @leftwo for taking a look! Before I merge I'll make sure to spin up a Propolis with this change and recheck that I can properly dispatch operations to the snapshot and VCR replacement APIs. (We'll want to check that that works on the Omicron side as well, but I'll ask about how to do that when the Omicron changes are ready.) |
a92ffd7
to
7a9bbc0
Compare
Instead of using Strings as keys in instance specs, define a SpecKey type that is a union of a UUID and a String. This type serializes as a string, and its `FromStr` and `From<String>` impls will try to parse strings as UUIDs before storing keys in string format. This allows the control plane to use UUIDs wherever it makes sense while allowing it (and other users) to continue to use strings where a UUID component ID is unavailable or is being used by some other spec element (as might be the case for a disk device and disk backend). Index most objects in propolis-server using SpecKeys instead of Strings. This is mostly just a rote change-the-types, make-it-compile exercise, but it extends to Crucible backends, which deserve some additional commentary. In the old ensure API, sending a `DiskRequest` would register disk components with several different identifiers: - The disk's name, given by the `DiskRequest`'s `name` field, is used - as a key in the VM's `DeviceMap`, - as the disk device component's key in the VM's instance spec, and - to generate the disk backend component's key in the VM's instance spec. - The disk's ID, given as the `id` in the `Volume`-variant `VolumeConstructionRequest` in the `DiskRequest`, is reported by the Crucible backend's `get_uuid` function and used - to register a metrics producer for each disk, and - as the key in the VM's `CrucibleBackendMap`. Now that all new VMs are started using instance specs, use component keys for everything: - Entities in the `DeviceMap` are identified by their corresponding components' `SpecKey`s. - Crucible backends in the `CrucibleBackendMap` are also identified by their `SpecKey`s. - APIs that take a Crucible backend ID on their path now take a String and not a UUID. The server converts these to a SpecKey before looking up backends in the map. - When a Crucible backend is created, the machine initializer checks to see if its `SpecKey` was a `Uuid`. If so, it will register the backend to produce metrics. The intention is that Omicron will use disk IDs as the keys for its `CrucibleStorageBackend` components in the specs it generates. It can then also use these IDs as parameters when requesting snapshots or VCR replacements. The friendly names users give their Omicron disks no longer appear in the Propolis API at all.
d6dd7c7
to
9a291d3
Compare
I've checked this for both of these endpoints: the snapshot and replace operations indeed get directed to the relevant Crucible downstairs processes (which then summarily reject them because I didn't build with the right features, didn't specify a VCR in the correct format, etc., but at least they're making it to the right places!). |
Update Omicron to use the new Propolis VM creation API defined in oxidecomputer/propolis#813 and oxidecomputer/propolis#816. This API requires clients to pass instance specs to create new VMs and component replacement lists to migrate existing VMs. Construct these in sled agent for now; in the future this logic can move to Nexus and become part of a robust virtual platform abstraction. For now the goal is just to keep everything working for existing VMs while adapting to the new Propolis API. Slightly adjust the sled agent instance APIs so that Nexus specifies disks and boot orderings using sled-agent-defined types and not re-exported Propolis types. Finally, adapt Nexus to the fact that Crucible's `VolumeConstructionRequest` and `CrucibleOpts` types no longer appear in Propolis's generated client (and so don't appear in sled agent's client either). Instead, the `propolis-client` crate re-exports these types directly from its `crucible-client-types` dependency. For the most part, this involves updating `use` directives and storing `SocketAddr`s in their natively typed form instead of converting them to and from strings. Tests: cargo nextest, plus ad hoc testing in a dev cluster as described in the PR comments.
Stacked on #813. Related to #804.
Reviewer note: Most of this change is mechanical conversion of types, but there are some slightly more substantial functional changes to Crucible backend management in server.rs, initializer.rs, and state_driver.rs.
Today instance specs use
String
s to identify VM components. In many cases, components are associated with control plane objects (like Disk and NetworkInterface records) that have UUIDs. Allow components to be identified by UUID by defining aSpecKey
type that deserializes as aUuid
when possible and as aString
when not, then plumb it throughout Propolis. See the comments on theSpecKey
type for more details, including notes on why the type is a UUID/String union and not just a UUID.The main functional change in this PR is that propolis-server's component maps now use
SpecKey
s to identify components. The idea is to say that if a component has key "Foo" in a VM's spec, then subsequent API calls that want to act on that component should pass "Foo" as the component ID. This (hopefully) simplifies how Crucible-related APIs should designate the disk they're targeting: Omicron should pick an ID for each Crucible backend it requests (which ID can just be the control plane disk ID), then pass the same ID to subsequent snapshot and VCR replacement operations. Control plane disk names no longer appear in the Propolis API at all.Tests: cargo test, PHD, manual creation/migration of VMs with propolis-cli, manually hit the snapshot and VCR replacement endpoints and verified these dispatched their operations to the correct Crucible backend.
Fixes #776. Fixes #772.