Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Key Vault] Create Security Domain library #37929

Draft
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

mccoyp
Copy link
Member

@mccoyp mccoyp commented Oct 16, 2024

Description

This is a new library that is mostly generated directly from TypeSpec. The goal of this library is to provide a track 2 Security Domain client for the CLI, as well as external customers. There are currently no plans for shipping this library in other languages.

Initial preview timeline: public preview release by October 22nd, 2024.

Customer library request: Azure/azure-cli#29998

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@mccoyp mccoyp added KeyVault Client This issue points to a problem in the data-plane of the library. labels Oct 16, 2024
@azure-sdk
Copy link
Collaborator

API change check

APIView has identified API level changes in this PR and created following API reviews.

azure-keyvault-securitydomain

"""

@distributed_trace
def begin_download(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm rather confused by this method - it's very unusual to have a download function that is also a poller?
I see the _initial method is returning Iterable[bytes] like I would expect from a download, but this poller seems to return a result of None.....
If this method is not actually downloading any data locally, then we should probably find a different name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type here actually seems to be a generation bug -- the poller is supposed to return a SecurityDomainObject (and the track 1 library that the CLI uses does this correctly). Azure PowerShell and the CLI even accept an output file parameter for this operation, so returning None is definitely unexpected. I'm looking into that now

Copy link
Member Author

@mccoyp mccoyp Oct 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some more testing, this is a weird method. The initial response contains the SecurityDomainObject, even when get_download_status still returns "InProgress". The actual download operation doesn't seem to be an LRO, but the operation is marked as one because HSM activation, which is started by the operation, does take some time to complete.

This may have to do with this comment in the CLI's vendored package, stating the operation was apparently changed from returning a 200 to a 202 in the 7.2 API version: https://github.com/Azure/azure-cli/blob/424e80464cdba4c241fade9d966cce61ebe4513a/src/azure-cli/azure/cli/command_modules/keyvault/vendored_sdks/azure_keyvault_t1/v7_2/key_vault_client.py#L118-L122

The REST API documentation is correct -- the download endpoint just returns a value of the security domain. To make this work with our LRO logic though, I think we'd have to make this return SecurityDomainOperationStatus, and add a value property to that model. @heaths, @annatisch, what do you think?

EDIT: The get_download_status/downloadPending endpoint doesn't return a value containing the security domain -- it just returns the status of HSM activation. This operation wants to return the initial response as the final result once the status is "Success". I guess we still need a SecurityDomainObject return type, but I need to find a way to get this to work with TypeSpec...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a custom polling method to correctly handle this return type (and the non-standard "Success" status), and plugged it into an override for the begin_download operation: 4aac19c. The client method now agrees with the service specification and past behavior, though I'll need to sync with Laurent on whether there's anything we can do to correct the return type in the generated methods. Either way, we still need the custom polling to handle the LRO flow here.

return deserialized # type: ignore

@overload
def begin_upload(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not entirely sure if upload is the best name here - again it's a bit weird to pair this with a poller, but I'm assuming the "upload" triggers some process that the poller then monitors?

Copy link
Member Author

@mccoyp mccoyp Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For operation context, this uploads a security domain to an HSM. The CLI's documentation says this "start[s] to restore the HSM" -- the same command in Azure PowerShell uses Import as the verb. I don't have the full context, but it seems like the long-running process is restoring an HSM to some state

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a good documentation page that walks through the process of recovering a lost HSM by uploading its security domain to a new HSM: https://learn.microsoft.com/azure/key-vault/managed-hsm/disaster-recovery-guide.

To upload a security domain, you need to have activated a previous HSM and downloaded its security domain. Then the transfer key is fetched from a new, inactive HSM -- which is used to encrypt the original security domain. The encrypted domain is can then be uploaded to the new HSM, activating it in the process. Testing it locally, the upload operation takes about a minute; the operation is considered successful once the HSM is active.

Based on that context, upload makes some sense as a verb. restore could possibly be an option, like the CLI documentation mentions, but restoring is already used as a verb in azure-keyvault-administration to refer to restoring the backed-up contents of an HSM. export/import or download/upload seem like the best options to me

Comment on lines 201 to 208
delay = kwargs.pop("polling_interval", self._config.polling_interval)
return super().begin_download( # type: ignore[return-value]
certificate_info_object,
polling=SecurityDomainClientDownloadPollingMethod(
lro_algorithms=[SecurityDomainClientDownloadPolling()], timeout=delay
),
**kwargs,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hard-coded polling=SecurityDomainClientDownloadPollingMethod(...) will prevent CLI from bypassing LRO with polling=False

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Client This issue points to a problem in the data-plane of the library. KeyVault
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants