Add standard for volume backup functionality #567

Merged
merged 21 commits into from
Sep 17, 2024

Conversation

markus-hentsch
Contributor

No description provided.

@markus-hentsch markus-hentsch linked an issue Apr 16, 2024 that may be closed by this pull request
5 tasks
@markus-hentsch
Contributor Author

Proposal written. Test script implemented.

Cleanup function of the test script needs some polish because of some erroneous behavior I observed:

  • wait for resources to be in available state before attempting deletion
  • wait for backups to finish being cleaned up before starting volume deletion (dependency)
  • make errors in cleanup fatal for test execution (reasoning: deletion should work just as well as creation when the API works correctly, and we need a clean base to test on)
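
The three cleanup points above could be sketched roughly as follows. This is not the test script from this PR, only a minimal illustration: `client` is a hypothetical adapter (with `list_backups()`, `list_volumes()`, `delete_backup(id)` and `delete_volume(id)`), the `prefix` naming convention is assumed, and treating `error` as a deletable state alongside `available` is also an assumption.

```python
import time


class CleanupError(RuntimeError):
    """Raised when cleanup cannot finish; callers should treat it as fatal."""


def wait_until(predicate, what, timeout=300.0, interval=2.0, sleep=time.sleep):
    """Poll predicate() until it returns True, or raise CleanupError on timeout."""
    waited = 0.0
    while not predicate():
        if waited >= timeout:
            raise CleanupError(f"timed out waiting for {what}")
        sleep(interval)
        waited += interval


def cleanup(client, prefix, **wait_opts):
    """Delete test backups first, then test volumes; any failure propagates.

    `client` is a hypothetical adapter; its list methods return dicts with
    "id", "name" and "status" keys.
    """
    def mine(items):
        # Only touch resources created by the test run (assumed name prefix).
        return [i for i in items if i["name"].startswith(prefix)]

    def settled(lister):
        # Assumption: "available" and "error" are the states that allow deletion.
        return all(i["status"] in ("available", "error") for i in mine(lister()))

    # 1. Wait for backups to leave transitional states, then delete them.
    wait_until(lambda: settled(client.list_backups), "backups to settle", **wait_opts)
    for backup in mine(client.list_backups()):
        client.delete_backup(backup["id"])

    # 2. Wait for backup cleanup to finish before starting volume deletion.
    wait_until(lambda: not mine(client.list_backups()), "backups to be removed", **wait_opts)

    # 3. Same pattern for the volumes themselves.
    wait_until(lambda: settled(client.list_volumes), "volumes to settle", **wait_opts)
    for volume in mine(client.list_volumes()):
        client.delete_volume(volume["id"])
    wait_until(lambda: not mine(client.list_volumes()), "volumes to be removed", **wait_opts)
```

Nothing is caught inside `cleanup`, so a failed delete call or a timeout aborts the test run instead of leaving leftovers for the next one.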

Contributor

@josephineSei josephineSei left a comment

We do have the problem that we should keep both in mind: large deployments and edge cloud scenarios with only one rack somewhere. I am a bit undecided on how to deal with a possible second storage backend in the latter case.

Standards/scs-XXXX-v1-volume-backup-service.md (outdated, resolved)
Standards/scs-XXXX-v1-volume-backup-service.md (outdated, resolved)
@markus-hentsch
Contributor Author

> Proposal written. Test script implemented.
>
> Cleanup function of the test script needs some polish because of some erroneous behavior I observed:
>
> * wait for resources to be in available state before attempting deletion
> * wait for backups to finish being cleaned up before starting volume deletion (dependency)
> * make errors in cleanup fatal for test execution (reasoning: deletion should work just as well as creation when the API works correctly, and we need a clean base to test on)

Done and tested.

@markus-hentsch markus-hentsch marked this pull request as ready for review April 23, 2024 14:02
Contributor

@anjastrunk anjastrunk left a comment

LGTM in general. Just one addition requested in section "Conformance Tests".

Contributor

@josephineSei josephineSei left a comment

I think only this one question is left for me.

Standards/scs-XXXX-v1-volume-backup-service.md (outdated, resolved)
Co-authored-by: anjastrunk <[email protected]>
Signed-off-by: Markus Hentsch <[email protected]>
@berendt
Member

berendt commented Jun 27, 2024

> I think only this one question is left for me.

Wouldn't it even be a good idea to require backups to be stored on a different storage backend? Then there would always be a "Medienbruch" (a break between storage media).

So I would prefer that the backup function itself does not necessarily have to be available, but if it is available it always has a dedicated storage backend.

Then you can assume that if the function is available, it also enables real backups.

@neuroserve

Just some quick feedback we got from a customer accustomed to VMware: the backup functions (provided by Veeam) are integrated into the web frontend (vCloud Director) there. That's why they were looking for similar functions in Horizon, too.

@markus-hentsch
Contributor Author

> I think only this one question is left for me.
>
> Wouldn't it even be a good idea to require backups to be stored on a different storage backend? Then there would always be a "Medienbruch".
>
> So I would prefer that the backup function itself does not necessarily have to be available, but if it is available it always has a dedicated storage backend.
>
> Then you can assume that if the function is available, it also enables real backups.

Very good point. I added a new choice to the "Options considered" section. I think it boils down to whether we want to favor feature availability vs. backup reliability.

Personally, I'd actually favor reliability now that I think about it due to the fact the term "backup" does imply some kind of availability guarantees.

@markus-hentsch
Contributor Author

> Just a quick feedback we got from a customer accustomed to VMWare: The backup functions (provided by Veeam) are integrated in the webfrontend (VCloud Director) there. And that's why they were looking for similar functions in Horizon, too.

Thanks for sharing this. @neuroserve do you happen to know whether VMware/Veeam makes any assumptions or decisions regarding the location of backups in relation to the main storage, i.e., are backups always located in a separate storage backend?

@neuroserve

AFAIK the Veeam solution is quite flexible and can use "any" S3 backend (for example). The backup destination is not tied to the infrastructure where the vCloud Director is located.

@markus-hentsch
Contributor Author

The discussion in the IaaS call on 2024-07-03 did not produce a clear winner among the options (feature availability vs. backup reliability).
However, a third option was proposed:

  • option c)
    • mandate backup feature and API availability AND make it transparent to users ("discoverability") how much additional reliability is provided
      • place of discoverability is not obvious (no place for metadata in the cinder backup API)
        • api extensions?
        • Gaia-X self-descriptions/VCs
        • new discoverability service that we are working on introducing for flavors (-> public cloud SIG draft spec)

... which also favors feature availability while trying to make the potential shortcomings in backup reliability transparent for the user in case a separate storage backend is not configured for backups by the CSP.
This however requires some kind of self-description functionality as stated, which we do not have yet.

There has been no further feedback since then. The current iteration of the standard draft implements the feature availability option, which also provides the basis for the newly suggested option described above. For now, I will keep it that way.

Contributor

@josephineSei josephineSei left a comment

This is looking good. Most comments concern references to OpenStack-specific terms, where we might be broader in order to also cover other implementations of the APIs.

Standards/scs-XXXX-v1-volume-backup-service.md (outdated, resolved)
Standards/scs-XXXX-v1-volume-backup-service.md (outdated, resolved)
Standards/scs-XXXX-v1-volume-backup-service.md (outdated, resolved)
Standards/scs-XXXX-v1-volume-backup-service.md (outdated, resolved)
Standards/scs-XXXX-v1-volume-backup-service.md (outdated, resolved)
In an SCS cloud, the volume backup functionality MUST be configured properly and its API as defined per `/v3/{project_id}/backups` MUST be offered to customers.
If using Cinder, a suitable [backup driver](https://docs.openstack.org/cinder/latest/configuration/block-storage/backup-drivers.html) MUST be set up.

The volume backup target storage SHOULD be a separate storage system from the one used for volumes themselves.
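
For illustration, a rough sketch of how a client might probe whether the backup API from the excerpt above is offered. This is not the conformance test from this PR. The helper names, the endpoint normalization, and the idea that the caller already holds a Keystone token and knows the volume service endpoint are all assumptions; only the `/v3/{project_id}/backups` path comes from the draft text.

```python
import json
import urllib.error
import urllib.request


def backups_url(volume_endpoint, project_id):
    """Build the backup listing URL for a block storage endpoint.

    Assumption: the catalog endpoint may or may not already end in
    /v3 or /v3/<project_id>, so all three shapes are normalized here.
    """
    base = volume_endpoint.rstrip("/")
    if base.endswith(f"/v3/{project_id}"):
        return f"{base}/backups"
    if base.endswith("/v3"):
        return f"{base}/{project_id}/backups"
    return f"{base}/v3/{project_id}/backups"


def http_get(url, token):
    """GET the URL with a Keystone token; return (status code, raw body)."""
    request = urllib.request.Request(
        url, headers={"X-Auth-Token": token, "Accept": "application/json"}
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as exc:
        # Non-2xx answers still carry a status code worth inspecting.
        return exc.code, exc.read()


def backup_api_offered(volume_endpoint, project_id, token, get=http_get):
    """Return True if listing backups answers 200 with a "backups" list."""
    status, body = get(backups_url(volume_endpoint, project_id), token)
    if status != 200:
        return False
    try:
        return isinstance(json.loads(body).get("backups"), list)
    except (ValueError, AttributeError):
        return False
```

A successful listing only shows that the API route answers. It does not show that a working backup driver sits behind it, and it cannot reveal whether the backup target is a separate storage system, so actually creating and restoring a backup remains the stronger check.
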
Contributor

As a user, it is not possible to know whether the volume backup storage is different from the normal volume storage, so we should encourage CSPs to give that information to customers. This could also be done via Gaia-X Credentials, couldn't it? Maybe we need another issue for this part.

Contributor Author

This would require a deep dive into the Gaia-X Ontology to figure out whether this can be expressed using Gaia-X Credentials (formerly Self-Descriptions) appropriately.

As long as we don't have a standardized and proven way of representing arbitrary self-description information about an SCS cloud and its services in Gaia-X Credentials, I'm hesitant to add any such suggestions to the standard yet.

Contributor

Yeah, I thought so. Maybe we can discuss in the next Standardization SIG meeting, or tomorrow, whether and how we can use Gaia-X Credentials for tests.

@anjastrunk anjastrunk added the standards (Issues / ADR / pull requests relevant for standardization & certification) and SCS-VP10 (Related to tender lot SCS-VP10) labels Sep 2, 2024
Contributor

@josephineSei josephineSei left a comment

This looks good to me.
Someone from a CSP or from the IaaS team should also approve.

@markus-hentsch markus-hentsch merged commit e9d26bc into main Sep 17, 2024
6 of 7 checks passed
@markus-hentsch markus-hentsch deleted the issue/541-volume-backup-standard branch September 17, 2024 08:27
Labels
SCS-VP10 (Related to tender lot SCS-VP10), standards (Issues / ADR / pull requests relevant for standardization & certification)
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

IaaS standard on user data backup
5 participants