Add design for only restore volume data. #7481
Conversation
@blackpiglet, can you please rebase the PR? It is currently showing what appear to be many unrelated commits. A few comments/questions:
I personally think that in-place restore is going to be more useful, especially if individual files can be selected for restore. If the whole PVC is being created anyway, users will need to restart pods to attach the PVCs, so it is almost like restoring the pods themselves.
@anshulahuja98
@draghuram First, could you share some thoughts about why in-place restore is more useful? I recently also heard some requirements about GitOps and DevOps pipeline scenarios. In those cases, some tools guarantee that the workload in k8s is always running as expected. In-place restore is needed in those cases, but I don't know whether those are common cases for k8s usage. Second, to answer your question.
design/data_only_restore.md
Outdated
- Can filter the volumes that need data restore.

## Non Goals
- Will not support in-place volume data restore. To achieve data integrity, a new PVC and PV are created, and the restore waits for the user to mount the restored volume.
Heard from the US community meeting that this will be moved into Goals.
We would have to discuss how we're handling volumes that are already mounted, or do we say you have to pre-delete/scale down so volumes are not mounted by restore time?
@kaovilai For in-place, we can't scale down unless we create a dummy pod to mount the PVC, since there must be a mounting pod for kopia to have access to it from the node agent.
We already have a dummy-pod pattern from the data mover work to follow.
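To make the pattern concrete, here is a minimal sketch (not Velero's actual implementation; the helper name, image, and mount path are illustrative) of a pause-style pod that exists only to mount the target PVC so the node agent can reach the volume:

package restorehelper

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newDummyPod builds a pod whose only job is to mount the target PVC so the
// node agent can access the volume data. Names, image, and mount path are
// illustrative, not Velero's actual values.
func newDummyPod(namespace, pvcName string) *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			GenerateName: "restore-helper-",
			Namespace:    namespace,
		},
		Spec: corev1.PodSpec{
			RestartPolicy: corev1.RestartPolicyNever,
			Containers: []corev1.Container{{
				Name:  "pause",
				Image: "registry.k8s.io/pause:3.9",
				VolumeMounts: []corev1.VolumeMount{{
					Name:      "target-volume",
					MountPath: "/restore-target",
				}},
			}},
			Volumes: []corev1.Volume{{
				Name: "target-volume",
				VolumeSource: corev1.VolumeSource{
					PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{
						ClaimName: pvcName,
					},
				},
			}},
		},
	}
}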
// DataOnly specifies whether to only restore volume data and related k8s resources,
// without any other k8s resources.
DataOnly *bool `json:"dataOnly"`
Does this work via CSI only, file system backup only, or both?
We could use a data-mover-like approach where we write to a new PVC using file system backup through a dummy pod.
If we go with the in-place restore, only filesystem backup is supported.
An advanced CSI in-place restore could auto-patch deployments to a new PVC restored from snapshots.
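As a hedged usage sketch only: if the proposed DataOnly field lands as quoted above, a data-only restore request might look roughly like this (the backup and namespace names are made up, and the field does not exist in the current API):

package restorehelper

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	velerov1 "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
)

// newDataOnlyRestore sketches a Restore that asks for volume data only,
// using the DataOnly field proposed in this design (not yet in the API).
func newDataOnlyRestore() *velerov1.Restore {
	dataOnly := true
	return &velerov1.Restore{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "data-only-restore-example",
			Namespace: "velero",
		},
		Spec: velerov1.RestoreSpec{
			BackupName:         "nightly-backup",
			IncludedNamespaces: []string{"app-ns"},
			DataOnly:           &dataOnly, // proposed field from this design
		},
	}
}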
design/data_only_restore.md
Outdated
// DataOnlyPvcMap is a map of the backed-up PVC name to the data-only
// restored PVC name.
DataOnlyPvcMap map[types.NamespacedName]types.NamespacedName `json:"dataOnlyPvcMap,omitempty"`
It should be possible to add a labelSelector to all resources that the user expects to swap to the restored PVC name, and have Velero do the swapping on labeled (core k8s only for now) resources in the namespace.
kubectl label deployment deployName velero.io/sync-restored-pvc-name=<original-pvc-in-backup-name>
Any PVC restored using this method with a new name will have the annotation velero.io/restored-pvc-original-name=<original name>,
which Velero can use as a reference for subsequent backup/restore PVC name syncing.
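A minimal sketch of what that swapping could look like, assuming the label and annotation keys proposed above (the function and its placement are hypothetical, not part of Velero today):

package restorehelper

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
)

const (
	// Keys as proposed in this comment thread.
	syncLabelKey           = "velero.io/sync-restored-pvc-name"
	originalNameAnnotation = "velero.io/restored-pvc-original-name"
)

// swapPVCReferences rewrites a labeled Deployment's volumes so that volumes
// pointing at the original PVC name point at the restored PVC instead,
// matching restored PVCs by their original-name annotation.
func swapPVCReferences(deploy *appsv1.Deployment, restoredPVCs []corev1.PersistentVolumeClaim) {
	originalName, ok := deploy.Labels[syncLabelKey]
	if !ok {
		return // this Deployment did not opt in to name syncing
	}
	for _, pvc := range restoredPVCs {
		if pvc.Annotations[originalNameAnnotation] != originalName {
			continue
		}
		volumes := deploy.Spec.Template.Spec.Volumes
		for i := range volumes {
			src := volumes[i].PersistentVolumeClaim
			if src != nil && src.ClaimName == originalName {
				src.ClaimName = pvc.Name
			}
		}
	}
}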
Since we are going with the in-place approach, the name-mapping logic is unnecessary.
Right, for FSB with deleting the PVC prior to restoring to the same name.
@blackpiglet, GitOps is certainly one reason for in-place restore. Another use case is the need to periodically update data in an alternate standby cluster. Finally, if some application files are deleted or corrupted, the user may only want to restore those files.
@draghuram Is this acceptable? Data-only restore cannot support snapshot-based backups.
FSB only is acceptable, though I think restoring from a CSI snapshot to a new PVC and then patching workloads to use the new PVC could be considered in-place, with the caveat that it requires a pod restart.
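For reference, the "restore a CSI snapshot to a new PVC" step uses the standard Kubernetes dataSource mechanism; a rough sketch (the snapshot name, size, and access mode are illustrative) could look like this:

package restorehelper

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// pvcFromSnapshot builds a new PVC sourced from an existing CSI VolumeSnapshot.
// The workload would then be patched to reference pvcName, which requires a
// pod restart as noted above.
func pvcFromSnapshot(namespace, pvcName, snapshotName string) *corev1.PersistentVolumeClaim {
	apiGroup := "snapshot.storage.k8s.io"
	return &corev1.PersistentVolumeClaim{
		ObjectMeta: metav1.ObjectMeta{
			Name:      pvcName,
			Namespace: namespace,
		},
		Spec: corev1.PersistentVolumeClaimSpec{
			AccessModes: []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
			DataSource: &corev1.TypedLocalObjectReference{
				APIGroup: &apiGroup,
				Kind:     "VolumeSnapshot",
				Name:     snapshotName,
			},
			// VolumeResourceRequirements requires k8s.io/api v0.29+.
			Resources: corev1.VolumeResourceRequirements{
				Requests: corev1.ResourceList{
					corev1.ResourceStorage: resource.MustParse("10Gi"),
				},
			},
		},
	}
}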
Thanks for the quick response; I will follow the filesystem-only way for now.
Codecov Report: all modified and coverable lines are covered by tests ✅
Additional details and impacted files:
@@ Coverage Diff @@
##            main    #7481   +/-   ##
=======================================
  Coverage  61.71%   61.71%
=======================================
  Files        263      263
  Lines      28869    28869
=======================================
  Hits       17816    17816
  Misses      9793     9793
  Partials    1260     1260
=======================================
View the full report in Codecov by Sentry.
I went through the discussion further. With that caveat, I want to push for CSI snapshot-based in-place restore. Here, in-place restore refers mainly to detaching the PVC from the workload, deleting the PVC, and re-creating the PVC. This is useful for disaster recovery scenarios.
This proposal also cannot avoid pod downtime, so I suppose it aims to make data-only restore support more types of backups, right?
If no downtime is a must-have feature, we can achieve that by ignoring the PodVolumeRestore's related pod InitContainer check logic, although that would compromise data integrity, and data writes could fail due to conflicts.
I just noticed that this design doesn't cover the consideration for WaitForFirstConsumer volumes; some discussion is here.
Thank you for contributing to Velero!
Please add a summary of your change: this is the design document for issue #7345.
Does your change fix a particular issue? This PR adds the design for issue #7345.
Please indicate you've done the following:
- Added /kind changelog-not-required as a comment on this pull request.
- Updated the corresponding documentation in site/content/docs/main.