Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add flag --enable_online_fs_expansion: control whether pods that refe… #238

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

huangzhiyong
Copy link

…rence the resized volume need to be restarted.

Kubernetes v1.11 also introduces an alpha feature called online file system expansion. This feature enables file system expansion while a volume is still in-use by a pod. Because this feature is alpha, it requires enabling the feature gate, ExpandInUsePersistentVolumes. It is supported by the in-tree volume plugins GCE-PD, AWS-EBS, Cinder, and Ceph RBD. When this feature is enabled, pod referencing the resized volume do not need to be restarted. Instead, the file system will automatically be resized while in use as part of volume expansion. File system expansion does not happen until a pod references the resized volume, so if no pods referencing the volume are running file system expansion will not happen.

@huangzhiyong
Copy link
Author

#237

@deepthi
Copy link
Collaborator

deepthi commented Dec 9, 2021

The DCO needs to be fixed. If the reviewers like the change, we can run CI.

…rence the resized volume need to be restarted.

Kubernetes v1.11 also introduces an alpha feature called online file system expansion. This feature enables file system expansion while a volume is still in-use by a pod. Because this feature is alpha, it requires enabling the feature gate, ExpandInUsePersistentVolumes. It is supported by the in-tree volume plugins GCE-PD, AWS-EBS, Cinder, and Ceph RBD. When this feature is enabled, pod referencing the resized volume do not need to be restarted. Instead, the file system will automatically be resized while in use as part of volume expansion. File system expansion does not happen until a pod references the resized volume, so if no pods referencing the volume are running file system expansion will not happen.

Signed-off-by: zhiyong.huang <[email protected]>
@huangzhiyong
Copy link
Author

DCO -> fixed

Copy link
Collaborator

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution, @huangzhiyong ! Aside from my 1 comment/request, it makes sense to me. I'll let @dctrwatson comment though as he's much more familiar with the operator than I am at this point.

@@ -50,6 +50,7 @@ const (
var (
maxConcurrentReconciles = flag.Int("vitessshard_concurrent_reconciles", 10, "the maximum number of different vitessshards to reconcile concurrently")
resyncPeriod = flag.Duration("vitessshard_resync_period", 30*time.Second, "reconcile vitessshards with this period even if no Kubernetes events occur")
onlineFileSystemExpansion = flag.Bool("enable_online_fs_expansion", true, "if true, pod referencing the resized volume do not need to be restarted, but provided that the volume plug-in supports, such as GCE-PD, AWS-EBS, Cinder, and Ceph RBD")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we should make this behavior opt-in because we have no way to know that this will work in a given environment, right? That would mean making the default false. I would recommend:

onlineFileSystemExpansion = flag.Bool("enable_online_fs_expansion", false, "if true, pods referencing the resized volume do not need to be restarted; you must ensure that the CSI driver(s) in use support online resizing, e.g. GCE-PD, AWS-EBS, Cinder, or Ceph RBD")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can at least wait 2 releases before changing the default

@@ -50,6 +50,7 @@ const (
var (
maxConcurrentReconciles = flag.Int("vitessshard_concurrent_reconciles", 10, "the maximum number of different vitessshards to reconcile concurrently")
resyncPeriod = flag.Duration("vitessshard_resync_period", 30*time.Second, "reconcile vitessshards with this period even if no Kubernetes events occur")
onlineFileSystemExpansion = flag.Bool("enable_online_fs_expansion", true, "if true, pod referencing the resized volume do not need to be restarted, but provided that the volume plug-in supports, such as GCE-PD, AWS-EBS, Cinder, and Ceph RBD")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can at least wait 2 releases before changing the default

@@ -105,7 +105,9 @@ func (r *ReconcileVitessShard) reconcileDisk(ctx context.Context, vts *planetsca

// If disk size has changed and the changes are all ready, mark the shard as ready to cascade. Otherwise, skip this.
if anythingChanged {
rollout.Cascade(vts)
if !*onlineFileSystemExpansion {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can move this check to the beginning of reconcileDisk where it checks vts.Spec.UpdateStrategy.Type and just return resultBuilder.Result() because I don't think this function provides any value when using onlineFileSystemExpansion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants