Increase timeout for allow successful data re-balance on VSphere/Azure platforms #9994

Merged
ebenahar merged 2 commits into red-hat-storage:master from data-rebalance-fix
Jul 10, 2024

Conversation

am-agrawa
Contributor

We saw multiple failures of the test test_add_capacity_ui on vSphere IPI with a data re-balance issue, whereas the test passes on AWS IPI. Increasing the re-balance timeout should stabilise it.
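
For context, a timeout like this typically bounds a polling loop: the check is repeated until it succeeds or the time budget runs out. The following is a minimal sketch of that pattern, not the ocs-ci implementation; `wait_until` and all of its parameters are hypothetical names introduced for illustration.

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout: float,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Hypothetical helper: `clock` and `sleep` are injectable so the loop
    can be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            # Budget exhausted before the condition became true.
            return False
        sleep(interval)
```

Under this pattern, a re-balance that needs somewhat longer than 3600 seconds is reported as a failure with the old budget and as a success with a 5400-second budget, even though the cluster behaves identically in both runs.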

@am-agrawa am-agrawa self-assigned this Jun 27, 2024
@am-agrawa am-agrawa requested a review from a team as a code owner June 27, 2024 14:12
@am-agrawa
Contributor Author

am-agrawa commented Jul 9, 2024

@am-agrawa am-agrawa added the Verified label (Mark when PR was verified and log provided) Jul 9, 2024
@@ -112,7 +112,7 @@ def add_capacity_test(ui_flag=False):
verify_storage_device_class(device_class)
verify_device_class_in_osd_tree(ct_pod, device_class)

-check_ceph_health_after_add_capacity(ceph_rebalance_timeout=3600)
+check_ceph_health_after_add_capacity(ceph_rebalance_timeout=5400)
Contributor

It's not mandatory, but if the issue is only with vSphere IPI, maybe we can do something like this:

timeout = 5400 if is_vsphere_ipi_cluster() else 3600
check_ceph_health_after_add_capacity(ceph_rebalance_timeout=timeout)

Contributor Author

No, the data re-balance issue is also seen on vSphere UPI and Azure IPI.

Contributor

Okay
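
If a platform-specific timeout were still wanted, the idea raised in this thread could be extended to cover every platform named here (vSphere IPI/UPI and Azure IPI). The following is a minimal sketch under the assumption that the platform is available as a plain string; the constants and `rebalance_timeout_for` are hypothetical names, not ocs-ci APIs. As far as the diff in this thread shows, the change itself did not take this route and raised the timeout unconditionally.

```python
# Hypothetical sketch: none of these names come from ocs-ci.
DEFAULT_REBALANCE_TIMEOUT = 3600  # seconds, the value before this PR
SLOW_REBALANCE_TIMEOUT = 5400     # seconds, the value introduced by this PR

# Platforms on which this thread reports slow data re-balance.
SLOW_REBALANCE_PLATFORMS = frozenset({"vsphere", "azure"})


def rebalance_timeout_for(platform: str) -> int:
    """Return the Ceph re-balance timeout in seconds for a platform name."""
    if platform.strip().lower() in SLOW_REBALANCE_PLATFORMS:
        return SLOW_REBALANCE_TIMEOUT
    return DEFAULT_REBALANCE_TIMEOUT
```

A single unconditional value is simpler to maintain and to backport, at the cost of a longer worst-case wait on platforms that never needed it.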

@am-agrawa am-agrawa requested a review from a team July 10, 2024 08:59
yitzhak12 previously approved these changes Jul 10, 2024
ebenahar previously approved these changes Jul 10, 2024
@am-agrawa am-agrawa dismissed stale reviews from ebenahar and yitzhak12 via 13a1b2e July 10, 2024 09:23
@am-agrawa am-agrawa force-pushed the data-rebalance-fix branch from 38d5cef to 13a1b2e Compare July 10, 2024 09:23
@openshift-ci openshift-ci bot removed the lgtm label Jul 10, 2024
openshift-ci bot commented Jul 10, 2024

New changes are detected. LGTM label has been removed.

@am-agrawa am-agrawa changed the title from "Increase timeout for allow successful data re-balance on VSphere IPI" to "Increase timeout for allow successful data re-balance on VSphere/Azure platforms" Jul 10, 2024
@am-agrawa am-agrawa added the lgtm label Jul 10, 2024
openshift-ci bot commented Jul 10, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: am-agrawa, ebenahar, prsurve

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ebenahar ebenahar merged commit 495e7c2 into red-hat-storage:master Jul 10, 2024
5 of 6 checks passed
@am-agrawa
Contributor Author

/cherry-pick release-4.15

@am-agrawa
Contributor Author

/cherry-pick release-4.14

@am-agrawa
Contributor Author

/cherry-pick release-4.13

@am-agrawa
Contributor Author

/cherry-pick release-4.12

@openshift-cherrypick-robot
Collaborator

@am-agrawa: #9994 failed to apply on top of branch "release-4.13":

Applying: increase timeout for data re-balance on vsphere ipi
Using index info to reconstruct a base tree...
A	tests/functional/z_cluster/cluster_expansion/test_add_capacity.py
Falling back to patching base and 3-way merge...
Auto-merging tests/manage/z_cluster/cluster_expansion/test_add_capacity.py
CONFLICT (content): Merge conflict in tests/manage/z_cluster/cluster_expansion/test_add_capacity.py
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 increase timeout for data re-balance on vsphere ipi
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-4.13

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-cherrypick-robot
Collaborator

@am-agrawa: new pull request created: #10067

In response to this:

/cherry-pick release-4.15


@openshift-cherrypick-robot
Collaborator

@am-agrawa: new pull request created: #10068

In response to this:

/cherry-pick release-4.14


@openshift-cherrypick-robot
Collaborator

@am-agrawa: #9994 failed to apply on top of branch "release-4.12":

Applying: increase timeout for data re-balance on vsphere ipi
Using index info to reconstruct a base tree...
A	tests/functional/z_cluster/cluster_expansion/test_add_capacity.py
Falling back to patching base and 3-way merge...
Auto-merging tests/manage/z_cluster/cluster_expansion/test_add_capacity.py
CONFLICT (content): Merge conflict in tests/manage/z_cluster/cluster_expansion/test_add_capacity.py
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 increase timeout for data re-balance on vsphere ipi
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-4.12


amr1ta pushed a commit to amr1ta/ocs-ci that referenced this pull request Jul 25, 2024
Labels
lgtm, size/XS, Squad/Brown, Verified (Mark when PR was verified and log provided)
5 participants