Perform Mon & OSD failure tests in Stretch cluster #9319

Merged
merged 6 commits into from
Jul 19, 2024

Conversation

mashetty330
Contributor

No description provided.


request.addfinalizer(finalizer)


Contributor

Please add all the relevant markers: Polarion ID, stretch cluster required, LSO required, etc.

def test_single_mon_failures(self):
    """
    Test mon failure with IO in the background

Contributor

Please add the detailed steps.

@mashetty330 mashetty330 requested a review from a team as a code owner July 8, 2024 08:44

@ocs-ci ocs-ci left a comment

PR validation on existing cluster

Cluster Name: mashetty-stj08
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/functional/disaster-recovery/sc_arbiter/test_mon_osd_failures.py
Additional Test Params:
OCP VERSION: 4.16
OCS VERSION: 4.16
tested against branch: master

Job UNSTABLE (some or all tests failed).

@ocs-ci ocs-ci left a comment

PR validation on existing cluster

Cluster Name: mashetty-stj08
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/functional/disaster-recovery/sc_arbiter/test_mon_osd_failures.py
Additional Test Params:
OCP VERSION: 4.16
OCS VERSION: 4.16
tested against branch: master

Job UNSTABLE (some or all tests failed).

@ocs-ci ocs-ci left a comment

PR validation on existing cluster

Cluster Name: mashetty-stj08
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/functional/disaster-recovery/sc_arbiter/test_mon_osd_failures.py
Additional Test Params:
OCP VERSION: 4.16
OCS VERSION: 4.16
tested against branch: master

Job UNSTABLE (some or all tests failed).

Signed-off-by: Mahesh Shetty <[email protected]>
Signed-off-by: Mahesh Shetty <[email protected]>

@ocs-ci ocs-ci left a comment

PR validation on existing cluster

Cluster Name: mashetty-stj08
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/functional/disaster-recovery/sc_arbiter/test_mon_osd_failures.py
Additional Test Params:
OCP VERSION: 4.16
OCS VERSION: 4.16
tested against branch: master

Job UNSTABLE (some or all tests failed).

@ocs-ci ocs-ci left a comment

PR validation on existing cluster

Cluster Name: mashetty-stj08
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/functional/disaster-recovery/sc_arbiter/test_mon_osd_failures.py
Additional Test Params:
OCP VERSION: 4.16
OCS VERSION: 4.16
tested against branch: master

Job PASSED.

Signed-off-by: Mahesh Shetty <[email protected]>
@mashetty330 mashetty330 added the Verified label (Mark when PR was verified and log provided) Jul 10, 2024

@ocs-ci ocs-ci left a comment

PR validation on existing cluster

Cluster Name: mashetty-stj08
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/functional/disaster-recovery/sc_arbiter/test_mon_osd_failures.py
Additional Test Params:
OCP VERSION: 4.16
OCS VERSION: 4.16
tested against branch: master

Job UNSTABLE (some or all tests failed).

@ocs-ci ocs-ci left a comment

PR validation on existing cluster

Cluster Name: mashetty-stj08
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/functional/disaster-recovery/sc_arbiter/test_mon_osd_failures.py
Additional Test Params:
OCP VERSION: 4.16
OCS VERSION: 4.16
tested against branch: master

Job UNSTABLE (some or all tests failed).


def finalizer():
    """
    Check for data loss, data corruption at the end of the tests

Contributor

Instead of adding it in the finalizer, we can make a common function so that everyone can use it in the future.

Contributor Author

This is already available as a common function in the Stretchcluster class. Here I used a finalizer in a class-scoped fixture because I want the check to happen once all the test executions are completed, which avoids repetitive checks after each test execution.

Contributor

okay

logger.info(
    "Some app pods are not running, so trying the work-around to make them `Running`"
)
pods_not_running = get_not_running_pods(

Contributor

Even if the pods show the known errors mentioned in the recover_workload_pods_post_recovery function, pods that are not running could also be failing because of a real issue that produces the same errors.

Contributor Author

Didn't get you, @avd-sagare.

Contributor

I mean the pods' error message could be the same as the ones mentioned in the recover_workload_pods_post_recovery function, but the reason could be different.

Contributor Author

If the workaround works, then the reason for the issue is the same. If it doesn't, then the reason is different, and in that case the test will fail.

pods_not_running = get_not_running_pods(
    namespace=constants.STRETCH_CLUSTER_NAMESPACE
)
recover_workload_pods_post_recovery(sc_obj, pods_not_running)

Contributor

Does this function handle the case where a pod has failed with an error that is not listed?

Contributor Author

Yeah, the test would fail if the pods are still not running even after the workaround is applied.
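
The reasoning in this thread (apply the workaround, then re-check, and fail if anything is still down) can be sketched as follows. This is only an illustration under stated assumptions: `recover_then_verify` and the `fake_*` helpers are hypothetical names, and the two callables are injected stand-ins for `get_not_running_pods` and `recover_workload_pods_post_recovery`, whose real behaviour is not reproduced here.

```python
def recover_then_verify(get_not_running_pods, apply_workaround):
    """Apply the workaround once, then fail if any pod is still not running."""
    not_running = get_not_running_pods()
    if not not_running:
        return  # nothing to recover
    apply_workaround(not_running)
    still_not_running = get_not_running_pods()
    if still_not_running:
        # The workaround did not help, so the cause is not the known one.
        raise AssertionError(
            f"Pods still not running after workaround: {still_not_running}"
        )


# Hypothetical in-memory cluster state used only for this sketch.
pod_states = {"logwriter-1": "Running", "logreader-1": "Error"}


def fake_get_not_running_pods():
    return sorted(name for name, state in pod_states.items() if state != "Running")


def fake_workaround_known_cause(pods):
    # Simulates the known issue: the workaround brings the pods back.
    for name in pods:
        pod_states[name] = "Running"


def fake_workaround_no_effect(pods):
    # Simulates a different root cause: the workaround changes nothing.
    pass


# Known cause: the workaround succeeds and no error is raised.
recover_then_verify(fake_get_not_running_pods, fake_workaround_known_cause)
```

Passing `fake_workaround_no_effect` instead, with a pod left in `Error`, raises `AssertionError`. That mirrors the author's point that an unlisted or different failure cause still fails the test.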


@turquoise_squad
@stretchcluster_required
@pytest.mark.usefixtures("setup_logwriter_workloads")

Contributor

No Bugzilla ID?

Contributor Author

Why Bugzilla?

Contributor Author

This is not automation for a customer bug.

Contributor

Aha, right.

Signed-off-by: Mahesh Shetty <[email protected]>
Signed-off-by: Mahesh Shetty <[email protected]>

@ocs-ci ocs-ci left a comment

PR validation on existing cluster

Cluster Name: mashetty-stj16
Cluster Configuration:
PR Test Suite:
PR Test Path: tests/functional/disaster-recovery/sc_arbiter/test_mon_osd_failures.py
Additional Test Params:
OCP VERSION: 4.17
OCS VERSION: 4.17
tested against branch: master

Job PASSED.

@mashetty330 mashetty330 self-assigned this Jul 17, 2024
@mashetty330 mashetty330 requested a review from Akarsha-rai July 17, 2024 09:35

openshift-ci bot commented Jul 19, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Akarsha-rai, avd-sagare, mashetty330, PrasadDesala

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@PrasadDesala PrasadDesala merged commit 4120623 into red-hat-storage:master Jul 19, 2024
6 of 7 checks passed
amr1ta pushed a commit to amr1ta/ocs-ci that referenced this pull request Jul 25, 2024
Labels
lgtm, size/L (PR that changes 100-499 lines), team/e2e (E2E team related issues/PRs), Verified (Mark when PR was verified and log provided)

6 participants