🐛 E2E failures in CI #440

Merged 1 commit on Sep 16, 2024

@Danil-Grigorev Danil-Grigorev commented Sep 13, 2024

What this PR does / why we need it:
This PR fixes e2e test failures related to:

  • Metrics and pod logs collection.
  • MachineDeployment checks for running machines. MachineSets were picked at random, because they are indistinguishable by labels and belong to the same MachineDeployment. This caused flakes: the test expected the old MachineSet to scale, while the new one performed the scaling instead.
  • ClusterClass apply timeouts, which are now increased. CAPD webhooks may take longer to stand up.
  • Failure reason reporting for WaitForClusterToUpgrade, which is now improved.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #439

Special notes for your reviewer:

Checklist:

  • squashed commits into logical changes
  • includes documentation
  • adds unit tests
  • adds or updates e2e tests

@Danil-Grigorev Danil-Grigorev added the kind/bug Something isn't working label Sep 13, 2024
@Danil-Grigorev Danil-Grigorev requested a review from a team as a code owner September 13, 2024 12:51
Removed:
- Metrics and pod logs collection. Crust gather collects logs for all
  resources.

Fixed:
- MachineDeployment checks for running machines. MachineSets were picked
  at random, as they are indistinguishable by labels and belong to the
  same MachineDeployment. This caused flakes: the old MachineSet was
  expected to scale, while the new one performed the scaling instead.
- Increased ClusterClass apply timeouts. CAPD webhooks may take longer
  to stand up.

Signed-off-by: Danil-Grigorev <[email protected]>
@Danil-Grigorev Danil-Grigorev merged commit 465f030 into rancher:main Sep 16, 2024
4 checks passed
Development

Successfully merging this pull request may close these issues.

Failing e2e tests on main
3 participants