[release-4.14] OCPBUGS-43743: Soften haproxy timeout for kubeapi probe #4664

Open
wants to merge 1 commit into
base: release-4.14

openshift-cherrypick-robot

This is an automated cherry-pick of #4657

/assign openshift-ci-robot

This PR relaxes the timeouts haproxy uses when deciding whether the
master backend (i.e. the k8s API server) is dead or alive.

The previous probe was relatively strict: it allowed very fast
failover, but it was also very prone to temporary flakiness.

The new configuration aligns haproxy with the readiness probe that k8s
uses when detecting whether a pod is dead or alive. Aligning the two
configurations removes the mismatch where k8s believes the API server
is ready but haproxy sees it as dead.

A consequence of this change is a potential increase in downtime when
an API server is forcefully removed. In the worst case we may see 15
seconds of unavailability. This should be rare in real setups, but it
is noted here for completeness.
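
To make the trade-off concrete, here is a minimal sketch of how the worst-case detection time follows from a probe's interval and failure threshold. The numeric values are assumptions chosen for illustration and are not taken from this PR's diff; `ProbeConfig` and its fields are hypothetical names. The model is simplified: it ignores the per-check timeout and any separate fast or down intervals haproxy may use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeConfig:
    """Simplified health-check model (hypothetical helper, not part of the PR)."""

    # Seconds between consecutive checks
    # (roughly haproxy's "inter" or the k8s probe's "periodSeconds").
    interval_s: float
    # Consecutive failed checks before the backend is marked down
    # (roughly haproxy's "fall" or the k8s probe's "failureThreshold").
    failures_to_down: int

    def worst_case_detection_s(self) -> float:
        # If the backend dies right after a successful check, it is marked
        # down once `failures_to_down` further checks have failed.
        return self.interval_s * self.failures_to_down


# ASSUMED values for illustration only; the real settings live in the PR diff.
strict = ProbeConfig(interval_s=1, failures_to_down=2)
relaxed = ProbeConfig(interval_s=5, failures_to_down=3)

print(f"strict probe:  down after up to {strict.worst_case_detection_s()}s")
print(f"relaxed probe: down after up to {relaxed.worst_case_detection_s()}s")
```

Under these assumed numbers the relaxed probe tolerates two consecutive failed checks without marking the backend down, at the cost of a detection window of up to 15 seconds, which matches the worst case described above.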

Fixes: OCPBUGS-43428
@openshift-ci-robot

@openshift-cherrypick-robot: Detected clone of Jira Issue OCPBUGS-43719 with correct target version. Will retitle the PR to link to the clone.
/retitle [release-4.14] OCPBUGS-43743: Soften haproxy timeout for kubeapi probe

In response to this:

This is an automated cherry-pick of #4657

/assign openshift-ci-robot

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot changed the title from "[release-4.14] OCPBUGS-43719: Soften haproxy timeout for kubeapi probe" to "[release-4.14] OCPBUGS-43743: Soften haproxy timeout for kubeapi probe" Oct 26, 2024
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 26, 2024
@openshift-ci-robot

@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-43743, which is invalid:

  • expected dependent Jira Issue OCPBUGS-43742 to be in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but it is New instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Oct 26, 2024

openshift-ci bot commented Oct 26, 2024

@openshift-cherrypick-robot: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-vsphere-ovn-zones | 21bc3c4 | link | false | /test e2e-vsphere-ovn-zones |
| ci/prow/e2e-gcp-rt | 21bc3c4 | link | false | /test e2e-gcp-rt |
| ci/prow/e2e-gcp-op-single-node | 21bc3c4 | link | true | /test e2e-gcp-op-single-node |
| ci/prow/e2e-gcp-ovn-rt-upgrade | 21bc3c4 | link | false | /test e2e-gcp-ovn-rt-upgrade |
| ci/prow/e2e-gcp-op | 21bc3c4 | link | true | /test e2e-gcp-op |
| ci/prow/okd-scos-e2e-aws-ovn | 21bc3c4 | link | false | /test okd-scos-e2e-aws-ovn |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@mkowalski

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 28, 2024

openshift-ci bot commented Oct 28, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mkowalski, openshift-cherrypick-robot
Once this PR has been reviewed and has the lgtm label, please assign yuqi-zhang for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels
jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
lgtm: Indicates that a PR is ready to be merged.