Use gRPC-based liveness probe instead of tetra status #2478

Merged
merged 2 commits into from
May 29, 2024

Conversation

tpapagian
Member

@tpapagian tpapagian commented May 28, 2024

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request.
  • All code is covered by unit and/or end-to-end tests where feasible.
  • All commits contain a well written commit message including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Are you a user of Tetragon? Please add yourself to the Users doc in the Cilium repository.
  • Thanks for contributing!

Currently, we use the tetra status command to report the status of the Tetragon agent. This adds overhead: the tetra binary carries a lot of additional functionality, which is overkill for status reporting.

Kubernetes, on the other hand, supports liveness probes that query a gRPC-based server. The first patch creates a new gRPC health server to report agent status. The second patch changes the Helm chart to make use of it.

Use gRPC-based liveness probe instead of tetra status.
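
For context, here is a minimal sketch of what a Kubernetes gRPC liveness probe looks like in a container spec, following the Kubernetes documentation on gRPC probes. The container name, port, and timeout below are placeholders chosen for illustration, not values confirmed by this PR.

```yaml
# Hypothetical container spec fragment (illustration only).
containers:
  - name: tetragon            # assumed container name
    livenessProbe:
      grpc:
        port: 6789            # assumed port of the gRPC health server
      timeoutSeconds: 60      # illustrative timeout
```

With a probe like this, the kubelet calls the standard gRPC health checking service (grpc.health.v1.Health/Check) on the given port and treats a SERVING response as healthy, so no extra binary has to run inside the container for each check.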


netlify bot commented May 28, 2024

Deploy Preview for tetragon ready!

Name Link
🔨 Latest commit 4f795cb
🔍 Latest deploy log https://app.netlify.com/sites/tetragon/deploys/6656e59283a32700089f426c
😎 Deploy Preview https://deploy-preview-2478--tetragon.netlify.app

@tpapagian tpapagian force-pushed the pr/apapag/grpc_liveness branch from 658a99a to 513a5c1 Compare May 28, 2024 20:24
@tpapagian tpapagian added the release-note/misc This PR makes changes that have no direct user impact. label May 28, 2024
@tpapagian tpapagian changed the title Pr/apapag/grpc liveness Use gRPC-based liveness probe instead of tetra status May 28, 2024
@tpapagian tpapagian marked this pull request as ready for review May 28, 2024 20:42
@tpapagian tpapagian requested a review from a team as a code owner May 28, 2024 20:42
@tpapagian tpapagian requested a review from mtardy May 28, 2024 20:42
Contributor

@michi-covalent michi-covalent left a comment

@lambdanis have we already started writing a version-specific upgrade guide? it might make sense to document this change, something like:

In v1.2, the tetragon container uses the gRPC liveness probe by default. To continue using "tetra status" for the liveness probe,
specify the tetragon.livenessProbe Helm value. For example:

tetragon:
  livenessProbe:
    timeoutSeconds: 60
    exec:
      command:
        - tetra
        - status
        - --server-address
        - "54321"
        - --retries
        - "5"
cmd/tetragon/main.go Outdated
cmd/tetragon/main.go Outdated
install/kubernetes/tetragon/values.yaml Outdated
@tpapagian tpapagian force-pushed the pr/apapag/grpc_liveness branch 3 times, most recently from 4f795cb to 5caad77 Compare May 29, 2024 08:24
Contributor

@lambdanis lambdanis left a comment

One Helm detail, but looks good, thanks!

@tpapagian tpapagian force-pushed the pr/apapag/grpc_liveness branch from 5caad77 to c693503 Compare May 29, 2024 10:16
Currently, we use the tetra status command to report the status of the
Tetragon agent. This adds overhead: the tetra binary carries a lot of
additional functionality, which is overkill for status reporting.

On the other hand, k8s supports liveness probes that use a gRPC
endpoint (see
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-grpc-liveness-probe).
This patch creates a dedicated gRPC server that reports agent status and
can be used for the liveness probe.

Signed-off-by: Anastasios Papagiannis <[email protected]>
@tpapagian tpapagian force-pushed the pr/apapag/grpc_liveness branch from c693503 to 722cb5c Compare May 29, 2024 10:18
The previous commit introduced a gRPC server that can be used for
the liveness probe. This patch changes the Helm chart to make that the
default instead of the tetra status based liveness probe.

Users can still use the tetra status based liveness probe by
defining a values file similar to:
tetragon:
  livenessProbe:
    timeoutSeconds: 60
    exec:
      command:
        - tetra
        - status
        - --server-address
        - "54321"
        - --retries
        - "5"

Signed-off-by: Anastasios Papagiannis <[email protected]>
@tpapagian tpapagian force-pushed the pr/apapag/grpc_liveness branch from 722cb5c to 9ebe488 Compare May 29, 2024 10:20
@tpapagian tpapagian merged commit 2252857 into main May 29, 2024
43 checks passed
@tpapagian tpapagian deleted the pr/apapag/grpc_liveness branch May 29, 2024 11:29
Labels
release-note/misc This PR makes changes that have no direct user impact.
4 participants