-
Notifications
You must be signed in to change notification settings - Fork 376
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
tetragon: Add docs for tetragon performance stats
Adding details on 'tetra debug progs' and overhead metrics. Signed-off-by: Jiri Olsa <[email protected]>
- Loading branch information
Showing
1 changed file
with
172 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,172 @@ | ||
--- | ||
title: "BPF programs statistics" | ||
weight: 5 | ||
description: "Monitor bpf programs statistics" | ||
--- | ||
|
||
This page shows you how to monitor bpf programs statistics. | ||
|
||
## Concept | ||
|
||
The BPF subsystem provides performance data for each loaded program and tetragon | ||
exports that in metrics or display that in terminal in top like tool. | ||
|
||
## In terminal | ||
|
||
The tetra command allows to display loaded BPF programs in terminal with: | ||
|
||
``` | ||
# tetra debug progs | ||
``` | ||
|
||
The default output shows tetragon programs only and looks like: | ||
|
||
``` | ||
2024-10-31 11:12:45.94715546 +0000 UTC m=+8.038098448 | ||
Ovh(%) Id Cnt Time Name Pin | ||
0.00 22201 0 0 event_execve /sys/fs/bpf/tetragon/__base__/event_execve/prog | ||
0.00 22198 0 0 event_exit_acct_process /sys/fs/bpf/tetragon/__base__/event_exit/prog | ||
0.00 22200 0 0 event_wake_up_new_task /sys/fs/bpf/tetragon/__base__/kprobe_pid_clear/prog | ||
0.00 22207 0 0 tg_cgroup_rmdir /sys/fs/bpf/tetragon/__base__/tg_cgroup_rmdir/prog | ||
0.00 22206 0 0 tg_kp_bprm_committing_creds /sys/fs/bpf/tetragon/__base__/tg_kp_bprm_committing_creds/prog | ||
0.00 22221 0 0 generic_kprobe_event /sys/fs/bpf/tetragon/syswritefollowfdpsswd/generic_kprobe/__x64_sys_close/prog | ||
0.00 22225 0 0 generic_kprobe_event /sys/fs/bpf/tetragon/syswritefollowfdpsswd/generic_kprobe/__x64_sys_write/prog | ||
0.00 22211 0 0 generic_kprobe_event /sys/fs/bpf/tetragon/syswritefollowfdpsswd/generic_kprobe/fd_install/prog | ||
``` | ||
|
||
The fields have following meaning: | ||
|
||
- `Ovh` is system wide overhead of the BPF program | ||
- `Id` is global BPF ID of the program (as shown by `bpftool prog`) | ||
- `Cnt` is count with number of BPF program executions | ||
- `Time` is sum of the time of all BPF program executions | ||
- `Pin` is BPF program pin path in bpfffs | ||
|
||
It's possible to display all BPF programs with `--all`: | ||
|
||
``` | ||
# tetra debug progs --all | ||
``` | ||
|
||
That has following output: | ||
|
||
``` | ||
2024-10-31 11:19:37.720137195 +0000 UTC m=+7.165535117 | ||
Ovh(%) Id Cnt Time Name Pin | ||
0.00 159 2 82620 event_execve - | ||
0.00 171 68 18564 iter - | ||
0.00 158 2 10170 event_wake_up_n - | ||
0.00 164 2 4254 tg_kp_bprm_comm - | ||
0.00 157 2 3868 event_exit_acct - | ||
0.00 97 2 1680 - | ||
0.00 35 2 1442 - | ||
0.00 83 0 0 sd_devices - | ||
0.00 9 0 0 - | ||
0.00 7 0 0 - | ||
0.00 8 0 0 - | ||
0.00 87 0 0 sd_devices - | ||
... | ||
``` | ||
|
||
Above commands should run properly on top of the tetragon sources. | ||
At the moment to run it properly under K8S you need to specify extra | ||
directory flags: | ||
|
||
``` | ||
# kubectl exec -ti -n kube-system tetragon-66rk4 -c tetragon -- tetra debug progs --bpf-dir /run/cilium/bpffs/tetragon/ --all --bpf-lib /var/lib/tetragon/ | ||
``` | ||
|
||
Note that there are other options to customize the behaviour: | ||
|
||
``` | ||
# tetra debug progs --help | ||
Retrieve information about BPF programs on the host. | ||
Examples: | ||
- tetragon BPF programs top style | ||
# tetra debug progs | ||
- all BPF programs top style | ||
# tetra debug progs --all | ||
- one shot mode (displays one interval data) | ||
# tetra debug progs --once | ||
- change interval to 10 seconds | ||
# tetra debug progs --timeout 10 | ||
- change interval to 10 seconds in one shot mode | ||
# tetra debug progs --once --timeout 10 | ||
Usage: | ||
tetra debug progs [flags] | ||
Aliases: | ||
progs, top | ||
Flags: | ||
--all Get all programs | ||
--bpf-dir string Location of bpffs tetragon directory (default "/sys/fs/bpf/tetragon") | ||
--bpf-lib string Location of Tetragon libs (btf and bpf files) (default "bpf/objs/") | ||
-h, --help help for progs | ||
--no-clear Do not clear screen between rounds | ||
--once Run in one shot mode | ||
--timeout int Interval in seconds (delay in one shot mode) (default 1) | ||
``` | ||
|
||
## Metrics | ||
|
||
The BPF subsystem provides performance data for each loaded program | ||
and tetragon exports that in metrics. | ||
|
||
For each loaded BPF program we get: | ||
- `run count` which counts how many times the BPF program was executed | ||
- `run time` which sums the time BPF program spent in all its executions | ||
|
||
|
||
Hence for each loaded BPF program we export 2 related metrics: | ||
|
||
- `tetragon_overhead_time_program_total[namespace,policy,sensor,attach]` | ||
- `tetragon_overhead_cnt_program_total[namespace,policy,sensor,attach]` | ||
|
||
|
||
Each loaded program is identified by labels: | ||
|
||
- `namespace` is policy K8S namespace | ||
- `policy` is policy name | ||
- `sensor` is sensor name | ||
- `attach` is program attachment name | ||
|
||
|
||
If we have `generic_kprobe` sensor attached on `__x64_sys_close` kernel function | ||
under `syswritefollowfdpsswd` policy, the related metrics will look like: | ||
|
||
``` | ||
tetragon_overhead_program_runs_total{attach="__x64_sys_close",policy="syswritefollowfdpsswd",policy_namespace="",sensor="generic_kprobe"} 15894 | ||
tetragon_overhead_program_seconds_total{attach="__x64_sys_close",policy="syswritefollowfdpsswd",policy_namespace="",sensor="generic_kprobe"} 1.03908217e+08 | ||
``` | ||
|
||
|
||
## Limitations | ||
|
||
Note that the BPF programs statistics are not enabled by default, because they introduce extra overhead, | ||
so it's necessary to enable them manually. | ||
|
||
- Either with `sysctl`: | ||
|
||
``` | ||
# sysctl kernel.bpf_stats_enabled=1 | ||
``` | ||
|
||
and make sure you disable the stats when it's no longer needed: | ||
|
||
``` | ||
# sysctl kernel.bpf_stats_enabled=0 | ||
``` | ||
|
||
- Or with following `tetra` command: | ||
|
||
``` | ||
# tetra debug enable-stats | ||
^C | ||
``` | ||
|
||
where the stats are enabled as long as the command is running (sleeping really). |