backports/v1.0: Add a metric to provide per-event missed events #1702

Merged
1 commit merged into v1.0 from pr/apapag/backport_1674
Nov 3, 2023

tpapagian
Member

[upstream commit d5a7ee2]

Example:
$ curl localhost:2112/metrics 2> /dev/null | grep 'sent_events_total\|missed_events_total\|ringbuf_perf_event_lost_total\|ringbuf_queue_lost_total\|msg_op_total\|ringbuf_queue_received_total'
tetragon_missed_events_total{msg_op="13"} 73300
tetragon_missed_events_total{msg_op="23"} 28
tetragon_missed_events_total{msg_op="24"} 606
tetragon_missed_events_total{msg_op="5"} 20
tetragon_missed_events_total{msg_op="7"} 22
tetragon_msg_op_total{msg_op="13"} 4.268532e+06
tetragon_msg_op_total{msg_op="23"} 12444
tetragon_msg_op_total{msg_op="24"} 2110
tetragon_msg_op_total{msg_op="5"} 11908
tetragon_msg_op_total{msg_op="7"} 12447
tetragon_ringbuf_perf_event_lost_total 73976
tetragon_ringbuf_queue_lost_total 0
tetragon_ringbuf_queue_received_total 4.307441e+06
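
For readers who want to do the same filtering from a script, here is a minimal Python sketch of the `curl | grep` pipeline above. The helper names `filter_metrics` and `fetch_metrics` are hypothetical and not part of Tetragon; the URL is the one used in the example, and the matching is a plain substring test, which behaves like the `\|` alternation in the grep pattern for these fixed strings.

```python
from urllib.request import urlopen

# Substrings taken from the grep pattern in the example above.
PATTERNS = (
    "sent_events_total",
    "missed_events_total",
    "ringbuf_perf_event_lost_total",
    "ringbuf_queue_lost_total",
    "msg_op_total",
    "ringbuf_queue_received_total",
)


def filter_metrics(text, patterns=PATTERNS):
    """Keep only the lines that contain at least one of the patterns."""
    return [
        line
        for line in text.splitlines()
        if any(pattern in line for pattern in patterns)
    ]


def fetch_metrics(url="http://localhost:2112/metrics"):
    """Download the metrics page (assumes the endpoint is reachable)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    for line in filter_metrics(fetch_metrics()):
        print(line)
```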

This PR adds an eBPF map collector that reads metrics directly from a
map. The map records the return values of all perf_event_output calls
(i.e. whether each call failed), which lets us determine missed events
per event type. The tetragon_missed_events_total metric exposes this
information.
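
To illustrate the idea only, the sketch below models the map in plain Python. This is not the Tetragon implementation: `OutputStatsMap`, `collect_missed_events`, and `render` are hypothetical names, and the real map layout is not described here. The one assumption taken from the kernel API is that perf_event_output returns 0 on success and a negative error code on failure.

```python
from collections import Counter


class OutputStatsMap:
    """Hypothetical stand-in for the eBPF map.

    Counts perf_event_output results per (msg_op, return value).
    """

    def __init__(self):
        self._counts = Counter()

    def record(self, msg_op, ret):
        # ret == 0 means the event was sent; a negative value means it failed.
        self._counts[(msg_op, ret)] += 1

    def items(self):
        return self._counts.items()


def collect_missed_events(stats_map):
    """Return {msg_op: number of failed sends}, one entry per event type."""
    missed = Counter()
    for (msg_op, ret), count in stats_map.items():
        if ret != 0:
            missed[msg_op] += count
    return dict(missed)


def render(missed):
    """Format the counts in the same shape as the example output."""
    return [
        f'tetragon_missed_events_total{{msg_op="{op}"}} {count}'
        for op, count in sorted(missed.items())
    ]
```

Calling `collect_missed_events` at scrape time, rather than pushing a counter on every event, mirrors the collector approach described above: the counts live in the map and are only read when the metrics are requested.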

In the example above, user space reports 73976 lost events
(tetragon_ringbuf_perf_event_lost_total). This matches the sum of all
tetragon_missed_events_total values gathered from the kernel
(73300 + 28 + 606 + 20 + 22 = 73976).
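
The consistency check described above can be reproduced with a short Python sketch. The sample values are copied from the example output; the parser is a simplified reader written for this illustration (the names `parse_metrics`, `missed_by_op`, and `perf_event_lost` are hypothetical) and does not cover the full Prometheus text format.

```python
import re

# Values copied from the example output above.
SAMPLE = """\
tetragon_missed_events_total{msg_op="13"} 73300
tetragon_missed_events_total{msg_op="23"} 28
tetragon_missed_events_total{msg_op="24"} 606
tetragon_missed_events_total{msg_op="5"} 20
tetragon_missed_events_total{msg_op="7"} 22
tetragon_ringbuf_perf_event_lost_total 73976
"""

LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)


def parse_metrics(text):
    """Return (name, labels, value) tuples, skipping comments and blanks."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = dict(re.findall(r'(\w+)="([^"]*)"', match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def missed_by_op(samples):
    """Per-event-type missed counts reported from the kernel side."""
    return {
        labels["msg_op"]: int(value)
        for name, labels, value in samples
        if name == "tetragon_missed_events_total"
    }


def perf_event_lost(samples):
    """Total lost events as seen from user space."""
    return sum(
        int(value)
        for name, _, value in samples
        if name == "tetragon_ringbuf_perf_event_lost_total"
    )


samples = parse_metrics(SAMPLE)
kernel_total = sum(missed_by_op(samples).values())
print(kernel_total, perf_event_lost(samples))
```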

Signed-off-by: Anastasios Papagiannis <[email protected]>
@tpapagian tpapagian requested a review from a team as a code owner November 2, 2023 12:58
@tpapagian tpapagian requested a review from kevsecurity November 2, 2023 12:58
@tpapagian tpapagian changed the base branch from main to v1.0 November 2, 2023 12:58
@tpapagian tpapagian added the area/metrics (Related to prometheus metrics) and release-note/minor (This PR introduces a minor user-visible change) labels Nov 2, 2023
@tpapagian tpapagian requested a review from kkourt November 3, 2023 08:38
@kkourt kkourt merged commit 9eacb40 into v1.0 Nov 3, 2023
31 of 34 checks passed
@kkourt kkourt deleted the pr/apapag/backport_1674 branch November 3, 2023 14:36