Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configure detailed metrics #1576

Merged
merged 5 commits into from
Mar 18, 2024
Merged

Conversation

erthalion
Copy link
Contributor

@erthalion erthalion commented Feb 26, 2024

Description

Currently, Collector exposes two types of prometheus metrics: regular aggregates and per-syscall aggregates. The latter one is quite verbose and produces more than half of the whole output (181 out of 284 lines), thus it makes sense to make it configurable to be able to disable in the future.

From one side it would enable us to finally expose Collector prometheus metrics, at the same we still would be able to use per-syscall metrics for troubleshooting in development environment.

Checklist

  • Investigated and inspected CI test results
  • Updated documentation accordingly

Automated testing

  • Added unit tests
  • Added integration tests
  • Added regression tests

Testing Performed

Local testing.

Currently, Collector exposes two types of prometheus metrics: regular
aggregates and per-syscall aggregates. The latter one is quite verbose
and produces more than half of the whole output (181 out of 284 lines),
thus it makes sense to make it configurable to be able to disable in the
future.

From one side it would enable us to finally expose Collector prometheus
metrics, at the same we still would be able to use per-syscall metrics
for troubleshooting in development environment.
@erthalion erthalion requested a review from a team as a code owner February 26, 2024 14:07
Copy link
Collaborator

@Molter73 Molter73 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if it would make sense to have this as a run-time check instead of compile-time. Thinking from a user of collector, the compile time check would mean a new image needs to be pulled and deployed (maybe even built, since I don't expect us to maintain multiple versions of the image), whereas a run time check could be toggled via an env variable, have the container restart and presto! The metrics are now collected or ignored. We could still have it so these metrics are collected by default. WDYT?

@erthalion
Copy link
Contributor Author

Wonder if it would make sense to have this as a run-time check instead of compile-time. Thinking from a user of collector, the compile time check would mean a new image needs to be pulled and deployed (maybe even built, since I don't expect us to maintain multiple versions of the image), whereas a run time check could be toggled via an env variable, have the container restart and presto! The metrics are now collected or ignored. We could still have it so these metrics are collected by default. WDYT?

We can, but to be honest I'm not sure there is a use case for switching between short/detailed metrics mode at runtime. The latter one most likely would be used for low level troubleshooting during development (i.e. when some tricky Falco modifications are not playing nice), but how could they be useful in the field?

@Molter73
Copy link
Collaborator

but how could they be useful in the field?

I assume they could be used to troubleshoot problems with specific syscalls that might be being dropped, but honestly 🤷🏻‍♂️

@erthalion
Copy link
Contributor Author

Ok, don't see why not as well, so lets try it.

Copy link
Collaborator

@Molter73 Molter73 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Left a few comments, but nothing major left IMO.

collector/lib/CollectorStatsExporter.cpp Outdated Show resolved Hide resolved
docs/troubleshooting.md Outdated Show resolved Hide resolved
docs/references.md Outdated Show resolved Hide resolved
erthalion and others added 3 commits March 18, 2024 14:42
Co-authored-by: Mauro Ezequiel Moltrasio <[email protected]>
Co-authored-by: Mauro Ezequiel Moltrasio <[email protected]>
Co-authored-by: Mauro Ezequiel Moltrasio <[email protected]>
@erthalion erthalion merged commit 45bddeb into master Mar 18, 2024
51 checks passed
@erthalion erthalion deleted the feature/detailed-metrics-switch branch March 18, 2024 14:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants