Socket tracer and tcp stats connector attach conflicting BPF probes #2055

Open
ddelnano opened this issue Dec 5, 2024 · 0 comments
Labels
area/datacollector Issues related to Stirling (datacollector)

Comments

ddelnano (Member) commented Dec 5, 2024

The socket tracer and the TCP stats connector can't be enabled together. Since Vizier v0.14.12 (from #1989), the socket tracer attaches a BPF probe to tcp_sendmsg, which conflicts with the TCP stats connector's probe on the same function.

Reproducing the issue

Running the PEM or stirling_wrapper with the CLI flag --stirling_sources=socket_tracer,tcp_stats produces the following error output:

I20241125 20:56:57.037268 3740444 source_connector.cc:35] Initializing source connector: tcp_stats
I20241125 20:56:57.037304 3740444 kernel_version.cc:82] Obtained Linux version string from `uname`: 5.15.0-1067-gke
I20241125 20:56:57.037322 3740444 linux_headers.cc:395] Detected kernel release (uname -r): 5.15.0-1067-gke
I20241125 20:56:57.037364 3740444 linux_headers.cc:206] Using Linux headers from: /lib/modules/5.15.0-1067-gke/build.
I20241125 20:56:57.037444 3740444 bcc_wrapper.cc:166] Initializing BPF program ...
I20241125 20:57:00.180481 3740444 scoped_timer.h:48] Timer(init_bpf_program) : 3.14 s
cannot create /var/tmp/bcc
WARNING: cannot get prog tag, ignore saving source with program tag
cannot attach kprobe, File exists
W20241125 20:57:00.208313 3740444 stirling.cc:416] Source Connector (registry name=tcp_stats) not instantiated, error: Internal : Unable to attach kprobe for tcp_sendmsg using probe_entry_tcp_sendmsg

Background

BCC does support multiple kprobes on the same kernel function from the same process, but only through the perf_event kprobe API. BCC optimistically tries this API first and falls back to the text-based API (/sys/kernel/tracing/kprobe_events). In Pixie's case the text-based method is always used, because Pixie specifies maxactive and the perf_event API doesn't support maxactive (source).

Note: there was an effort to add maxactive support to the perf_event API (kernel patch). The change never made it upstream because of concerns that the newer rethook implementation would render it obsolete.
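To see which probes have been registered through the text-based API, the contents of kprobe_events can be grouped by target function. The following is a minimal diagnostic sketch, not part of Pixie or BCC. It assumes the definition syntax from the kernel's kprobetrace documentation (`p[:[GRP/]EVENT] SYMBOL` and `r[MAXACTIVE][:[GRP/]EVENT] SYMBOL`); the helper names and the sample event names are hypothetical and only illustrate the shape of the data.

```python
import re
from collections import defaultdict

# Matches a probe definition as documented for kprobe_events:
#   p[:[GRP/]EVENT] SYMBOL[+offs]            (entry probe)
#   r[MAXACTIVE][:[GRP/]EVENT] SYMBOL[+0]    (return probe)
_KPROBE_LINE = re.compile(
    r"^(?P<kind>[pr])(?P<maxactive>\d*)"
    r"(?::(?:(?P<group>[^/\s]+)/)?(?P<event>\S+))?"
    r"\s+(?P<target>\S+)"
)


def probes_by_symbol(text):
    """Group kprobe_events definitions by the kernel symbol they target."""
    probes = defaultdict(list)
    for line in text.splitlines():
        match = _KPROBE_LINE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        symbol = match.group("target").split("+")[0]  # drop any +offset
        maxactive = match.group("maxactive")
        probes[symbol].append({
            "kind": "kretprobe" if match.group("kind") == "r" else "kprobe",
            "maxactive": int(maxactive) if maxactive else None,
            "event": match.group("event"),
        })
    return dict(probes)


def read_kprobe_events(path="/sys/kernel/tracing/kprobe_events"):
    """Read the live file; assumes tracefs is mounted here and readable."""
    with open(path) as f:
        return f.read()


# Illustrative input only: these event names are made up for the example.
SAMPLE = """\
p:kprobes/example_entry_probe tcp_sendmsg
r16:kprobes/example_return_probe tcp_sendmsg
p:kprobes/example_other_probe vfs_read
"""

if __name__ == "__main__":
    for symbol, entries in probes_by_symbol(SAMPLE).items():
        print(symbol, entries)
```

On a host running the PEM, passing `read_kprobe_events()` instead of `SAMPLE` (typically as root) would show whether a probe on tcp_sendmsg is already registered before the second source connector starts.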

Solutions to consider

  1. Add a configuration toggle that stops the socket tracer from probing tcp_sendmsg
    • This would be a stopgap for users who want to run both source connectors while a longer-term solution is implemented
  2. Use BCC's kfuncs [1] so that both probes can coexist
    • This requires validating that kfuncs work with multiple probes on the same function, but my initial research suggests this is likely. kfuncs require Linux 5.5+, so how to adopt them needs to be thought through (e.g. tcp stats could be migrated while the socket tracer stays on kprobes, since the former is not enabled in Pixie by default).

[1] Example application using BCC kfunc probe
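The Linux 5.5+ requirement in option 2 implies some form of gating between kfuncs and kprobes. A rough sketch of such a gate, keyed on the `uname -r` string seen in the log above, might look like the following. The function names and the "kfunc"/"kprobe" return values are hypothetical and do not correspond to any existing Pixie flag or API. The version comparison is only a necessary condition: kfunc support also depends on how the kernel was built (for example, BTF availability), so a runtime capability check such as BCC's `BPF.support_kfunc()` in the Python bindings would be the more reliable signal.

```python
import re

# Minimum kernel version for kfunc probes, as stated in this issue.
KFUNC_MIN_VERSION = (5, 5)


def parse_kernel_release(release):
    """Extract (major, minor) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_kfunc_minimum(release):
    """True if the release is at least the minimum version for kfuncs."""
    return parse_kernel_release(release) >= KFUNC_MIN_VERSION


def choose_attach_mechanism(release):
    """Hypothetical policy: prefer kfuncs when the kernel is new enough."""
    return "kfunc" if meets_kfunc_minimum(release) else "kprobe"


if __name__ == "__main__":
    # Kernel release taken from the log output in this issue.
    print(choose_attach_mechanism("5.15.0-1067-gke"))
```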

@ddelnano ddelnano added the area/datacollector Issues related to Stirling (datacollector) label Dec 5, 2024