Warn users that kernel headers are missing during the Pixie install process #2051
ddelnano added the kind/feature (New feature or request) and area/deployment (Issues related to deployments) labels on Dec 2, 2024
ddelnano added a commit that referenced this issue on Dec 11, 2024
… to `GetAgentStatus` (#2052)

Summary: Add UDTF that detects linux kernel header installation and add column to `GetAgentStatus`

This is a prerequisite to accomplish #2051. The `px deploy` command uses the GetAgentStatus UDTF in its final [healthcheck step](https://github.com/pixie-io/pixie/blob/854062111cf4b91a40649a2e2647c88c0a68b0db/src/pixie_cli/pkg/cmd/deploy.go#L607-L613). With this kernel header detection in place, the `px` cli can use the results from the `px/agent_status` script to print a warning message if kernel headers aren't detected.

The helm install flow needs to be covered as well. My hope is that this UDTF could be used for that use case as well, but I need to further investigate the details of that.

Relevant Issues: #2051

Type of change: /kind feature

Test Plan: Skaffolded to a Ubuntu GKE cluster and tested the following
- [x] Kelvin always reports `false` as it doesn't bind mount `/` to `/host`
- [x] PEM running on host without `linux-headers-$(uname -r)` package reports `false`
- [x] PEM running on host with `linux-headers-$(uname -r)` package reports `true`
```
$ gcloud compute ssh gke-dev-cluster-ddelnano-default-pool-a27c1ac2-x5k2 --internal-ip -- 'ls -alh /lib/modules/$(uname -r)/build'
lrwxrwxrwx 1 root root 38 Aug 9 15:25 /lib/modules/5.15.0-1065-gke/build -> /usr/src/linux-headers-5.15.0-1065-gke
$ gcloud compute ssh gke-dev-cluster-ddelnano-default-pool-a27c1ac2-j6pg --internal-ip -- 'ls -alh /lib/modules/$(uname -r)/build'
ls: cannot access '/lib/modules/5.15.0-1065-gke/build': No such file or directory
```
![Screen Shot 2024-12-02 at 9 30 29 AM](https://github.com/user-attachments/assets/9fa862f8-5a6c-46d6-8899-bfaf2bdf3371)

Changelog Message: Add `GetLinuxHeadersStatus` UDTF and add `kernel_headers_installed` column to `GetAgentStatus`

---------

Signed-off-by: Dom Del Nano <[email protected]>
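For readers who want a feel for the check described in the test plan above, here is a minimal Python sketch. It is not Pixie's implementation. It assumes the host filesystem is bind mounted at `/host` (as the test plan implies for PEMs) and that headers count as installed when `/lib/modules/<release>/build` resolves to a directory on the host. The function name and parameters are hypothetical.

```python
import os


def kernel_headers_installed(host_root="/host", release=None):
    """Return True if <host_root>/lib/modules/<release>/build resolves to a directory.

    Assumption: the host's root filesystem is mounted at host_root.
    """
    release = release or os.uname().release
    build = os.path.join(host_root, "lib/modules", release, "build")

    # lexists() does not follow symlinks, so a link with a dangling target still counts here.
    if not os.path.lexists(build):
        return False

    if os.path.islink(build):
        target = os.readlink(build)
        if os.path.isabs(target):
            # An absolute target such as /usr/src/linux-headers-... refers to the
            # host's filesystem, so re-root it under host_root before checking it.
            target = os.path.join(host_root, target.lstrip("/"))
        else:
            target = os.path.join(os.path.dirname(build), target)
        return os.path.isdir(target)

    return os.path.isdir(build)
```

Re-rooting the symlink target matters because, inside a container, an absolute link would otherwise be resolved against the container's own filesystem rather than the host's.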
This was referenced Dec 14, 2024
aimichelle pushed a commit that referenced this issue on Dec 16, 2024
…ing (#2061)

Summary: Update `GetAgentStatus` and kernel header UDTF to allow kelvin filtering

In order to leverage the `GetAgentStatus`'s `kernel_headers_installed` column for #2051, it would be convenient for the UDTF to provide the ability to filter kelvins out, since they don't have access to kernel headers because they don't have the host filesystem volume mounted. This change introduces an `include_kelvin` init argument to the UDTFs with a default of `true` to preserve the existing behavior.

This change also fixes a bug with UDTF's init arg default values, which didn't work prior to this change. Please review commit by commit to see the default arg bug fix followed by the UDTF changes.

Relevant Issues: #2051

Type of change: /kind bug

Test Plan: New logical planner test no longer fails with the following error
```
$ bazel test -c opt src/carnot/planner:logical_planner_test --test_output=all
[ RUN ] LogicalPlannerTest.one_pems_one_kelvin
src/carnot/planner/logical_planner_test.cc:64: Failure
Value of: IsOK(::px::StatusAdapter(__status_or_value__64))
Actual: false (Invalid Argument : DATA_TYPE_UNKNOWN not handled as a default value)
Expected: true
```
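The effect of an `include_kelvin` argument that defaults to `true` can be illustrated with a small Python model. This is only a sketch of the filtering semantics described in the commit message, not the UDTF itself. The `AgentStatus` class, its `hostname` and `is_kelvin` fields, and the `get_agent_status` function are hypothetical; only `kernel_headers_installed` and `include_kelvin` come from the commit.

```python
from dataclasses import dataclass


@dataclass
class AgentStatus:
    hostname: str                    # hypothetical field
    is_kelvin: bool                  # hypothetical field
    kernel_headers_installed: bool   # column name taken from the commit message


def get_agent_status(agents, include_kelvin=True):
    """Return agent rows, optionally dropping kelvins.

    The default of True keeps every row, which preserves the pre-existing behavior.
    """
    return [a for a in agents if include_kelvin or not a.is_kelvin]


agents = [
    AgentStatus("node-a", is_kelvin=False, kernel_headers_installed=True),
    AgentStatus("kelvin-0", is_kelvin=True, kernel_headers_installed=False),
]
pem_rows = get_agent_status(agents, include_kelvin=False)
```

Filtering kelvins out before checking the column avoids a false alarm, since a kelvin reports `false` regardless of whether the node has headers installed.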
#1986 is another great example of the need for this warning/tooling. That bug lasted from August until now and was caused by missing Amazon Linux headers: they needed to be installed because Pixie's pre-packaged headers resulted in broken Go TLS tracing.
This was referenced Dec 18, 2024
While Pixie has invested in its prepackaged Linux headers and in working without upstream headers, it is highly recommended to install the given distro's kernel header package. Upstream distros patch and backport many changes, which makes the prepackaged option susceptible to issues that are hard to anticipate and work around. Examples of these inconsistencies can be seen in #1863, #252 and the recent openSUSE (#2037) and Amazon Linux 2023 (#1986) issues.
Some of these issues get reported, but my suspicion is that this poor experience causes people to abandon their evaluation of Pixie early, since their initial impression is that the socket tracer isn't functional (the most common way this problem manifests). For example, the openSUSE case mentioned above was only diagnosed through my outreach, and by then the end user had already moved on from evaluating Pixie.
If Pixie had the ability to detect when kernel headers aren't installed, we could warn the user that installing them is recommended and link to common problems caused by the lack of headers. This would give the end user quick feedback on an area that's currently arcane to debug and hopefully prevent people from having a poor experience in these cases.
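As a rough illustration of the proposed warning, the Python sketch below builds a message from agent status rows. It assumes each row is a dictionary carrying a `kernel_headers_installed` value (the column name used in the related commits) plus a hypothetical `hostname` key. The function name and the message wording are made up for the example and do not reflect actual `px` output.

```python
def missing_headers_warning(rows):
    """Return a warning string naming hosts without kernel headers, or None if all have them."""
    missing = sorted(r["hostname"] for r in rows if not r["kernel_headers_installed"])
    if not missing:
        return None
    return (
        "Warning: kernel headers were not detected on: "
        + ", ".join(missing)
        + ". Installing your distro's kernel header package is recommended."
    )


rows = [
    {"hostname": "node-a", "kernel_headers_installed": True},
    {"hostname": "node-b", "kernel_headers_installed": False},
]
message = missing_headers_warning(rows)
if message:
    print(message)
```

Returning `None` when nothing is missing keeps the healthy path silent, so the warning only appears when a user is likely to hit the problems described above.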