This repository has been archived by the owner on Feb 7, 2024. It is now read-only.

Retracing vmcores in Podman fails #423

Open
mgrabovsky opened this issue Apr 27, 2021 · 4 comments

@mgrabovsky
Contributor

Interactively retracing vmcores in Podman fails with the following message:

crash: /cores/retrace/repos/kernel/x86_64/usr/lib/debug/lib/modules/4.18.0-80.1.2.el8_0.x86_64/vmlinux: No such file or directory

Usage:

  crash [OPTION]... NAMELIST MEMORY-IMAGE[@ADDRESS]	(dumpfile form)
  crash [OPTION]... [NAMELIST]             		(live system form)

Enter "crash -h" for details.

Reported by @DaveWysochanskiRH.

@mgrabovsky mgrabovsky added the bug label Apr 27, 2021
@mgrabovsky mgrabovsky added this to the 2.0.0 milestone Jan 17, 2022
@DaveWysochanskiRH
Collaborator

DaveWysochanskiRH commented Oct 11, 2022

There are a number of issues. I tried this again and no longer see the above vmlinux error, but I ran into other problems:

  1. Need rootless podman. This seems to be somewhat of a mess, but I think it works with the latest upstream.

  2. Copying the vmcore into the container is a problem due to the size of vmcores (often 100GB or more).

  3. Both local users and LDAP users need to be able to use rootless podman. There are some issues with this depending on the version, but I think this is fixed upstream (see https://bugzilla.redhat.com/show_bug.cgi?id=2092629 and related bugs such as https://bugzilla.redhat.com/show_bug.cgi?id=2063750 and https://bugzilla.redhat.com/show_bug.cgi?id=2068088)

  4. Container storage should be set up for non-NFS use (see /etc/containers/storage.conf)

  • Temporary fix: Changed the "graphroot" and "rootless_storage_path" variables to point at a local filesystem directory and manually fixed permissions; "rootless_storage_path" is defined with "$USER" as a directory path component
  5. Issues with 'AuthGroup' where tasks would fail with the following error in the retrace_log
[2022-10-11 10:50:01] [E] Task failed: Unable to build podman container: time="2022-10-11T10:50:01-04:00" level=error msg="running `/usr/bin/newuidmap 1155992 0 174 1 1 231072 65536`: newuidmap: Target process 123456 is owned by a different user: uid:111 pw_uid:111 st_uid:111, gid:5555 pw_gid:111 st_gid:5555\n"
  • Temporary fix: Change the "AuthGroup" value from an LDAP group back to the local group "retrace" (uid/gid == 111/111)
  6. Could not find base container image to build the container
  • Temporary fix: This patch fixed it for me:
@@ -922,7 +922,7 @@ class RetraceWorker:

             try:
                 with (savedir / RetraceTask.CONTAINERFILE).open("w") as cntfile:
-                    cntfile.write(f"FROM {distribution}:{version}\n\n")
+                    cntfile.write(f"FROM ubi{version}/ubi\n\n")
                     cntfile.write("RUN dnf "
                                   f"--releasever={version} "
                                   "--assumeyes "
  7. Could not obtain kernel-debuginfo package
  • Temporary fix: This patch fixed it for me:
@@ -931,7 +931,7 @@ class RetraceWorker:
                                   "shadow-utils && dnf clean all\n")
                     cntfile.write("RUN dnf "
                                   "--assumeyes "
-                                  "--enablerepo=*debuginfo* "
+                                  f"--enablerepo={distribution}-{version}-for-$(uname -m)-baseos-debug-rpms "
                                   "install kernel-debuginfo\n\n")
                     cntfile.write("RUN useradd --no-create-home --no-log-init retrace\n")
                     cntfile.write("RUN mkdir --parents /var/spool/abrt/crash\n\n")
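The two patches above hardcode the base image name and the debuginfo repository inside the worker. As a minimal sketch (not the retrace-server implementation), the same two Containerfile lines could be produced by small helpers so that they are easier to adjust per distribution. The function names and the example value "rhel" for the distribution are assumptions made for illustration; only the string formats are taken from the patches.

```python
def base_image_line(version: str) -> str:
    """Return the FROM line using the UBI image naming from the first patch."""
    return f"FROM ubi{version}/ubi\n\n"


def debuginfo_install_line(distribution: str, version: str) -> str:
    """Return the RUN line that enables one named debuginfo repo.

    $(uname -m) is kept literal on purpose: the shell that runs the
    RUN instruction during the container build expands it.
    """
    repo = f"{distribution}-{version}-for-$(uname -m)-baseos-debug-rpms"
    return (
        "RUN dnf "
        "--assumeyes "
        f"--enablerepo={repo} "
        "install kernel-debuginfo\n\n"
    )


if __name__ == "__main__":
    # "rhel" and "8" are example inputs, not values confirmed by the thread.
    print(base_image_line("8"), end="")
    print(debuginfo_install_line("rhel", "8"), end="")
```

Keeping the repository name in one place makes it simpler to fall back to a different repo naming scheme on distributions where the "baseos-debug-rpms" pattern does not apply.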

After all that I still get this in the log:

[2022-10-13 04:19:04] [E] time="2022-10-13T04:19:04-04:00" level=warning msg="The input device is not a TTY. The --tty and --interactive flags might not work properly"
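Podman prints this warning when --tty is requested while standard input is not a terminal, which is typical for a process started by a web server or a task queue. Whether that is what happens in the worker here is an assumption. A minimal sketch of choosing the flags based on the actual stdin, with a hypothetical helper name:

```python
import sys
from typing import List, Optional, TextIO


def interactive_flags(stdin: Optional[TextIO] = None) -> List[str]:
    """Return podman run flags, adding --tty only when stdin is a terminal."""
    stream = sys.stdin if stdin is None else stdin
    flags = ["--interactive"]
    if stream.isatty():
        flags.append("--tty")
    return flags


if __name__ == "__main__":
    # Prints ['--interactive'] when run from a pipe or a service,
    # and ['--interactive', '--tty'] when run from a terminal.
    print(interactive_flags())
```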

@DaveWysochanskiRH
Collaborator

DaveWysochanskiRH commented Oct 17, 2022

  2. Copying vmcore into container is a problem due to size of vmcores (often 100GB or more).

I don't think we need to copy the vmcore. Instead, we can use "-v" to bind mount the vmcore and vmlinux files, and any other needed paths, into the container.
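A rough sketch of what such an invocation could look like when assembled in Python. The container-side paths under /work, the image name, and the function name are hypothetical; "-v" with a ":ro" suffix and "--rm" are standard podman run options, and the crash arguments follow the NAMELIST MEMORY-IMAGE order from the usage text in the original report.

```python
from typing import List


def podman_crash_args(image: str, vmcore: str, vmlinux: str) -> List[str]:
    """Build a podman run command that bind mounts the files read-only
    instead of copying them into the container."""
    return [
        "podman", "run", "--rm",
        "-v", f"{vmcore}:/work/vmcore:ro",    # large dump stays on the host
        "-v", f"{vmlinux}:/work/vmlinux:ro",  # matching debug kernel
        image,
        "crash", "/work/vmlinux", "/work/vmcore",
    ]


if __name__ == "__main__":
    # Example host paths and image name are placeholders.
    print(" ".join(podman_crash_args(
        "retrace-image", "/host/path/vmcore", "/host/path/vmlinux")))
```

Because nothing is copied, the startup cost no longer depends on the size of the vmcore, which matters for dumps of 100GB or more.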

@DaveWysochanskiRH
Collaborator

  1. Issues with 'AuthGroup' where tasks would fail with the following error in the retrace_log
    [2022-10-11 10:50:01] [E] Task failed: Unable to build podman container: time="2022-10-11T10:50:01-04:00" level=error msg="running /usr/bin/newuidmap 1155992 0 174 1 1 231072 65536: newuidmap: Target process 123456 is owned by a different user: uid:111 pw_uid:111 st_uid:111, gid:5555 pw_gid:111 st_gid:5555\n"

I had AuthGroup set in /etc/retrace-server/retrace-server.conf, which is why I got the above error. I needed to change the primary group of the 'retrace' user in /etc/passwd as follows, and this fixed it. I wonder if that should be a standard procedure for installs when AuthGroup is used?

# usermod -g my-auth-group retrace
# systemctl restart httpd
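If this were to become a standard install step, a small check could report whether a user's primary group already matches the configured AuthGroup. The sketch below uses only the standard pwd and grp modules; the function names are hypothetical, and reading the AuthGroup value from retrace-server.conf is left out.

```python
import grp
import pwd


def primary_group_name(user: str) -> str:
    """Return the name of the user's primary group (the gid field in passwd)."""
    gid = pwd.getpwnam(user).pw_gid
    return grp.getgrgid(gid).gr_name


def primary_group_matches(user: str, expected_group: str) -> bool:
    """True if the user's primary group is the expected one."""
    return primary_group_name(user) == expected_group


if __name__ == "__main__":
    # "retrace" and "my-auth-group" are the names used in the commands above.
    try:
        print(primary_group_matches("retrace", "my-auth-group"))
    except KeyError:
        print("user or group not found on this system")
```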

@DaveWysochanskiRH
Collaborator

  1. Issues with 'AuthGroup' where tasks would fail with the following error in the retrace_log
    [2022-10-11 10:50:01] [E] Task failed: Unable to build podman container: time="2022-10-11T10:50:01-04:00" level=error msg="running /usr/bin/newuidmap 1155992 0 174 1 1 231072 65536: newuidmap: Target process 123456 is owned by a different user: uid:111 pw_uid:111 st_uid:111, gid:5555 pw_gid:111 st_gid:5555\n"

I had AuthGroup set in /etc/retrace-server/retrace-server.conf, which is why I got the above error. I needed to change the primary group of the 'retrace' user in /etc/passwd as follows, and this fixed it. I wonder if that should be a standard procedure for installs when AuthGroup is used?

# usermod -g my-auth-group retrace
# systemctl restart httpd

After the above I'm getting the following error:

[2022-10-17 13:11:52] [E] Task failed: Unable to build podman container: Error: failed to mount overlay for metacopy check with "" options: permission denied
