AMD GPU support #20

Open · haampie wants to merge 2 commits into develop
Conversation

@haampie commented Jan 21, 2021

Adds a hook for AMD GPUs, which currently just mounts /dev/dri and /dev/kfd as advocated by AMD.

The hook can be enabled through the following flag:

sarus run --amdgpu [container] [cmd]

It will just fail when /dev/dri or /dev/kfd does not exist or can't be mounted.

- Bind mount /dev/dri and /dev/kfd in the rootfs
- Add the amdgpu hook and install it by default
- Enable amdgpu when --amdgpu is passed
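For reference, a minimal sketch of the bind-mount step described above, calling mount(2) directly. The helper name, the error handling, and the way the container rootfs path is passed in are assumptions for illustration; this is not the actual hook code from this PR.

#include <sys/mount.h>   // mount, MS_BIND, MS_REC
#include <sys/stat.h>    // stat, mkdir
#include <fcntl.h>       // open, O_CREAT
#include <unistd.h>      // close
#include <stdexcept>
#include <string>

// Illustrative only: bind-mount a host device path into the container rootfs.
static void bindMountDevice(const std::string& hostPath, const std::string& rootfs, bool isDirectory) {
    const std::string target = rootfs + hostPath;
    struct stat st;
    if (stat(hostPath.c_str(), &st) != 0) {
        throw std::runtime_error(hostPath + " does not exist on the host");
    }
    if (isDirectory) {
        mkdir(target.c_str(), 0755);                              // create the mount point (EEXIST is fine)
    } else {
        int fd = open(target.c_str(), O_CREAT | O_WRONLY, 0644);  // a file bind mount needs an existing target
        if (fd >= 0) close(fd);
    }
    if (mount(hostPath.c_str(), target.c_str(), nullptr, MS_BIND | MS_REC, nullptr) != 0) {
        throw std::runtime_error("failed to bind-mount " + hostPath + " into the container");
    }
}

// Usage inside the hook would be along the lines of:
//   bindMountDevice("/dev/dri", rootfs, true);
//   bindMountDevice("/dev/kfd", rootfs, false);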
@haampie changed the base branch from master to develop on Jan 21, 2021
@Madeeks self-requested a review on Jan 22, 2021
@Madeeks added the "enhancement" (New feature or request) label on Jan 22, 2021
@Madeeks (Member) left a comment

Hi @haampie, thanks for opening this PR!

The baseline implementation looks good!
I have a few questions for you:

  • If I understand correctly, the code is for single-GPU systems. What would happen on a multi-GPU system?
  • The integration with the NVIDIA Container Toolkit does not require any additional CLI option. Is there a feature of the ROCm environment which could be leveraged to obtain a similar experience? If a user requests GPU hardware (and in this case, a specific GPU architecture) through the workload manager, ideally there should be no need to repeat the request to the container engine.

install(FILES templates/hooks.d/09-slurm-global-sync-hook.json.in DESTINATION ${CMAKE_INSTALL_PREFIX}/etc/hooks.d)
install(FILES templates/hooks.d/11-amdgpu-hook.json.in DESTINATION ${CMAKE_INSTALL_PREFIX}/etc/hooks.d)
Member:

I would not add this hook to a default installation, mainly because it is targeted at very specific hardware, and therefore it should be explicitly chosen by the system administrator (like the MPI and NVIDIA hooks).
Another reason would be that at present we have no way to test it as part of the automated tests.

#include "AmdGpuHook.hpp"

#include <vector>
#include <fstream>
Member:

Are all the headers here needed? For example, I don't think you need fstream, boost/regex, and possibly others.

#define sarus_hooks_amdgpu_AmdGpuHook_hpp

#include <vector>
#include <unordered_map>
Member:

As pointed out for the .cpp file above, could you check if all headers are effectively used?

@haampie (Author) commented Jan 25, 2021

Hi @Madeeks, I haven't tested this for multiple GPUs, but in principle it should work. Every GPU should be listed in /dev/dri/card{n} for n = 0, 1, ..., and this PR mounts /dev/dri entirely.

I'll think about autodetection like we have for NVIDIA GPUs, but I didn't immediately know what to check. AMD likes to install /opt/rocm/bin/hipconfig to check the version of the ROCm libs, but that doesn't imply there are actual GPUs available. Maybe the best approach is to check whether vendor data is available from /dev/dri/card* and/or /dev/kfd.
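One possible shape for such a vendor check (an assumption, not part of this PR): the DRM subsystem exposes the PCI vendor ID of each card under /sys/class/drm/card*/device/vendor, and AMD's vendor ID is 0x1002, so the hook could scan those files instead of probing the ROCm userspace.

#include <filesystem>
#include <fstream>
#include <string>

// Illustrative: does any /sys/class/drm/card* belong to an AMD GPU (PCI vendor 0x1002)?
static bool hostHasAmdGpu() {
    namespace fs = std::filesystem;
    const fs::path drm{"/sys/class/drm"};
    if (!fs::exists(drm)) return false;
    for (const auto& entry : fs::directory_iterator(drm)) {
        const std::string name = entry.path().filename().string();
        if (name.rfind("card", 0) != 0) continue;                // only card0, card1, ...
        std::ifstream vendorFile{entry.path() / "device" / "vendor"};
        std::string vendor;
        if (vendorFile >> vendor && vendor == "0x1002") {        // 0x1002 == AMD
            return true;
        }
    }
    return false;
}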

@haampie (Author) commented Jan 25, 2021

Ok, so the way rocm_agent_enumerator detects AMD GPUs is by calling hsa_iterate_agents, which is available from a Spack package (https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/hsa-rocr-dev/package.py), but it depends on AMD's fork of LLVM :D so it's not a great dependency to just add to Sarus.

Another idea is to check if rocminfo is in the PATH or if /opt/rocm/bin/rocminfo exists, and if so execute it and grep the output for some string. That's a bit ugly, but probably the easiest option.
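A rough sketch of that rocminfo idea (illustrative; the binary path and the matched string are assumptions based on the rocminfo output quoted later in this thread, which prints a "Device Type: GPU" line per GPU agent):

#include <array>
#include <cstdio>
#include <string>

// Illustrative: run rocminfo and look for at least one GPU agent in its output.
static bool rocminfoReportsGpu() {
    FILE* pipe = popen("/opt/rocm/bin/rocminfo 2>/dev/null", "r");
    if (!pipe) return false;
    std::array<char, 512> line{};
    bool found = false;
    while (fgets(line.data(), line.size(), pipe)) {
        const std::string s{line.data()};
        if (s.find("Device Type") != std::string::npos && s.find("GPU") != std::string::npos) {
            found = true;   // e.g. "Device Type:             GPU"
        }
    }
    return pclose(pipe) == 0 && found;
}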

@Madeeks (Member) commented Feb 5, 2021

Let me elaborate a bit more on my question about the hook interface and device selection.

The CUDA runtime uses the CUDA_VISIBLE_DEVICES environment variable to determine the GPU devices applications have access to. The NVIDIA Container Toolkit uses NVIDIA_VISIBLE_DEVICES to determine which GPUs to mount inside the container. By checking for the presence of such variables, Sarus does not need an explicit CLI option to know if the host process is requesting GPU devices (and which ones).

I was wondering if there were analogous variables in the ROCm environment.
A quick search brought me to the following issues: ROCm/ROCm#841, ROCm/ROCm#994
From what I understand there are 2 variables which cover similar roles: HIP_VISIBLE_DEVICES and ROCR_VISIBLE_DEVICES.
I don't have experience with ROCm, so in your opinion, can either of those be used to control hook activation? If so, which one is the most appropriate? How do the numerical IDs in those variables relate to the /dev/dri/* files?

As an additional reference, the GRES plugin of Slurm sets CUDA_VISIBLE_DEVICES to the GPUs allocated by the workload manager. What's the mechanism implemented by Slurm (or other workload managers) to signal allocation of AMD GPUs?
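For reference, a minimal sketch of what env-var-driven activation could look like if one of the ROCm variables turns out to be suitable; which variable is authoritative is exactly the open question here, so both names below are assumptions:

#include <cstdlib>
#include <initializer_list>

// Illustrative: activate the AMD GPU hook only if the host environment requests GPU devices,
// mirroring how NVIDIA_VISIBLE_DEVICES / CUDA_VISIBLE_DEVICES are used on the NVIDIA side.
static bool amdGpuRequested() {
    for (const char* var : {"ROCR_VISIBLE_DEVICES", "HIP_VISIBLE_DEVICES"}) {
        const char* value = std::getenv(var);
        if (value != nullptr && *value != '\0') {
            return true;
        }
    }
    return false;
}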

@haampie (Author) commented Feb 8, 2021

Ah, Ault is configured such that by default you get all GPUs.

$ srun -p amdvega /bin/bash -c 'echo "ROCM_VISIBLE_DEVICES: $ROCR_VISIBLE_DEVICES"; /opt/rocm/bin/rocm_agent_enumerator; ls /dev/dri/card*'
ROCM_VISIBLE_DEVICES: 
gfx000
gfx906
gfx906
gfx906
/dev/dri/card0
/dev/dri/card1
/dev/dri/card2
/dev/dri/card3

$ srun -p amdvega --gres=gpu:1 /bin/bash -c 'echo "ROCM_VISIBLE_DEVICES: $ROCR_VISIBLE_DEVICES"; /opt/rocm/bin/rocm_agent_enumerator; ls /dev/dri/card*'
ROCM_VISIBLE_DEVICES: 0
gfx000
gfx906
/dev/dri/card0
/dev/dri/card1
/dev/dri/card2
/dev/dri/card3

$ srun -p amdvega --gres=gpu:3 /bin/bash -c 'echo "ROCM_VISIBLE_DEVICES: $ROCR_VISIBLE_DEVICES"; /opt/rocm/bin/rocm_agent_enumerator; ls /dev/dri/card*'
ROCM_VISIBLE_DEVICES: 0,1,2
gfx000
gfx906
gfx906
gfx906
/dev/dri/card0
/dev/dri/card1
/dev/dri/card2
/dev/dri/card3

$ srun -p amdvega --gres=gpu:2 /bin/bash -c '/opt/rocm/bin/rocminfo | grep GPU'
  Uuid:                    GPU-3f50506172fc1a63               
  Device Type:             GPU                                
  Uuid:                    GPU-3f4478c172fc1a63               
  Device Type:             GPU                                

$ srun -p amdvega --gres=gpu:2 /bin/bash -c '/opt/rocm/opencl/bin/clinfo | grep Number'
Number of platforms:				 1
Number of devices:				 2

@haampie (Author) commented Feb 8, 2021

So, ROCR_VISIBLE_DEVICES is only set when --gres=gpu[:n] is provided. When it is set, I think it's handled at the software level by the ROCm stack, so we might not want to bother with the bookkeeping of mounting exactly those specific GPUs from /dev/dri, but rather leave that to ROCm. For instance:

$ ROCR_VISIBLE_DEVICES=1,2 sarus run -t --mount=type=bind,src=/dev/kfd,dst=/dev/kfd --mount=type=bind,src=/dev/dri,dst=/dev/dri stabbles/sirius-rocm /opt/spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-9.3.0/rocminfo-4.0.0-lruzhymnjm4hez3jeuyf3kyhmjjloqyp/bin/rocm_agent_enumerator
gfx000
gfx906
gfx906

How about we just unconditionally mount /dev/kfd and /dev/dri when they exist?


Edit: in fact, I find it confusing to mount only a few specific GPUs, because ROCR_VISIBLE_DEVICES=1,2 should then be unset or relabeled to ROCR_VISIBLE_DEVICES=0,1 inside the container:

$ ls /dev/dri/
by-path  card0  card1  card2  card3  renderD128  renderD129  renderD130


$ ROCR_VISIBLE_DEVICES=1,2 sarus run \
  --mount=type=bind,src=/dev/kfd,dst=/dev/kfd \
  --mount=type=bind,src=/dev/dri/renderD129,dst=/dev/dri/renderD129 \
  --mount=type=bind,src=/dev/dri/renderD130,dst=/dev/dri/renderD130 \
  stabbles/sirius-rocm /bin/bash -c '/opt/spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-9.3.0/rocminfo-4.0.0-lruzhymnjm4hez3jeuyf3kyhmjjloqyp/bin/rocminfo'
.. only shows 1 gpu because ROCR_VISIBLE_DEVICES is still 1,2 and the GPUs are labeled 0,1 now ...

$ ROCR_VISIBLE_DEVICES=1,2 sarus run \
  --mount=type=bind,src=/dev/kfd,dst=/dev/kfd \
  --mount=type=bind,src=/dev/dri/renderD129,dst=/dev/dri/renderD129 \
  --mount=type=bind,src=/dev/dri/renderD130,dst=/dev/dri/renderD130 \
  stabbles/sirius-rocm /bin/bash -c 'unset ROCR_VISIBLE_DEVICES && /opt/spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-9.3.0/rocminfo-4.0.0-lruzhymnjm4hez3jeuyf3kyhmjjloqyp/bin/rocminfo'
... shows 2 gpus correctly ...
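A minimal sketch of the relabeling alternative mentioned in the edit above, assuming the hook mounts exactly the devices listed in the host's ROCR_VISIBLE_DEVICES: since the mounted devices are renumbered from zero inside the container, the variable would have to be rewritten to 0,1,...,n-1 (or unset, as in the working example above).

#include <sstream>
#include <string>
#include <vector>

// Illustrative: ROCR_VISIBLE_DEVICES=1,2 on the host becomes 0,1 inside the container
// when only those two devices are bind-mounted.
static std::string relabelVisibleDevices(const std::string& hostValue) {
    // Count the comma-separated device ids requested on the host...
    std::vector<std::string> ids;
    std::stringstream ss{hostValue};
    for (std::string id; std::getline(ss, id, ','); ) {
        if (!id.empty()) ids.push_back(id);
    }
    // ...and emit the container-local numbering 0,1,...,n-1.
    std::string result;
    for (std::size_t i = 0; i < ids.size(); ++i) {
        if (i > 0) result += ",";
        result += std::to_string(i);
    }
    return result;
}

// relabelVisibleDevices("1,2") == "0,1"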
