Hi,

I was testing the `cuda_ipc` UCT (v1.8.0) and noticed that it caches an opened `cuIpcMemHandle` by default, since the `UCX_CUDA_IPC_CACHE` option defaults to `y`. Although this hasn't caused apparent problems in my test environment, I'm worried about potential undefined behavior due to this implementation choice. The CUDA documentation states that "Calling cuMemFree on an exported memory region before calling cuIpcCloseMemHandle in the importing context will result in undefined behavior" (https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MEM.html#group__CUDA__MEM_1ga8bd126fcff919a0c996b7640f197b79). By instrumenting the code I can confirm that CUDA memory allocated by UCX users is indeed freed before the IPC handle is closed, due to the caching mechanism, when running `ucc_perftest`'s allgather/allreduce tests with two processes and the `cuda_ipc` UCT. The perftest's allgather/allreduce pattern allocates a buffer, sends the buffer data to a neighbor via `cuda_ipc`, and frees the buffer, in a loop. After the first iteration, the opened handles are not closed while the allocated buffers are freed.
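For reference, here is a minimal standalone sketch of the ordering I'm describing, outside of UCX. The fork/pipe handshake is just for illustration and is not how UCX exchanges handles; the point is only that the exporter's `cuMemFree` runs while the importer's handle is still open, which is the window the caching behavior creates:

```c
/* Minimal sketch (not UCX code): exporter frees an exported buffer while the
 * importer still holds the opened IPC handle, the ordering the CUDA docs call
 * undefined behavior. The fork/pipe handshake is purely illustrative. */
#include <cuda.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

#define CHECK(call)                                              \
    do {                                                         \
        CUresult _r = (call);                                    \
        if (_r != CUDA_SUCCESS) {                                \
            const char *_msg;                                    \
            cuGetErrorString(_r, &_msg);                         \
            fprintf(stderr, "%s failed: %s\n", #call, _msg);     \
            exit(1);                                             \
        }                                                        \
    } while (0)

int main(void)
{
    int to_child[2], to_parent[2];
    pipe(to_child);
    pipe(to_parent);

    pid_t pid = fork();
    if (pid == 0) {                    /* importer ("remote" peer) */
        CUdevice dev; CUcontext ctx; CUipcMemHandle h; CUdeviceptr mapped;
        char sync;
        CHECK(cuInit(0));
        CHECK(cuDeviceGet(&dev, 0));
        CHECK(cuCtxCreate(&ctx, 0, dev));
        read(to_child[0], &h, sizeof(h));   /* receive the exported handle */
        CHECK(cuIpcOpenMemHandle(&mapped, h,
                                 CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS));
        write(to_parent[1], "o", 1);        /* tell exporter the handle is open */
        read(to_child[0], &sync, 1);        /* exporter has now freed the buffer */
        /* The handle is still open here; touching `mapped` or closing it after
         * the exporter's cuMemFree is the undefined-behavior window. */
        cuIpcCloseMemHandle(mapped);
        return 0;
    }

    /* exporter */
    CUdevice dev; CUcontext ctx; CUdeviceptr buf; CUipcMemHandle h;
    char sync;
    CHECK(cuInit(0));
    CHECK(cuDeviceGet(&dev, 0));
    CHECK(cuCtxCreate(&ctx, 0, dev));
    CHECK(cuMemAlloc(&buf, 1 << 20));
    CHECK(cuIpcGetMemHandle(&h, buf));
    write(to_child[1], &h, sizeof(h));
    read(to_parent[0], &sync, 1);           /* wait until the importer opened it */
    CHECK(cuMemFree(buf));                  /* freed before the importer closes the handle */
    write(to_child[1], "f", 1);
    waitpid(pid, NULL, 0);
    return 0;
}
```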
Since this hasn't caused any serious problems, I'm not marking it as a bug. But I think it's worth reporting anyway for reference, and I'd like to hear your opinions on it.