Is there an ability to automatically assign vf with GPU affinity to pods? #736

Closed
cyclinder opened this issue Jul 16, 2024 · 6 comments

@cyclinder (Contributor)

[figure: GPU/NIC PCIe topology matrix from the node, including GPU0 and the mlx5_* RDMA devices]

If the GPU and NIC are under the same PCIe bridge, or their topology distance is closer than PHB (e.g. PIX or PXB), then communication between them can be accelerated by enabling GPUDirect RDMA.
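The PCIe distance between GPUs and RDMA NICs can be checked with `nvidia-smi topo -m`. Since the original screenshot is not reproduced here, the matrix below is only an illustrative sketch of that kind of output, not the actual data from this node:

```console
$ nvidia-smi topo -m
        GPU0    mlx5_0  mlx5_3
GPU0     X      PIX     PHB
mlx5_0  PIX      X      PHB
mlx5_3  PHB     PHB      X

# PIX = connection traversing at most a single PCIe bridge
# PHB = connection traversing a PCIe host bridge (typically the CPU)
```

In an output like this, GPU0 and mlx5_0 share a PCIe bridge (PIX) and are good candidates for GPUDirect RDMA, while GPU0 and mlx5_3 only meet at the PCIe host bridge (PHB).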

@SchSeba (Collaborator) commented Jul 17, 2024

That is a Kubernetes feature. You can configure the device manager and check the Topology Manager policy:

https://kubernetes.io/docs/tasks/administer-cluster/topology-manager/#policy-single-numa-node
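For reference, a minimal kubelet configuration sketch enabling that policy (the values are examples; the Topology Manager can only align devices whose plugins report NUMA topology information):

```yaml
# Sketch of a KubeletConfiguration fragment enabling NUMA alignment.
# With single-numa-node, a pod requesting e.g. a GPU and an SR-IOV VF is only
# admitted if both devices (and, with the static CPU manager, its CPUs) can be
# satisfied from the same NUMA node.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
topologyManagerPolicy: single-numa-node
topologyManagerScope: container   # or "pod" to align all containers together
```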

@cyclinder (Contributor, Author)

Thanks for your reply. Even if the GPU and NIC are on the same NUMA node, traffic between them may still have to cross the PCIe host bridge, as shown in the figure above for GPU0 and mlx5_3, so in that case we cannot enable GPUDirect RDMA. Being on the same NUMA node can still mean a large topology distance; we need a smaller one.

@adrianchiris (Collaborator)

Currently there is no solution that I'm aware of which takes PCIe topology into account.

DRA (Dynamic Resource Allocation) aims to solve that, but there is still a way to go...

@aojea commented Nov 15, 2024

This is on the DRA roadmap, as @adrianchiris mentions; it will be beta in 1.32.
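To make that concrete, here is a rough sketch of how such a request might look with the DRA v1beta1 API; the device class names and the pcieRoot attribute are hypothetical and depend on what the GPU and NIC DRA drivers actually publish:

```yaml
# Hypothetical ResourceClaim: request one GPU and one VF, and constrain the
# allocation so that both devices report the same "pcieRoot" attribute, i.e.
# they sit behind the same PCIe root. Class and attribute names are made up.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: gpu-with-local-vf
spec:
  devices:
    requests:
    - name: gpu
      deviceClassName: gpu.example.com       # hypothetical GPU device class
    - name: vf
      deviceClassName: sriov-vf.example.com  # hypothetical VF device class
    constraints:
    - requests: ["gpu", "vf"]
      matchAttribute: example.com/pcieRoot   # hypothetical driver attribute
```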

@SchSeba (Collaborator) commented Dec 11, 2024

Hi @cyclinder, do you think we can close this issue?
I don't think there is anything the sriov-operator can do about this.

I think for now the only solution is to manually create a resource pool for every PCIe bridge and do a static request for the GPU pool and the VF pool from the same PCIe bridge :(
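As a sketch of that workaround (the PCI address, pool name, namespace, and GPU resource name below are examples and must match the actual node and device-plugin setup), a per-PCIe VF pool could look like this, with a pod then requesting the GPU together with the VF pool behind the same bridge:

```yaml
# Hypothetical per-PCIe VF pool: select only the PF at 0000:3b:00.0, assumed
# to share a PCIe bridge with the target GPU, and expose its VFs as a
# dedicated resource.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: vf-pool-pcie-3b
  namespace: sriov-network-operator        # operator namespace may differ per install
spec:
  resourceName: vf_pcie_3b                 # example pool name
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 8
  nicSelector:
    rootDevices:
    - "0000:3b:00.0"                       # example PF PCI address
  deviceType: netdevice
  isRdma: true
---
# Pod fragment statically pairing a GPU with a VF from that pool.
apiVersion: v1
kind: Pod
metadata:
  name: rdma-gpu-pod
spec:
  containers:
  - name: app
    image: example.com/rdma-app:latest     # placeholder image
    resources:
      requests:
        nvidia.com/gpu: "1"
        openshift.io/vf_pcie_3b: "1"       # prefix depends on the operator's resourcePrefix
      limits:
        nvidia.com/gpu: "1"
        openshift.io/vf_pcie_3b: "1"
```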

@cyclinder (Contributor, Author)

@SchSeba Yes, we can close the issue; we are planning to implement this in the coming days.
