Is there an ability to automatically assign vf with GPU affinity to pods? #736
Comments
That is a Kubernetes feature: you can configure the device manager and the Topology Manager policy, e.g. single-numa-node: https://kubernetes.io/docs/tasks/administer-cluster/topology-manager/#policy-single-numa-node
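As a sketch of the suggestion above, the single-numa-node policy is set in the kubelet configuration (field names are from the standard KubeletConfiguration API; the surrounding fields shown here are illustrative, not from this thread):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Reject pods whose requested devices/CPUs cannot all be
# aligned on a single NUMA node.
topologyManagerPolicy: single-numa-node
# Scope at which alignment is computed ("container" or "pod").
topologyManagerScope: container
```

With this policy, the kubelet only admits a pod if the device plugin's topology hints (e.g. for GPUs and SR-IOV VFs) can be satisfied from one NUMA node; as the follow-up comment notes, this does not constrain placement below NUMA granularity.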
Thanks for your reply. I think even if the GPU and NIC are on the same NUMA node, they may still sit behind different PCIe bridges (as with GPU0 and mlx5_3 in the figure above), in which case we cannot enable GPU Direct RDMA. A single NUMA node can still span a large topological distance; we need a smaller one.
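One way to see the sub-NUMA distance the comment describes is `nvidia-smi topo -m`, which reports PIX/PXB/PHB/NODE/SYS between GPUs and NICs. As a rough illustrative sketch (not from the thread), the same information can be derived on Linux from sysfs: `/sys/bus/pci/devices/<bdf>` resolves to a path whose components are the upstream bridge chain, and the depth of the shared prefix between two devices' chains tells you how close they are:

```python
# Sketch: estimate PCIe proximity of two devices from their sysfs paths.
# On Linux, /sys/bus/pci/devices/<bdf> resolves to something like
# /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0, where each
# intermediate component is an upstream bridge/switch port.
import os

def pcie_chain(bdf: str) -> list[str]:
    """Return the chain of PCI addresses from root port down to the device."""
    real = os.path.realpath(f"/sys/bus/pci/devices/{bdf}")
    # Keep only components that look like PCI addresses (dddd:bb:dd.f);
    # this drops prefixes such as "pci0000:00", which lack a function dot.
    return [p for p in real.split("/") if ":" in p and "." in p]

def shared_bridge_depth(chain_a: list[str], chain_b: list[str]) -> int:
    """Number of upstream bridges two devices share (higher = closer)."""
    depth = 0
    for a, b in zip(chain_a, chain_b):
        if a != b:
            break
        depth += 1
    return depth
```

A depth of 0 means the devices hang off different root complexes (likely different NUMA nodes, SYS in nvidia-smi terms); a deeper shared prefix corresponds to PHB or PIX-style proximity, which is what GPU Direct RDMA wants.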
Currently there is no solution I'm aware of that takes PCIe topology into account. DRA (Dynamic Resource Allocation) aims to solve that, but there is still a way to go.
This is on the DRA roadmap as @adrianchiris mentions; it will be beta in Kubernetes 1.32.
Hi @cyclinder, do you think we can close this issue? I think for now the only solution is to manually create a resource pool for every PCIe switch and make a static request for a GPU and a VF from pools on the same PCIe switch :(
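The static workaround described above might look like the following pod spec. The resource names (per-PCIe-switch pools) are hypothetical: an administrator would have to partition GPUs and VFs into such pools out of band, since device plugins don't expose PCIe-switch-level pools natively:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gdr-workload
spec:
  containers:
  - name: app
    image: my-rdma-app:latest   # placeholder image
    resources:
      limits:
        # Hypothetical pool names: both devices are taken from the
        # pool behind the same PCIe switch, so the GPU and the VF
        # are guaranteed to be PHB-or-closer to each other.
        nvidia.com/gpu-pcie-switch0: 1
        example.com/sriov-vf-pcie-switch0: 1
```

This keeps GPU Direct RDMA viable at the cost of fragmenting capacity into one pool per switch, which is why a DRA-based, topology-aware allocation is the preferred long-term fix.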
@SchSeba Yes, we can close the issue; we are planning to implement this in the coming days.
If the GPU and NIC are on the same PCIe bridge, or their topology distance is PHB or closer, then communication between them can be accelerated by enabling GPU Direct RDMA.