The XPU backend is available in PyTorch starting from version 2.4 and requires an Intel GPU to run. However, not all Intel GPUs are supported. This article reviews the hardware support story for the PyTorch XPU backend from different angles: which hardware Intel declares as supported, and which implementation details affect hardware support in practice.
Note
The list of supported GPUs requires clarification; see pytorch/pytorch#138347 for details. This article makes a best-effort attempt to deduce the list of GPUs, with links to Intel's product descriptions at https://ark.intel.com.
Intel GPUs supported by the PyTorch XPU backend are documented by Intel per PyTorch release:
Not all GPUs are declared as supported on all operating systems:
- For Linux (Ubuntu, RHEL, SUSE):

  | GPU | 2.4 | 2.5 |
  |---|---|---|
  | Server GPUs: | | |
  | Intel® Data Center GPU Max Series | ✓ | ✓ |
  | Discrete Client GPUs: | | |
  | Intel® Arc™ A-Series Graphics | | ✓ |
- For Windows (10 and 11, including WSL2; support is available starting from PyTorch 2.5):

  | GPU | 2.5 |
  |---|---|
  | Discrete Client GPUs: | |
  | Intel® Arc™ A-Series Graphics | ✓ |
  | Integrated Client GPUs: | |
  | Meteor Lake | ✓ |

PyTorch XPU implements eager mode ATen operators in a few different ways:
- Most operators are implemented as SYCL kernels
- Convolution and GEMM operators are implemented via the oneDNN library
oneDNN supports a wide range of Intel GPUs; check its documentation for details (roughly, starting from Tiger Lake). SYCL kernels, however, are by default generated only for a smaller, select range of GPUs, and that range also depends on the OS. The list of GPUs can be found:
The tables below summarize the device types for which SYCL kernels are pre-built in PyTorch XPU:
- For Linux:

  | Device Type | 2.4 | 2.5 |
  |---|---|---|
  | ats-m150 | | ✓ |
  | pvc | ✓ | ✓ |
  | xe-lpg | ✓ | ✓ |
- For Windows:

  | Device Type | 2.5 |
  |---|---|
  | ats-m150 | ✓ |
  | mtl-h | ✓ |
  | mtl-u | ✓ |
  | lnl-m | ✓ |

Device types in the tables above are given as they appear in the PyTorch XPU sources. They correspond to the device types accepted by the `ocloc` compiler with the `-device <device_type>` argument. The full list of accepted types depends on the installed version of the Intel GPU driver stack and can be queried by running `ocloc compile --help`.
Note that the list of devices for which SYCL kernels are compiled can be adjusted when building PyTorch XPU. For example, the following build command line corresponds to the default PyTorch 2.5 XPU build setting on Linux:
TORCH_XPU_ARCH_LIST=pvc,xe-lpg,ats-m150 python3 setup.py develop
Building for fewer GPUs reduces build time and binary footprint. The same variable can also be used to attempt a build for other Intel GPUs. Note, however, that such a build might not succeed, or the generated binaries might fail at runtime.
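As a rough illustration of a reduced build, the sketch below prepares an environment with `TORCH_XPU_ARCH_LIST` set to a chosen subset of device types. The helper name `make_build_env` and the constant `DEFAULT_ARCH_LIST_LINUX_2_5` are hypothetical names introduced here for illustration; only the variable name, the default device list, and the `python3 setup.py develop` command come from the text above.

```python
import os

# Default device types of the PyTorch 2.5 XPU build on Linux, as quoted above.
DEFAULT_ARCH_LIST_LINUX_2_5 = ("pvc", "xe-lpg", "ats-m150")

# Build command quoted above; it is only constructed here, never executed.
BUILD_COMMAND = ["python3", "setup.py", "develop"]


def make_build_env(device_types, base_env=None):
    """Return a copy of the environment with TORCH_XPU_ARCH_LIST set.

    `make_build_env` is a hypothetical helper, not part of PyTorch.
    Names are trimmed and de-duplicated, keeping their original order.
    """
    cleaned = []
    for name in device_types:
        name = name.strip()
        if not name:
            raise ValueError("empty device type in list")
        if name not in cleaned:
            cleaned.append(name)
    if not cleaned:
        raise ValueError("at least one device type is required")
    env = dict(os.environ if base_env is None else base_env)
    env["TORCH_XPU_ARCH_LIST"] = ",".join(cleaned)
    return env


# Example: restrict the build to a single device type.
env = make_build_env(["pvc"], base_env={})
print(env["TORCH_XPU_ARCH_LIST"], "->", " ".join(BUILD_COMMAND))
```

The resulting dictionary could be passed as `env=` to `subprocess.run(BUILD_COMMAND, ...)` from the PyTorch source directory. Whether a given device type is accepted still depends on the installed `ocloc` version.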
One important aspect of how the XPU backend in PyTorch works is potential kernel recompilation at runtime. The underlying driver stack triggers it if pre-built eager mode kernels are not available for the current GPU. Such recompilation is time consuming and increases latency. It might also fail, either at compile time or later during kernel execution on the device, if the kernel implementation is incompatible with that particular GPU.
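One way to get a rough feel for this latency is to compare the first call of an operator with later calls. The sketch below is only an illustration: `first_call_overhead` and `measure_xpu_sin` are hypothetical helpers, and the first call also includes other one-time costs (such as lazy device initialization), so a large gap hints at runtime compilation but does not prove it. An element-wise operator is used on the assumption that it is one of the operators implemented as a SYCL kernel rather than via oneDNN.

```python
import time


def first_call_overhead(fn, repeats=5, sync=None):
    """Time the first call of fn against the fastest of `repeats` later calls.

    `sync`, if given, is called after fn() and before the clock is read,
    so that asynchronous device work is included in the measurement.
    """
    def timed_call():
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        return time.perf_counter() - start

    first = timed_call()
    steady = min(timed_call() for _ in range(repeats))
    return first, steady


def measure_xpu_sin(size=1 << 20):
    """Measure torch.sin on an XPU tensor; returns None if no XPU is usable."""
    try:
        import torch
    except ImportError:
        return None
    if not hasattr(torch, "xpu") or not torch.xpu.is_available():
        return None
    # Create the data on the CPU and copy it over, so that torch.sin is
    # the first compute kernel this function launches on the device.
    x = torch.ones(size).to("xpu")
    return first_call_overhead(lambda: torch.sin(x), sync=torch.xpu.synchronize)


result = measure_xpu_sin()
if result is not None:
    print("first call: %.4fs, steady state: %.4fs" % result)
```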
At the moment, PyTorch XPU has no built-in checks that verify the current GPU against the supported list (see pytorch/pytorch#131799). Thus, if the underlying driver stack reports a GPU as available, PyTorch XPU will try to use it, potentially triggering runtime kernel recompilation.
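Until such a check exists, a script can do a coarse check of its own by comparing the reported device names against an allow-list. The following is a minimal sketch under stated assumptions: the name patterns in `SUPPORTED_NAME_PATTERNS` are hypothetical guesses at how the driver reports these GPUs and should be adjusted to the names seen on a real system, and both helper functions are illustrative, not part of PyTorch.

```python
import re

# Hypothetical allow-list of device-name patterns (assumption: the driver
# reports marketing names similar to those in the tables above).
SUPPORTED_NAME_PATTERNS = (
    r"Data Center GPU Max",
    r"Arc\(TM\) A\d+",
)


def is_declared_supported(device_name, patterns=SUPPORTED_NAME_PATTERNS):
    """Return True if the device name matches any allow-list pattern."""
    return any(re.search(p, device_name) for p in patterns)


def unsupported_xpu_devices():
    """List names of visible XPU devices that do not match the allow-list.

    Returns an empty list when PyTorch or the XPU backend is unavailable.
    """
    try:
        import torch
    except ImportError:
        return []
    if not hasattr(torch, "xpu") or not torch.xpu.is_available():
        return []
    names = [torch.xpu.get_device_name(i) for i in range(torch.xpu.device_count())]
    return [n for n in names if not is_declared_supported(n)]


for name in unsupported_xpu_devices():
    print("warning: %s is not on the allow-list; "
          "kernels may be recompiled at runtime" % name)
```

Matching on device names is deliberately simple. It cannot tell whether pre-built kernels actually exist for the device, so it serves as an early warning rather than a guarantee.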