Commit

Apply suggestions from code review
Co-authored-by: Alejandro Acosta <[email protected]>
AD2605 and aacostadiaz authored Oct 16, 2024
1 parent 3bde4d2 commit abf0778
Showing 3 changed files with 4 additions and 4 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -42,7 +42,7 @@ and improves code composability and readability. More documentation specific to
In addition to GEMMs, CUTLASS implements high-performance convolution via the implicit GEMM algorithm. Implicit GEMM is the formulation of a convolution operation as a GEMM thereby taking advantage of CUTLASS's modular GEMM pipeline. This allows CUTLASS to build convolutions by reusing highly-optimized GEMM components.

## CUTLASS with SYCL
-CUTLASS 3.0 API now also supports SYCL, and can run on Nvidia(upto the Ampere architecture) and Intel Xe Core architecture GPUs using the SYCL backend using the Intel open source `DPC++` compiler.
+CUTLASS 3.0 API now also supports SYCL, and can run on Nvidia(upto the Ampere architecture) and Intel PVC GPUs using the SYCL backend using the Intel open source `DPC++` compiler.
The support is currently limited to GEMMs only. See [Quick Start Guide](./media/docs/build/building_with_sycl_support.md) on how to build and run
examples using the SYCL backend.

3 changes: 1 addition & 2 deletions media/docs/build/building_with_sycl_support.md
@@ -68,11 +68,10 @@ CUTLASS Examples <br>
* Example 14
* We also provide various SYCL examples for the Intel Data Center Max range of GPUs

-## SYCL Supported Architectures and APIs
+## SYCL Supported Architectures
At the time of writing, the SYCL backend supports all Nvidia architectures till Ampere, and the
Intel Data Center Max series of GPUs is supported.

-We support the `CollectiveMMA` and the collective builder APIs for the same.

# References

3 changes: 2 additions & 1 deletion media/docs/quickstart.md
@@ -36,7 +36,8 @@ $ mkdir build && cd build

$ cmake -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DCUTLASS_ENABLE_SYCL=ON -DDPCPP_SYCL_TARGET=nvptx64-nvidia-cuda -DDPCPP_SYCL_ARCH=sm_80 .. # compiles for the NVIDIA Ampere GPU architecture

-$ cmake -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DCUTLASS_ENABLE_SYCL=ON -DDPCPP_SYCL_TARGET=intel_gpu_pvc .. # compiles for the Intel Xe Core Architecture
+# compiles for the Intel PVC Architecture
+cmake -DCUTLASS_ENABLE_SYCL=ON -DDPCPP_SYCL_TARGET=intel_gpu_pvc ..
```
A complete example can be as follows (running on the Intel Data Center Max 1100) -


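For illustration only (this is not part of the commit), the two configure lines in the `media/docs/quickstart.md` hunk could be wrapped in a small shell helper that prints the flags for a chosen backend. The function name `sycl_cmake_args` and the backend labels `nvidia` and `intel` are hypothetical; the flags themselves are copied verbatim from the diff above.

```shell
# Hypothetical helper: print the cmake flags shown in the quickstart hunk
# for a given backend label. It only echoes flags; it does not run cmake.
sycl_cmake_args() {
  case "$1" in
    nvidia)
      # NVIDIA Ampere line (unchanged context line in the hunk)
      printf '%s\n' "-DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DCUTLASS_ENABLE_SYCL=ON -DDPCPP_SYCL_TARGET=nvptx64-nvidia-cuda -DDPCPP_SYCL_ARCH=sm_80"
      ;;
    intel)
      # Intel PVC line as it reads after this commit
      printf '%s\n' "-DCUTLASS_ENABLE_SYCL=ON -DDPCPP_SYCL_TARGET=intel_gpu_pvc"
      ;;
    *)
      # Any other label is rejected rather than guessed at
      printf 'unknown backend: %s\n' "$1" >&2
      return 1
      ;;
  esac
}
```

From a `build` directory, a call such as `cmake $(sycl_cmake_args intel) ..` would then expand to the Intel PVC configure line; the substitution is left unquoted on purpose so that each flag becomes a separate argument.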