tensor-cores

Here are 6 public repositories matching this topic...

DefTruth / hgemm-tensorcores-mma

⚡️Write HGEMM from scratch using Tensor Cores with the WMMA, MMA and CuTe APIs, achieving peak⚡️ performance

cuda tensor-cores hgemm

Updated Jan 13, 2025
Cuda

DefTruth / ffpa-attn-mma

📚[WIP] FFPA: Yet another Faster Flash Prefill Attention with O(1)🎉GPU SRAM complexity for headdim > 256, 1.5x~2x🎉faster vs SDPA EA.

cuda attention sdpa mlsys tensor-cores flash-attention

Updated Jan 13, 2025
Cuda

tgautam03 / tGeMM

General Matrix Multiplication using NVIDIA Tensor Cores

matrix-multiplication cuda-kernels gpu-computing nvidia-cuda nvidia-gpu gpu-programming sgemm cuda-programming tensor-cores nvidia-tensor-cores

Updated Oct 25, 2024
Cuda

LDRyan0 / Correlator-Bench

A benchmarking framework for correlators of FX telescope arrays

cpp cuda radio-astronomy astronomy-instrumentation tensor-cores

Updated Oct 20, 2023
Cuda

etasnadi / VulkanCooperativeMatrixAttention

Vulkan & GLSL implementation of FlashAttention-2

vulkan glsl artificial-intelligence gpu-acceleration attention gpu-computing deel-learning tensor-cores large-language-models llm flash-attention flash-attention-2

Updated Jan 12, 2025
C++

8e8bdba457c18cf692a95fe2ec67000b / VulkanCooperativeMatrixAttention

Vulkan & GLSL implementation of FlashAttention-2

vulkan glsl artificial-intelligence gpu-acceleration attention gpu-computing deel-learning tensor-cores large-language-models llm flash-attention flash-attention-2

Updated Jan 13, 2025
