[Kernel] Update cutlass_scaled_mm to support 2d group (blockwise) scaling #31820
Job | Run time |
---|---|
17s | |
17s |