Round-to-zero float32 conversion not supported #58

Open

int3 opened this issue Jul 19, 2024 · 1 comment

int3 (Collaborator) commented Jul 19, 2024

Small repro case:

import triton
import triton.language as tl
import torch


@triton.jit
def type_convert_triton(src, dst, rounding: tl.constexpr):
    offsets = tl.arange(0, 128)
    x = tl.load(src + offsets)
    # Downcast float32 -> float16 with an explicit rounding mode.
    y = x.to(dst.dtype.element_ty, fp_downcast_rounding=rounding)
    tl.store(dst + offsets, y)


src = torch.zeros([128], dtype=torch.float32)
dst = torch.empty(src.shape, dtype=torch.float16)
type_convert_triton[(1, )](src, dst, rounding='rtz')
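
For reference, round-toward-zero for an fp32 -> fp16 downcast can be emulated on the host by clearing the 13 low mantissa bits of each float32 value before the cast. A minimal numpy sketch (a hypothetical helper, not part of the repro; it assumes results land in float16's normal range and ignores subnormals and overflow):

import numpy as np

def fp32_to_fp16_rtz(x):
    # float32 has 23 mantissa bits, float16 has 10. Clearing the 13 low bits
    # truncates the magnitude toward zero, and the masked value is exactly
    # representable in float16 for normal-range inputs, so the final cast is
    # exact regardless of its rounding mode.
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFFE000)).view(np.float32).astype(np.float16)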

ienkovich (Collaborator) commented

I tried to enable this some time ago. At first I thought I had hit an LLVM bug, but it turned out that the arith::TruncFOp lowering doesn't guarantee a particular rounding mode: it generates llvm.experimental.constrained.fptrunc, and that intrinsic doesn't control rounding, it only tells the compiler which runtime rounding settings to assume: llvm/llvm-project#96815

I think we should lower it directly to vcvtps2ph intrinsic calls with an explicit rounding mode to make this work.
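
To make the difference concrete, here is a host-side numpy illustration (an assumption for illustration, not the kernel's actual codegen): the default cast rounds to nearest-even, which is what the unconstrained lowering effectively produces under default runtime rounding settings, while 'rtz' should truncate toward zero.

import numpy as np

# 1.0 + 2**-11 + 2**-13 lies between the adjacent float16 values 1.0 and
# 1.0009765625, above their midpoint, so the two rounding modes disagree.
x = np.array([1.0 + 2**-11 + 2**-13], dtype=np.float32)

print(x.astype(np.float16))       # [1.001] round-to-nearest-even rounds up
masked = (x.view(np.uint32) & np.uint32(0xFFFFE000)).view(np.float32)
print(masked.astype(np.float16))  # [1.]    round-toward-zero truncates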
