Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ttnn.bias_gelu_bw unary low PCC #13856

Open
Tracked by #13795
amalbasaTT opened this issue Oct 16, 2024 · 4 comments
Open
Tracked by #13795

ttnn.bias_gelu_bw unary low PCC #13856

amalbasaTT opened this issue Oct 16, 2024 · 4 comments
Assignees
Labels

Comments

@amalbasaTT
Copy link
Contributor

amalbasaTT commented Oct 16, 2024

Describe the bug
ttnn.bias_gelu_bw unary has low PCC when input_tensor_a has bfloat8_b dtype or in random cases when approx is "tanh".

To Reproduce
Steps to reproduce the behavior:
Sweep test for bias_gelu_bw is located in 'tests/sweep_framework/sweeps/eltwise/unary_backward/bias_gelu_bw/bias_gelu_bw.py'

  1. Go to 'tests/sweep_framework/sweeps/eltwise/unary_backward/bias_gelu_bw/bias_gelu_bw.py'
  2. Generate new parameter vectors and run the sweep test
python3 tests/sweep_framework/sweeps_parameter_generator.py --elastic cloud --module-name eltwise.unary_backward.bias_gelu_bw.bias_gelu_bw
python3 tests/sweep_framework/sweeps_runner.py --elastic cloud --module-name eltwise.unary_backward.bias_gelu_bw.bias_gelu_bw --suite-name xfail
  3. See the error. Results can be found on elastic cloud as explained here: https://github.com/tenstorrent/tt-metal/tree/main/tests/sweep_framework
@umadevimcw
Copy link
Contributor

@amalbasaTT Can you provide unit test for this issue as well?

@amalbasaTT
Copy link
Contributor Author

@amalbasaTT Can you provide unit test for this issue as well?
Here is the unit test which confirms low PCC when input_tensor_a has bfloat8_b. First two test cases (where input_tensor_a has bfloat8_b) fail, other two pass.

# SPDX-FileCopyrightText: © 2023 Tenstorrent Inc.

# SPDX-License-Identifier: Apache-2.0

from loguru import logger
from functools import partial
import pytest
import torch
import ttnn
import traceback

from tests.ttnn.utils_for_testing import assert_with_pcc
from tests.tt_eager.python_api_testing.sweep_tests.generation_funcs import gen_func_with_cast_tt
from models.utility_functions import torch_random


def run_backward_div_tests(
    input_shape,
    dtype,
    dlayout,
    in_mem_cfg,
    out_mem_cfg,
    data_seed,
    device,
):
    """Compare ttnn.bias_gelu_bw against its registered golden (torch) reference.

    Args:
        input_shape: one-element list holding the shape used for both the grad
            and input tensors.
        dtype: pair of ttnn dtypes — index 0 for the grad tensor, 1 for input.
        dlayout: one-element list with the layout applied to both tensors.
        in_mem_cfg: per-input memory configs (index 0 -> grad, 1 -> input).
        out_mem_cfg: memory config for the op output.
        data_seed: torch manual seed so the random data is reproducible.
        device: TT device fixture supplied by the test harness.

    Raises:
        AssertionError: if shapes differ or the PCC falls below 0.999.
    """
    torch.manual_seed(data_seed)
    # grad tensor
    x = gen_func_with_cast_tt(
        partial(torch_random, low=-100, high=100, dtype=torch.float32), dtype[0]
    )(input_shape[0])
    # input tensor (requires grad so the golden backward can be evaluated)
    y = gen_func_with_cast_tt(
        partial(torch_random, low=-100, high=100, dtype=torch.float32), dtype[1]
    )(input_shape[0])

    y.requires_grad = True

    # Random bias scalar drawn at bfloat16 precision, matching what the op receives.
    scalar = torch.tensor(1, dtype=torch.bfloat16).uniform_(-100, 100).item()

    try:
        # Reference result from the golden function registered for this op.
        golden_function = ttnn.get_golden_function(ttnn.bias_gelu_bw)
        ref_value = golden_function(x, y, scalar)[0]

        tt_x = ttnn.from_torch(x, dtype=dtype[0], layout=dlayout[0], device=device, memory_config=in_mem_cfg[0])
        tt_y = ttnn.from_torch(y, dtype=dtype[1], layout=dlayout[0], device=device, memory_config=in_mem_cfg[1])

        tt_result = ttnn.bias_gelu_bw(tt_x, tt_y, scalar, memory_config=out_mem_cfg)[0]
        tt_result = ttnn.to_torch(tt_result)
    except Exception as e:
        # Log for sweep triage, then re-raise with the original traceback intact
        # (bare `raise` instead of `raise e`).
        logger.warning(f"Test execution crashed: {e}")
        print(traceback.format_exc())
        raise

    assert len(tt_result.shape) == len(ref_value.shape)
    assert tt_result.shape == ref_value.shape
    assert_with_pcc(ref_value, tt_result, 0.999)


# Sweep vectors for the bfloat8_b repro. Each entry is:
# (input_shape, [grad_dtype, input_dtype], [layout], in_mem_configs, out_mem_config, data_seed).
# Per the issue report, the first two cases (input tensor in bfloat8_b) fail with
# low PCC; the last two (input tensor in bfloat16) pass.
test_sweep_args = [
    (
        [(6, 5, 96, 128)],
        [ttnn.bfloat16, ttnn.bfloat8_b],
        [ttnn.TILE_LAYOUT],
        [ttnn.DRAM_MEMORY_CONFIG, ttnn.DRAM_MEMORY_CONFIG, ttnn.DRAM_MEMORY_CONFIG],
        ttnn.DRAM_MEMORY_CONFIG,
        14943539,
    ),
    (
        [(3, 2, 192, 32)],
        [ttnn.bfloat8_b, ttnn.bfloat8_b],
        [ttnn.TILE_LAYOUT],
        [ttnn.DRAM_MEMORY_CONFIG, ttnn.DRAM_MEMORY_CONFIG, ttnn.DRAM_MEMORY_CONFIG],
        ttnn.DRAM_MEMORY_CONFIG,
        14943539,
    ),
    (
        [(3, 2, 192, 32)],
        [ttnn.bfloat8_b, ttnn.bfloat16],
        [ttnn.TILE_LAYOUT],
        [ttnn.DRAM_MEMORY_CONFIG, ttnn.DRAM_MEMORY_CONFIG, ttnn.DRAM_MEMORY_CONFIG],
        ttnn.DRAM_MEMORY_CONFIG,
        14943539,
    ),
    (
        [(3, 2, 192, 32)],
        [ttnn.bfloat16, ttnn.bfloat16],
        [ttnn.TILE_LAYOUT],
        [ttnn.DRAM_MEMORY_CONFIG, ttnn.DRAM_MEMORY_CONFIG, ttnn.DRAM_MEMORY_CONFIG],
        ttnn.DRAM_MEMORY_CONFIG,
        14943539,
    ),
]


@pytest.mark.parametrize(
    "input_shape, dtype, dlayout, in_mem_config, out_mem_config, data_seed",
    test_sweep_args,
)
def test_backward_div(input_shape, dtype, dlayout, in_mem_config, out_mem_config, data_seed, device):
    """Pytest entry point: forward each sweep vector to the runner."""
    run_backward_div_tests(
        input_shape,
        dtype,
        dlayout,
        in_mem_config,
        out_mem_config,
        data_seed,
        device,
    )
    


@amalbasaTT
Copy link
Contributor Author

Here is a unit test for the low PCC in some of the cases where approx was "tanh" (most of those are cases where the grad tensor has bfloat8_b, but there was also an instance where the grad tensor had bfloat16):

# SPDX-FileCopyrightText: © 2023 Tenstorrent Inc.

# SPDX-License-Identifier: Apache-2.0

from loguru import logger
from functools import partial
import pytest
import torch
import ttnn
import traceback

from tests.ttnn.utils_for_testing import assert_with_pcc
from tests.tt_eager.python_api_testing.sweep_tests.generation_funcs import gen_func_with_cast_tt
from models.utility_functions import torch_random


def run_backward_div_tests(
    input_shape,
    approx,
    dtype,
    dlayout,
    in_mem_cfg,
    out_mem_cfg,
    data_seed,
    device,
):
    """Compare ttnn.bias_gelu_bw (with an approximation mode) against its golden reference.

    Args:
        input_shape: one-element list holding the shape used for both the grad
            and input tensors.
        approx: gelu approximation mode passed to both golden and device ops
            (e.g. "tanh").
        dtype: pair of ttnn dtypes — index 0 for the grad tensor, 1 for input.
        dlayout: one-element list with the layout applied to both tensors.
        in_mem_cfg: per-input memory configs (index 0 -> grad, 1 -> input).
        out_mem_cfg: memory config for the op output.
        data_seed: torch manual seed so the random data is reproducible.
        device: TT device fixture supplied by the test harness.

    Raises:
        AssertionError: if shapes differ or the PCC falls below 0.999.
    """
    torch.manual_seed(data_seed)
    # grad tensor
    x = gen_func_with_cast_tt(
        partial(torch_random, low=-100, high=100, dtype=torch.float32), dtype[0]
    )(input_shape[0])
    # input tensor (requires grad so the golden backward can be evaluated)
    y = gen_func_with_cast_tt(
        partial(torch_random, low=-100, high=100, dtype=torch.float32), dtype[1]
    )(input_shape[0])

    y.requires_grad = True

    # Random bias scalar drawn at bfloat16 precision, matching what the op receives.
    scalar = torch.tensor(1, dtype=torch.bfloat16).uniform_(-100, 100).item()

    try:
        # Reference result from the golden function registered for this op.
        golden_function = ttnn.get_golden_function(ttnn.bias_gelu_bw)
        ref_value = golden_function(x, y, scalar, value=approx)[0]

        tt_x = ttnn.from_torch(x, dtype=dtype[0], layout=dlayout[0], device=device, memory_config=in_mem_cfg[0])
        tt_y = ttnn.from_torch(y, dtype=dtype[1], layout=dlayout[0], device=device, memory_config=in_mem_cfg[1])

        tt_result = ttnn.bias_gelu_bw(tt_x, tt_y, scalar, approximate=approx, memory_config=out_mem_cfg)[0]
        tt_result = ttnn.to_torch(tt_result)
    except Exception as e:
        # Log for sweep triage, then re-raise with the original traceback intact
        # (bare `raise` instead of `raise e`).
        logger.warning(f"Test execution crashed: {e}")
        print(traceback.format_exc())
        raise

    assert len(tt_result.shape) == len(ref_value.shape)
    assert tt_result.shape == ref_value.shape
    assert_with_pcc(ref_value, tt_result, 0.999)


# Sweep vectors reproducing low PCC with approx="tanh". Each entry is:
# (input_shape, approx, [grad_dtype, input_dtype], [layout], in_mem_configs, out_mem_config, data_seed).
# Per the issue report, most failing cases have the grad tensor in bfloat8_b,
# though one bfloat16-grad case is included as well.
test_sweep_args = [
    (
        [(6, 10, 128, 224)],
        "tanh",
        [ttnn.bfloat8_b, ttnn.bfloat16],
        [ttnn.TILE_LAYOUT],
        [ttnn.L1_MEMORY_CONFIG, ttnn.DRAM_MEMORY_CONFIG],
        ttnn.DRAM_MEMORY_CONFIG,
        14469376,
    ),
    (
        [(4, 2, 96, 192)],
        "tanh",
        [ttnn.bfloat16, ttnn.bfloat16],
        [ttnn.TILE_LAYOUT],
        [ttnn.L1_MEMORY_CONFIG, ttnn.DRAM_MEMORY_CONFIG],
        ttnn.L1_MEMORY_CONFIG,
        4378657,
    ),
    (
        [(5, 10, 224, 32)],
        "tanh",
        [ttnn.bfloat8_b, ttnn.bfloat16],
        [ttnn.TILE_LAYOUT],
        [ttnn.DRAM_MEMORY_CONFIG, ttnn.DRAM_MEMORY_CONFIG],
        ttnn.DRAM_MEMORY_CONFIG,
        678741,
    ),
    (
        [(97, 129)],
        "tanh",
        [ttnn.bfloat16, ttnn.bfloat16],
        [ttnn.TILE_LAYOUT],
        [ttnn.DRAM_MEMORY_CONFIG, ttnn.L1_MEMORY_CONFIG],
        ttnn.DRAM_MEMORY_CONFIG,
        7580522,
    ),
]


@pytest.mark.parametrize(
    "input_shape, approx, dtype, dlayout, in_mem_config, out_mem_config, data_seed",
    test_sweep_args,
)
def test_backward_div(input_shape, approx, dtype, dlayout, in_mem_config, out_mem_config, data_seed, device):
    """Pytest entry point: forward each sweep vector (incl. approx mode) to the runner."""
    run_backward_div_tests(
        input_shape,
        approx,
        dtype,
        dlayout,
        in_mem_config,
        out_mem_config,
        data_seed,
        device,
    )
    


@KalaivaniMCW
Copy link
Contributor

KalaivaniMCW commented Nov 14, 2024

bias_gelu_bw unary is directly dependent on gelu_bw which involves multiple eltwise ops.

On debugging step-by-step for approx = "tanh", the PCC drop occurs with ttnn.tanh, which has a PCC of 0.994 here; the result degrades further because our operations cannot produce -0.0.

Analysis sheet : link
Image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants