Conv Layer Incorrect output #14236
Comments
@shwetankTT you opened a P0 bug and assigned it to yourself?
Yeah, it was def not a P0. Thanks for changing it to P1.
Seems like a hardware limitation? Let's take an example where the input and output are 8x320x16x16, distributed across 64 cores, so each core has a shard shape of (32, 320). This equates to 1 tile row and 10 tile columns. Since the output is in row-major order, completing each row requires data from all 10 tiles, but we only have 8 dst tiles. @mywoodstock
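The shard arithmetic in the comment above can be checked with a short sketch. It assumes 32x32 tiles, that the tensor is flattened to (N*H*W, C) and split evenly by rows across the cores, and that 8 is the dst tile limit quoted in the comment. The function name `shard_tiles` is hypothetical and not part of tt-metal.

```python
TILE_DIM = 32       # assumed tile height and width, in datums
MAX_DST_TILES = 8   # dst tile limit quoted in the comment above


def shard_tiles(n, c, h, w, num_cores):
    """Return (shard_rows, shard_cols, tile_rows, tile_cols) per core.

    Hypothetical helper: assumes the tensor is flattened to (N*H*W, C)
    and the rows are divided evenly across the cores.
    """
    rows = n * h * w
    if rows % num_cores != 0:
        raise ValueError("rows do not divide evenly across cores")
    shard_rows = rows // num_cores
    # Ceiling division: a partially filled tile still occupies a whole tile.
    tile_rows = -(-shard_rows // TILE_DIM)
    tile_cols = -(-c // TILE_DIM)
    return shard_rows, c, tile_rows, tile_cols


shard_rows, shard_cols, tile_rows, tile_cols = shard_tiles(8, 320, 16, 16, 64)
print((shard_rows, shard_cols), tile_rows, tile_cols)
print("fits in dst:", tile_cols <= MAX_DST_TILES)
```

For the 8x320x16x16 case this gives a (32, 320) shard, 1 tile row and 10 tile columns, which is more than the 8 dst tiles available.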
Thanks @shwetankTT, yes, if the width is > 8 tiles, we will need to iterate over the 8-tile blocks. Can you add a TT_FATAL for this for now?
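A rough sketch of the two options mentioned above: rejecting widths over 8 tiles for now, and later splitting the width into blocks of at most 8 tiles. This is only an illustration in Python, not the kernel code; `check_width_fits` and `dst_blocks` are hypothetical names, and the `ValueError` merely stands in for a TT_FATAL-style assertion.

```python
MAX_DST_TILES = 8  # dst tile limit quoted in the thread


def check_width_fits(tile_cols, max_dst_tiles=MAX_DST_TILES):
    """Stand-in for the interim guard: fail loudly on unsupported widths."""
    if tile_cols > max_dst_tiles:
        raise ValueError(
            f"width of {tile_cols} tiles exceeds {max_dst_tiles} dst tiles"
        )


def dst_blocks(tile_cols, max_dst_tiles=MAX_DST_TILES):
    """Split a row of tile_cols tiles into (start, count) blocks.

    Each block holds at most max_dst_tiles tiles, so it can be
    processed without exceeding the dst capacity.
    """
    if tile_cols <= 0 or max_dst_tiles <= 0:
        raise ValueError("tile counts must be positive")
    return [
        (start, min(max_dst_tiles, tile_cols - start))
        for start in range(0, tile_cols, max_dst_tiles)
    ]


print(dst_blocks(10))  # the 10-tile-wide shard from the example above
```

With the 10-tile-wide shard from the example, the guard would fire, while the blocked approach would process one block of 8 tiles followed by one block of 2.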
Hi guys, I see that pack_untilize has been tested at the unit level with a Float32 destination register and Float32 output format: https://github.com/tenstorrent/tt-metal/blob/main/tests/tt_metal/tt_metal/unit_tests/compute/test_untilize_tilize.cpp But the use case above seems to be that the input to the copy + pack_untilize is Float16_b, the input to the packer is then Float32, and the output of the packer is possibly back to Float16_b?

We can add a unit test case that replicates the behavior you see, to verify whether it is a kernel issue. In the meantime, @mywoodstock, can you make sure the data format reconfigs are being used correctly with the copy + pack_untilize above?

@ncvetkovicTT @nvelickovicTT The issue above is that once the Float32 destination register + Float32 packer input format is set, the results become incorrect, and the pattern looks like each row of the Float32 destination register (failing result) contains half of the datums of the Float16_b register (passing result). If possible, it would be good to add their specific test to our unit test infra.
Will add more details.