Line 474 in weight_only.py tries to pad the tensor so that its first dimension is divisible by the block size. The problem is that the number of blocks is computed from `org_w_shape`, the shape of the tensor before it gets transposed, instead of from the shape of the transposed tensor. As a result, `pad_tensor` pads the smaller dimension up to the size of the larger one, instead of padding it to a multiple of the block size.
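For reference, a minimal sketch of the pattern in question (variable names follow the issue text; the exact code around lines 473-474 is an assumption):

```python
import numpy as np

group_size = 32
weight = np.zeros((8192, 3072), dtype=np.float32)

org_w_shape = weight.shape          # (8192, 3072), captured before the transpose
weight = weight.T                   # line 473: shape is now (3072, 8192)

# Line 474 as described (buggy): block count derived from the pre-transpose shape.
k_blocks = (org_w_shape[0] + group_size - 1) // group_size        # 256 -> 256 * 32 = 8192 rows

# A possible fix: derive the block count from the tensor actually being padded.
k_blocks_fixed = (weight.shape[0] + group_size - 1) // group_size # 96 -> 96 * 32 = 3072 rows
```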
Take the example where you have a tensor with shape [8192, 3072] and a block size of 32:
- `org_w_shape` will be set to [8192, 3072].
- We transpose the weight tensor on line 473 and its shape becomes [3072, 8192].
- We compute the number of blocks on line 474 as `(org_w_shape[0] + group_size - 1) // group_size`, which gives us `(8192 + 32 - 1) // 32 = 256`.
- Inside `pad_tensor`, we calculate `padded_rows = k_blocks * group_size = 8192` and `pad_len = 8192 - 3072 = 5120`.
The expected behavior is that the padding does nothing, but because we use the wrong dimension, we end up padding the 3072 dimension to match the 8192 dimension, resulting in a tensor of shape [8192, 8192]. The subsequent reshape then fails because the sizes don't match.
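A runnable reproduction of the example above, with a `pad_tensor` stub that mirrors the behavior described (the real signature in weight_only.py may differ):

```python
import numpy as np

def pad_tensor(weight: np.ndarray, group_size: int, k_blocks: int) -> np.ndarray:
    # Pad the first dimension up to k_blocks * group_size rows.
    padded_rows = k_blocks * group_size
    pad_len = padded_rows - weight.shape[0]
    if pad_len > 0:
        weight = np.pad(weight, ((0, pad_len), (0, 0)), "constant")
    return weight

group_size = 32
weight = np.zeros((8192, 3072), dtype=np.float32)
org_w_shape = weight.shape
weight_t = weight.T                                               # (3072, 8192)

# Buggy: k_blocks from the pre-transpose shape pads 3072 rows up to 8192.
k_blocks_bad = (org_w_shape[0] + group_size - 1) // group_size    # 256
print(pad_tensor(weight_t, group_size, k_blocks_bad).shape)       # (8192, 8192)

# Using the transposed tensor's first dimension instead: 3072 is already a
# multiple of 32, so no padding is applied.
k_blocks_ok = (weight_t.shape[0] + group_size - 1) // group_size  # 96
print(pad_tensor(weight_t, group_size, k_blocks_ok).shape)        # (3072, 8192)
```

With the block count taken from the transposed tensor, the padding is a no-op here, which is the expected behavior.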