We hit a shape mismatch error while trying to reproduce Duo Attention. We tested transformers versions 4.37 through 4.47. Depending on the version, the failure changed from `RuntimeError: Boolean value of Tensor with more than one value is ambiguous` to `RuntimeError: shape '[1, 3098, 6, 5, 128]' is invalid for input of size 12689408`, but no version in that range resolved it.
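For reference, the numbers in the second error message can be checked with a few lines of arithmetic. This is only a sketch: it assumes the first two dimensions of the target shape are batch size and sequence length and that the last one is the per-head dimension, which the error message itself does not confirm. The helper name `diagnose_view` is ours, not part of any library.

```python
from math import prod

# Values copied verbatim from the RuntimeError message.
TARGET_SHAPE = (1, 3098, 6, 5, 128)
INPUT_SIZE = 12689408


def diagnose_view(target_shape, input_size):
    """Compare the element count a reshape expects with what the tensor holds.

    Assumption (not confirmed by the error message): dims 0 and 1 are
    batch and sequence length, and the last dim is the per-head size.
    """
    expected_elements = prod(target_shape)
    tokens = target_shape[0] * target_shape[1]
    head_dim = target_shape[-1]
    per_token_expected = expected_elements // tokens
    per_token_actual, remainder = divmod(input_size, tokens)
    return {
        "expected_elements": expected_elements,
        "per_token_expected": per_token_expected,
        "per_token_actual": per_token_actual,
        "remainder": remainder,
        "heads_expected": per_token_expected // head_dim,
        "heads_actual": per_token_actual // head_dim,
    }


if __name__ == "__main__":
    for key, value in diagnose_view(TARGET_SHAPE, INPUT_SIZE).items():
        print(f"{key}: {value}")
```

Under those assumptions, the reshape expects 3840 values per token (6 x 5 x 128, i.e. 30 heads of size 128), while the tensor actually holds 4096 per token (32 heads of size 128). That would point at a mismatch in how the heads are being split rather than at the sequence length, though we have not confirmed where the 6 and 5 come from.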
We also tried different models, downloaded with the following commands:
huggingface-cli download togethercomputer/Llama-2-7B-32K-Instruct --local-dir Llama-2-7B-32K-Instruct
huggingface-cli download gradientai/Llama-3-8B-Instruct-Gradient-1048k --local-dir Llama-3-8B-Instruct-Gradient-1048k
huggingface-cli download gradientai/Llama-3-8B-Instruct-Gradient-4194k --local-dir Llama-3-8B-Instruct-Gradient-4194k
huggingface-cli download mistralai/Mistral-7B-Instruct-v0.2 --local-dir Mistral-7B-Instruct-v0.2
huggingface-cli download mistralai/Mistral-7B-Instruct-v0.3 --local-dir Mistral-7B-Instruct-v0.3
However, none of these models worked. A previous issue suggested that updating the transformers version would solve the problem, but we still get shape mismatch errors.
Could there be other packages that need to be updated as well?
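To make the environment easier to compare, a small script like the one below can list the installed versions of the packages most likely to matter. The package list is our guess at what is relevant (it is not taken from the project's requirements), and the function name `installed_versions` is ours.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed list of relevant distributions; adjust to match the project's
# actual requirements file.
PACKAGES = ("transformers", "torch", "flash-attn", "accelerate", "tokenizers")


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running it in the environment that produces the error prints one line per package, which can be pasted into the issue alongside the traceback.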