
run 70b error: RuntimeError: shape '[1, 4096, 64, 128]' is invalid for input of size 4194304 #33

Open · JiaQuan1203 opened this issue Jan 9, 2024 · 2 comments

Comments

@JiaQuan1203

After setting up the environment as instructed, I successfully pruned llama2-7b using Wanda without any issues. However, when attempting to prune llama2-70b, the following error occurred:
Traceback (most recent call last):
  File "/home/jovyan/projects/BYD/wanda-main/main.py", line 110, in <module>
    main()
  File "/home/jovyan/projects/BYD/wanda-main/main.py", line 69, in main
    prune_wanda(args, model, tokenizer, device, prune_n=prune_n, prune_m=prune_m)
  File "/home/jovyan/projects/BYD/wanda-main/lib/prune.py", line 160, in prune_wanda
    outs[j] = layer(inps[j].unsqueeze(0), attention_mask=attention_mask, position_ids=position_ids)[0]
  File "/opt/conda/envs/prune_llm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  [...]
  File "/opt/conda/envs/prune_llm/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 197, in forward
    key_states = self.k_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
RuntimeError: shape '[1, 4096, 64, 128]' is invalid for input of size 4194304
Run command:
python main.py \
    --model ../weights/llama-2-70b-hf \
    --prune_method wanda \
    --sparsity_ratio 0.5 \
    --sparsity_type unstructured \
    --save ../weights/wanda/ \
    --save_model ../weights/wanda_70b/
Could you please help me understand why this error occurred? Do I need to upgrade the environment, in particular the transformers library? Your assistance is appreciated.
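
For context on the shape error: the numbers line up with grouped-query attention. llama-2-70b uses 64 attention heads but only 8 key/value heads with head_dim 128, so k_proj produces 8 × 128 = 1024 features per token, while the view in the traceback assumes num_heads × head_dim = 64 × 128. transformers releases that predate GQA support hit exactly this mismatch; later versions reshape key/value states with num_key_value_heads instead. A minimal sketch of the arithmetic (head counts are assumptions taken from the released llama-2-70b config; batch size and sequence length come from the error message):

```python
# Why view(1, 4096, 64, 128) fails on llama-2-70b's k_proj output.
# Assumed config values for llama-2-70b: 64 attention heads,
# 8 key/value heads, head_dim = 128.
bsz, q_len = 1, 4096
num_heads, num_kv_heads, head_dim = 64, 8, 128

# With grouped-query attention, k_proj projects to num_kv_heads * head_dim
# features, so its output holds 1 * 4096 * 8 * 128 elements:
actual = bsz * q_len * num_kv_heads * head_dim
print(actual)    # 4194304  -- the "input of size" in the error

# Pre-GQA modeling_llama.py reshapes with num_heads, which needs 8x more:
expected = bsz * q_len * num_heads * head_dim
print(expected)  # 33554432 -- what shape [1, 4096, 64, 128] requires
```

If that is the cause, upgrading transformers to a release with Llama GQA support (around 4.31 or later) should make this particular error go away, which matches the follow-up below.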

@JiaQuan1203 (Author)

After upgrading the transformers library, the previous error disappeared, but a new error has surfaced:
RuntimeError: CUDA out of memory. Tried to allocate 4.00 GiB (GPU 0; 39.39 GiB total capacity; 34.73 GiB already allocated; 2.60 GiB free; 34.76 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
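
The OOM itself is unsurprising on a single 40 GiB card: 70B parameters in fp16 are roughly 140 GiB of weights before any activations. One common workaround, sketched below under the assumption that several GPUs are visible, is to let accelerate shard the checkpoint across devices with device_map="auto" (the model path is the one from the original command):

```python
# Minimal sketch, not the wanda repo's exact loading code: shard the fp16
# checkpoint across all visible GPUs instead of placing it on GPU 0 alone.
# Requires `pip install accelerate`; the model path follows the command above.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "../weights/llama-2-70b-hf",
    torch_dtype=torch.float16,   # ~2 bytes/param -> ~140 GiB for 70B weights
    device_map="auto",           # lets accelerate split layers across GPUs
    low_cpu_mem_usage=True,      # stream weights rather than materializing twice
)
```

Even sharded, the weights alone need on the order of four 40 GiB GPUs, plus headroom for calibration activations, so a single-GPU run cannot fit llama-2-70b in half precision.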


pprp commented Jan 27, 2024

Same error even after upgrading the transformers library.
