After setting up the environment as instructed, I successfully pruned llama2-7b using Wanda without any issues. However, when attempting to prune llama2-70b, the following error occurred:
Traceback (most recent call last):
File "/home/jovyan/projects/BYD/wanda-main/main.py", line 110, in
main()
File "/home/jovyan/projects/BYD/wanda-main/main.py", line 69, in main
prune_wanda(args, model, tokenizer, device, prune_n=prune_n, prune_m=prune_m)
File "/home/jovyan/projects/BYD/wanda-main/lib/prune.py", line 160, in prune_wanda
outs[j] = layer(inps[j].unsqueeze(0), attention_mask=attention_mask, position_ids=position_ids)[0]
File "/opt/conda/envs/prune_llm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
[...]
File "/opt/conda/envs/prune_llm/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 197, in forward
key_states = self.k_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
Run command:
python main.py \
    --model ../weights/llama-2-70b-hf \
    --prune_method wanda \
    --sparsity_ratio 0.5 \
    --sparsity_type unstructured \
    --save ../weights/wanda/ \
    --save_model ../weights/wanda_70b/
Could you please help me understand why this error occurred? Do I need to upgrade the environment, especially the transformers library? Your assistance is appreciated.
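For context, a likely cause (my reading of the trace, not confirmed by the repo authors): llama-2-70b uses grouped-query attention, so its k_proj/v_proj project to num_key_value_heads * head_dim = 1024 features rather than hidden_size = 8192, while transformers releases that predate GQA support still reshape the projection into num_attention_heads = 64 heads. The sketch below reproduces the shape arithmetic under the standard 70B config values (hidden_size 8192, 64 attention heads, 8 KV heads, head_dim 128); the tensor sizes are illustrative, not taken from the actual run.

```python
import torch

# Assumed llama-2-70b config values (not taken from the issue itself):
bsz, q_len, hidden_size = 1, 2048, 8192
num_heads, num_kv_heads, head_dim = 64, 8, 128

hidden_states = torch.randn(bsz, q_len, hidden_size)

# The 70B checkpoint stores k_proj as hidden_size -> num_kv_heads * head_dim (8192 -> 1024).
k_proj = torch.nn.Linear(hidden_size, num_kv_heads * head_dim, bias=False)
key_states = k_proj(hidden_states)                       # shape (1, 2048, 1024)

# Pre-GQA modeling_llama.py reshapes with num_heads, which no longer fits the tensor:
# key_states.view(bsz, q_len, num_heads, head_dim)       # RuntimeError: invalid shape
# GQA-aware versions reshape with num_key_value_heads instead:
key_states = key_states.view(bsz, q_len, num_kv_heads, head_dim).transpose(1, 2)  # OK
```

If that is what is happening, upgrading transformers to a release with num_key_value_heads support in LlamaConfig (4.31 or later, if I remember correctly) should resolve it, which matches the follow-up below.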
After upgrading the transformers library, the previous error disappeared, but a new error has surfaced:
RuntimeError: CUDA out of memory. Tried to allocate 4.00 GiB (GPU 0; 39.39 GiB total capacity; 34.73 GiB already allocated; 2.60 GiB free; 34.76 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
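In case it is useful to anyone hitting the same wall: a single 40 GiB card cannot hold the fp16 70B shard assigned to it plus the calibration activations, so the usual workaround is to give accelerate more GPUs to shard over, optionally combined with the allocator hint from the error message. Below is a minimal sketch of how one might load the model across several GPUs; it is a standalone illustration under those assumptions, not the exact loading code in the repo's main.py.

```python
import os

# Allocator hint from the error message; it helps with fragmentation, not a true
# capacity shortfall, and must be set before torch initializes CUDA.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:512")

import torch
from transformers import AutoModelForCausalLM

# device_map="auto" lets accelerate spread the fp16 weights over every visible GPU
# (and CPU, if needed) instead of packing everything onto GPU 0.
model = AutoModelForCausalLM.from_pretrained(
    "../weights/llama-2-70b-hf",   # path taken from the run command above
    torch_dtype=torch.float16,
    device_map="auto",
    low_cpu_mem_usage=True,
)
print(model.hf_device_map)         # shows which layers landed on which device
```

With enough GPUs visible (roughly 4-5 x 40 GiB just for the fp16 weights, plus headroom for the per-layer calibration activations), the forward pass in prune.py should then fit.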