After setting up the environment as instructed, I successfully pruned llama2-7b using Wanda without any issues. However, when attempting to prune llama2-70b, the following error occurred:
Traceback (most recent call last):
File "/home/jovyan/projects/BYD/wanda-main/main.py", line 110, in
main()
File "/home/jovyan/projects/BYD/wanda-main/main.py", line 69, in main
prune_wanda(args, model, tokenizer, device, prune_n=prune_n, prune_m=prune_m)
File "/home/jovyan/projects/BYD/wanda-main/lib/prune.py", line 160, in prune_wanda
outs[j] = layer(inps[j].unsqueeze(0), attention_mask=attention_mask, position_ids=position_ids)[0]
File "/opt/conda/envs/prune_llm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
[...]
File "/opt/conda/envs/prune_llm/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 197, in forward
key_states = self.k_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
Run command:
python main.py \
    --model ../weights/llama-2-70b-hf \
    --prune_method wanda \
    --sparsity_ratio 0.5 \
    --sparsity_type unstructured \
    --save ../weights/wanda/ \
    --save_model ../weights/wanda_70b/
Could you please help me understand why this error occurred? Do I need to upgrade the environment, especially the transformers library? Your assistance is appreciated.
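For context, a likely cause (my reading of the trace, not confirmed by the repo authors): llama-2-70b uses grouped-query attention, so its k_proj/v_proj project to num_key_value_heads * head_dim = 1024 features rather than hidden_size = 8192, while transformers releases that predate GQA support still reshape the projection into num_attention_heads = 64 heads. The sketch below reproduces the shape arithmetic under the standard 70B config values (hidden_size 8192, 64 attention heads, 8 KV heads, head_dim 128); the tensor sizes are illustrative, not taken from the actual run.

```python
import torch

# Assumed llama-2-70b config values (not taken from the issue itself):
bsz, q_len, hidden_size = 1, 2048, 8192
num_heads, num_kv_heads, head_dim = 64, 8, 128

hidden_states = torch.randn(bsz, q_len, hidden_size)

# The 70B checkpoint stores k_proj as hidden_size -> num_kv_heads * head_dim (8192 -> 1024).
k_proj = torch.nn.Linear(hidden_size, num_kv_heads * head_dim, bias=False)
key_states = k_proj(hidden_states)                       # shape (1, 2048, 1024)

# Pre-GQA modeling_llama.py reshapes with num_heads, which no longer fits the tensor:
# key_states.view(bsz, q_len, num_heads, head_dim)       # RuntimeError: invalid shape
# GQA-aware versions reshape with num_key_value_heads instead:
key_states = key_states.view(bsz, q_len, num_kv_heads, head_dim).transpose(1, 2)  # OK
```

If that is what is happening, upgrading transformers to a release with num_key_value_heads support in LlamaConfig (4.31 or later, if I remember correctly) should resolve it, which matches the follow-up below.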
After upgrading the transformers library, the previous error disappeared, but a new error has surfaced:
RuntimeError: CUDA out of memory. Tried to allocate 4.00 GiB (GPU 0; 39.39 GiB total capacity; 34.73 GiB already allocated; 2.60 GiB free; 34.76 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
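In case it is useful to anyone hitting the same wall: a single 40 GiB card cannot hold the fp16 70B shard assigned to it plus the calibration activations, so the usual workaround is to give accelerate more GPUs to shard over, optionally combined with the allocator hint from the error message. Below is a minimal sketch of how one might load the model across several GPUs; it is a standalone illustration under those assumptions, not the exact loading code in the repo's main.py.

```python
import os

# Allocator hint from the error message; it helps with fragmentation, not a true
# capacity shortfall, and must be set before torch initializes CUDA.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:512")

import torch
from transformers import AutoModelForCausalLM

# device_map="auto" lets accelerate spread the fp16 weights over every visible GPU
# (and CPU, if needed) instead of packing everything onto GPU 0.
model = AutoModelForCausalLM.from_pretrained(
    "../weights/llama-2-70b-hf",   # path taken from the run command above
    torch_dtype=torch.float16,
    device_map="auto",
    low_cpu_mem_usage=True,
)
print(model.hf_device_map)         # shows which layers landed on which device
```

With enough GPUs visible (roughly 4-5 x 40 GiB just for the fp16 weights, plus headroom for the per-layer calibration activations), the forward pass in prune.py should then fit.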