Reminder
I have read the README and searched the existing issues.
System Info
[2024-09-13 18:40:14,881] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
WARNING: BNB_CUDA_VERSION=121 environment variable detected; loading libbitsandbytes_cuda121.so.
This can be used to load a bitsandbytes version that is different from the PyTorch CUDA version.
If this was unintended set the BNB_CUDA_VERSION variable to an empty string: export BNB_CUDA_VERSION=
If you use the manual override make sure the right libcudart.so is in your LD_LIBRARY_PATH
For example by adding the following to your .bashrc: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path_to_cuda_dir/lib64
llamafactory version: 0.8.4.dev0
Reproduction
config file (an illustrative sketch is given after the output below)
run training
llamafactory-cli train examples/train_full/gemma2_full_sft.yaml
output
[INFO|trainer.py:648] 2024-09-13 18:34:56,525 >> Using auto half precision backend
[WARNING|<string>:213] 2024-09-13 18:34:56,905 >> ==((====))== Unsloth - 2x faster free finetuning | Num GPUs = 1
\\ /| Num examples = 81 | Num Epochs = 3
O^O/ \_/ \ Batch size per device = 1 | Gradient Accumulation steps = 2
\ / Total batch size = 2 | Total steps = 120
"-____-" Number of trainable parameters = 0
0%| | 0/120 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/miniconda3/envs/unsloth_env/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/root/LLaMA-Factory/src/llamafactory/cli.py", line 111, in main
    run_exp()
  File "/root/LLaMA-Factory/src/llamafactory/train/tuner.py", line 50, in run_exp
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/root/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 96, in run_sft
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/root/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/transformers/trainer.py", line 1938, in train
    return inner_training_loop(
  File "<string>", line 363, in _fast_inner_training_loop
  File "/root/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/transformers/trainer.py", line 3349, in training_step
    self.accelerator.backward(loss, **kwargs)
  File "/root/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/accelerator.py", line 2159, in backward
    loss.backward(**kwargs)
  File "/root/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/_tensor.py", line 521, in backward
    torch.autograd.backward(
  File "/root/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/autograd/__init__.py", line 289, in backward
    _engine_run_backward(
  File "/root/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/autograd/graph.py", line 768, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
0%| | 0/120 [00:27<?, ?it/s]
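Illustrative config sketch (not part of the original report): the issue does not include the contents of the config file, so the following is only a guess at what a full-SFT Gemma-2 config in the style of LLaMA-Factory's examples/train_full templates might look like. Field names follow LLaMA-Factory's documented YAML options; the model name, dataset, output path, and the use_unsloth flag are assumptions inferred from the command line and the Unsloth banner in the output above.
### hypothetical examples/train_full/gemma2_full_sft.yaml -- illustrative sketch, not the reporter's file
### model
model_name_or_path: google/gemma-2-9b-it   # assumed
### method
stage: sft
do_train: true
finetuning_type: full
use_unsloth: true                          # assumed from the Unsloth banner in the output
### dataset
dataset: alpaca_en_demo                    # assumed; the log reports 81 training examples
template: gemma
cutoff_len: 1024
### output
output_dir: saves/gemma2/full/sft          # assumed
logging_steps: 10
save_steps: 500
### train
per_device_train_batch_size: 1             # matches "Batch size per device = 1" in the output
gradient_accumulation_steps: 2             # matches "Gradient Accumulation steps = 2"
num_train_epochs: 3.0                      # matches "Num Epochs = 3"
learning_rate: 1.0e-5
bf16: true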
Expected behavior
No response
Others
No response