torch_extensions/py38_cu113/fused_adam/fused_adam.so: cannot open shared object file #196
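Errors like the one in the issue title (`fused_adam.so: cannot open shared object file`) typically mean the JIT-compiled DeepSpeed op in the torch extensions cache is stale or was only partially built; deleting the cached build and letting DeepSpeed recompile on the next launch is a common workaround (not necessarily the fix posted in this thread). The directory name in the error (`py38_cu113`) encodes the Python and CUDA versions; the helper below reconstructs that tag so you can locate the broken build. The naming scheme and default cache location are assumptions inferred from the path in the error message — check your own machine.

```python
import os

def torch_extensions_tag(py_version: str, cuda_version: str) -> str:
    # "3.8" -> "py38", "11.3" -> "cu113", joined as "py38_cu113".
    # Assumed naming scheme, inferred from the error-message path.
    py = "py" + "".join(py_version.split(".")[:2])
    cu = "cu" + cuda_version.replace(".", "")
    return f"{py}_{cu}"

tag = torch_extensions_tag("3.8", "11.3")
# Assumed default cache location; TORCH_EXTENSIONS_DIR overrides it if set.
stale = os.path.expanduser(f"~/.cache/torch_extensions/{tag}/fused_adam")
print(tag)    # py38_cu113
print(stale)  # candidate directory to delete (rm -rf) before relaunching
```

After removing the stale directory, rerunning the training command triggers a fresh compile of `fused_adam`, which usually surfaces the real build problem (e.g. a missing CUDA toolkit) if one exists.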
This worked for me today:
Thanks for your prompt response. I ran the above code but received this error message:
Can you let me know how to fix this? By the way, what command are you using to install the Python packages needed for mistral? Are you using …? Thanks.
I updated the branch to fix those configuration issues. And yes, I think I did that pip install as well and forgot to include it in the setup.
I'm probably going to update the main branch to do this and move the current contents of main into a separate branch.
So main should be something like mistral-flash-dec-2022 ...
Hello, thanks so much for working on this! The code you provided above works!
Hello, I installed your package using `setup/setup.sh`. The single-GPU command in the tutorial works fine, but when I run the multi-GPU command

```shell
deepspeed --num_gpus 8 --num_nodes 2 --master_addr machine1 train.py \
  --config conf/tutorial-gpt2-micro.yaml \
  --nnodes 2 --nproc_per_node 8 \
  --training_arguments.fp16 true \
  --training_arguments.per_device_train_batch_size 4 \
  --training_arguments.deepspeed conf/deepspeed/z2-small-conf.json \
  --run_id tutorial-gpt2-micro-multi-node
```

I received an error message saying that … I also tried running the same code in the same environment but on a different machine, and this time I got the error message … Do you have any idea how to resolve this issue? I installed all packages using `setup/setup.sh`, so I assume my package versions match what you included in the requirements files. Thanks!
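For a two-node launch like the one above, DeepSpeed's documented multi-node path uses a hostfile (one `<hostname> slots=<gpus>` line per node) plus passwordless SSH between nodes; passing `--master_addr` with `--num_nodes 2` from a single machine will not by itself start ranks on the second node. A minimal sketch — the hostnames `machine1`/`machine2` and the hostfile path are assumptions, not from this thread:

```shell
# Create a DeepSpeed hostfile: one line per node, slots = GPUs on that node.
cat > hostfile <<'EOF'
machine1 slots=8
machine2 slots=8
EOF
```

Then launch with `deepspeed --hostfile hostfile train.py ...` using the same training flags as in the command above. Note that the torch extensions cache is per-machine, so a stale `fused_adam` build may need to be cleared on every node.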