Load Phi 3 small on Nvidia Tesla V100 - Flash Attention #1956

BigDataMLexplorer · 2024-07-25T12:05:53Z

BigDataMLexplorer
Jul 25, 2024

Hi,
I would like to inquire about the possibility of uploading and fine tuning a Phi 3 8k small. When I load the model, I get an error about missing Flash attention. If I want to install the given package, I get this error :

RuntimeError: FlashAttention is only supported on CUDA 11.6 and above.  Note: make sure nvcc has a supported version by running nvcc -V.


      torch.__version__  = 2.3.1+cu121

But I have the required version of pytorch and CUDA (torch 2.3.1 and cuda 12.1)
Is it because I am using a Tesla V100 graphics card? Is there any way to load the model also with this graphics card?
I found this in the documentation for the Phi 3 mini on Huggingface:

If you want to run the model on:

NVIDIA V100 or earlier generation GPUs: call AutoModelForCausalLM.from_pretrained() with attn_implementation="eager"

Does this also apply to the Phi3 Small 8k?? Beacause when I try to load it, the error occurs

model = AutoModelForSequenceClassification.from_pretrained("path", num_labels=num_labels ,attn_implementation="eager" )

AssertionError: Flash Attention is not available, but is needed for dense attention

Or should I try the ONNX version or it is just for inference?
Thank you.

BenjaminBossan · 2024-07-25T12:35:30Z

BenjaminBossan
Jul 25, 2024
Maintainer

Hey, this seems to be unrelated to PEFT, right? So please open a discussion for transformers instead, which looks like the right place.

Or should I try the ONNX version or it is just for inference?

ONNX is not for training, so it's not an option here.

4 replies

BigDataMLexplorer Jul 25, 2024
Author

Hi, yes. I dont know if it is some bug or not, but I don't see discussion section in trasnformers and other huggingface packages, only the peft.

BenjaminBossan Jul 25, 2024
Maintainer

It could possibly be a bug, as you follow the instructions and still get an error.

These types of questions can be asked here: https://discuss.huggingface.co/

If you don't get a response, consider creating an issue. Of course, always first search if a similar issue already exists.

BigDataMLexplorer Jul 25, 2024
Author

I was thinking more like, does the transformers library have a discussion section on github too? I don't see it in the library bar like I do for peft, so I created an issue for transformers.

BenjaminBossan Jul 25, 2024
Maintainer

No, discussions are not activated for the transformers repo. Good luck with this issue.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Load Phi 3 small on Nvidia Tesla V100 - Flash Attention #1956

{{title}}

Replies: 1 comment 4 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

Load Phi 3 small on Nvidia Tesla V100 - Flash Attention #1956

BigDataMLexplorer Jul 25, 2024

Replies: 1 comment · 4 replies

BenjaminBossan Jul 25, 2024 Maintainer

BigDataMLexplorer Jul 25, 2024 Author

BenjaminBossan Jul 25, 2024 Maintainer

BigDataMLexplorer Jul 25, 2024 Author

BenjaminBossan Jul 25, 2024 Maintainer

BigDataMLexplorer
Jul 25, 2024

Replies: 1 comment 4 replies

BenjaminBossan
Jul 25, 2024
Maintainer

BigDataMLexplorer Jul 25, 2024
Author

BenjaminBossan Jul 25, 2024
Maintainer

BigDataMLexplorer Jul 25, 2024
Author

BenjaminBossan Jul 25, 2024
Maintainer