
issue with training with flash attention on #6590

Open
1 task done
ashunaveed opened this issue Dec 19, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@ashunaveed

Describe the bug

While training a Llama 3.1 model on a text file with flash attention on (rank 1024, micro-batch size 1, 2, or 4), training fails with:

TypeError: _flash_attention_forward() got an unexpected keyword argument 'num_items_in_batch'

Despite the error, the log afterwards still reports "Training complete, saving" and "Training complete!". The full traceback is in the Logs section below.

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

Train on a plain text file with flash attention on. A minimal standalone sketch of such a run is included below.
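The failure can presumably be reproduced outside the webui with a plain transformers Trainer run; the sketch below is a hypothetical minimal reproduction, not the report's actual setup. The model id, data file name, sequence length, and training arguments are placeholders, and it assumes a transformers release that passes num_items_in_batch through model kwargs while the active flash-attention forward does not accept it.

```python
# Hypothetical standalone sketch (not the exact text-generation-webui training
# code). It assumes a transformers version that forwards loss kwargs such as
# num_items_in_batch into the model while the flash-attention forward in use
# does not accept that keyword. Model id, file name, and hyperparameters are
# placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "meta-llama/Llama-3.1-8B"  # placeholder Llama 3.1 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    attn_implementation="flash_attention_2",  # flash attention on, as in the report
    torch_dtype="auto",
    device_map="auto",
)

# Plain text file as the training data, as in the original report.
dataset = load_dataset("text", data_files={"train": "train.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="out",
        per_device_train_batch_size=1,  # micro-batch size 1 (also fails with 2 or 4)
        num_train_epochs=1,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

# On the affected version combination this raises:
# TypeError: _flash_attention_forward() got an unexpected keyword argument 'num_items_in_batch'
trainer.train()
```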

Screenshot

No response

Logs

/Documents/text-gen/installer_files/env/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 460, in forward
    attn_output = _flash_attention_forward(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: _flash_attention_forward() got an unexpected keyword argument 'num_items_in_batch'

22:42:13-041312 INFO     Training complete, saving                                                                                                                                      
22:42:15-181788 INFO     Training complete!

System Info

2× NVIDIA RTX 4090 GPUs
ashunaveed added the bug (Something isn't working) label on Dec 19, 2024