[WIP, do not merge] Integrate FlashAttention into OPT fine-tuning #1

Open
DanFu09 wants to merge 2 commits into main
Conversation

DanFu09 commented Nov 10, 2022

This PR integrates FlashAttention into OPT fine-tuning, starting with causal self-attention only.

To test this, install FlashAttention (requires CUDA 11, NVCC, and a Turing or Ampere GPU):

git clone https://github.com/HazyResearch/flash-attention.git
cd flash-attention
python setup.py install
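
Once installed, a quick way to sanity-check the kernel before wiring it into the OPT attention module is to call FlashAttention's unpadded QKV-packed interface directly. This is a minimal sketch only: the shapes, dtype, and cu_seqlens construction are assumptions for a smoke test, not code from this PR.

import torch
from flash_attn.flash_attn_interface import flash_attn_unpadded_qkvpacked_func

batch, seqlen, nheads, headdim = 2, 512, 16, 64

# Packed QKV, shape (total_tokens, 3, nheads, headdim); the kernel
# requires fp16/bf16 tensors on a CUDA device.
qkv = torch.randn(batch * seqlen, 3, nheads, headdim,
                  dtype=torch.float16, device="cuda", requires_grad=True)

# Cumulative sequence lengths marking where each sequence starts;
# all sequences are the same length here for simplicity.
cu_seqlens = torch.arange(0, (batch + 1) * seqlen, step=seqlen,
                          dtype=torch.int32, device="cuda")

out = flash_attn_unpadded_qkvpacked_func(
    qkv, cu_seqlens, max_seqlen=seqlen, dropout_p=0.0, causal=True)

print(out.shape)        # (batch * seqlen, nheads, headdim)
out.sum().backward()    # gradients flow through the fused kernel

With causal=True the kernel applies a standard causal mask, which is the case this PR targets first.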

DanFu09 changed the title from "Integrate FlashAttention into OPT fine-tuning" to "[WIP, do not merge] Integrate FlashAttention into OPT fine-tuning" on Nov 10, 2022