
gpt_big_code: make flash attention impl quantization friendly #5117

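The PR title suggests the GPTBigCode flash attention path was restructured so that quantization tooling can hook its matrix multiplications. A minimal sketch of that general pattern, assuming the approach is to expose each matmul as a named nn.Module submodule (the Matmul and QuantFriendlyAttention names below are illustrative and not taken from the PR or from optimum-habana):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Matmul(nn.Module):
    """Thin wrapper around torch.matmul so quantization tooling that
    patches or observes nn.Module calls can hook this op."""

    def forward(self, a, b):
        return torch.matmul(a, b)


class QuantFriendlyAttention(nn.Module):
    """Illustrative single-head attention where every matmul is a named
    submodule instead of a bare functional call."""

    def __init__(self, embed_dim):
        super().__init__()
        self.q_proj = nn.Linear(embed_dim, embed_dim)
        self.k_proj = nn.Linear(embed_dim, embed_dim)
        self.v_proj = nn.Linear(embed_dim, embed_dim)
        self.qk_matmul = Matmul()  # hookable score matmul
        self.av_matmul = Matmul()  # hookable value matmul
        self.scale = embed_dim ** -0.5

    def forward(self, x):
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = self.qk_matmul(q, k.transpose(-1, -2)) * self.scale
        probs = F.softmax(scores, dim=-1)
        return self.av_matmul(probs, v)


if __name__ == "__main__":
    attn = QuantFriendlyAttention(64)
    out = attn(torch.randn(2, 16, 64))
    print(out.shape)  # torch.Size([2, 16, 64])
```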

Annotations: 2 warnings

Run tests for optimum.habana.transformers: succeeded Sep 25, 2024 in 5m 28s