Use fast softmax only on prefill #1180

wszczurekhabana · 2024-08-01T13:55:04Z

Original description:

Running fast softmax during decode can currently degrade performance on some configurations, so this PR disables it for decode and keeps it enabled for prefill only.
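A minimal sketch of the gating idea described above, not the actual diff from this PR. The helper name `select_softmax_mode`, the `"fast"` / `"None"` mode strings, and the use of `q_len > 1` to detect prefill are all assumptions made for illustration.

```python
def select_softmax_mode(q_len: int, fast_softmax_enabled: bool) -> str:
    """Pick a softmax mode for one attention call (hypothetical helper).

    Assumption: prefill processes the whole prompt at once (q_len > 1),
    while decode processes a single new token per step (q_len == 1).
    """
    is_prefill = q_len > 1
    # Fast softmax is used only when the user enabled it AND we are in prefill.
    if fast_softmax_enabled and is_prefill:
        return "fast"
    # Decode steps always fall back to the regular softmax.
    return "None"


# Usage: a 4096-token prompt followed by single-token decode steps.
prefill_mode = select_softmax_mode(q_len=4096, fast_softmax_enabled=True)
decode_mode = select_softmax_mode(q_len=1, fast_softmax_enabled=True)
```

With this kind of check, a user-facing "fast softmax" option affects only the prompt-processing pass, and decode steps behave as if the option were off.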

Results:

| Model | Batch Size | Nodes | Performance with Fast Softmax (Prefill Only) | Performance without Fast Softmax |
|---|---|---|---|---|
| Llama 2-70B 31744/1024 Tokens | 12 | 8 | 105.27 | 97.08 |
| Llama 2-70B 24576/8192 Tokens | 16 | 8 | 418.95 | 405.55 |
| Llama 2-70B 16384/16384 Tokens | 24 | 8 | 673.99 | 665.84 |
| Llama 2-70B 4096/4096 Tokens | 16 | 2 | 304.17 | 303.75 |
| Llama 2-70B 4096/4096 Tokens | 59 | 4 | 1149.75 | 1147.10 |
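To make the results easier to compare, the relative gain of each row can be computed as `(with / without - 1) * 100`. The short script below is only a convenience sketch over the numbers reported above; the variable names are illustrative, and it assumes that a higher performance figure is better (the unit is not stated in the description).

```python
# (config, batch size, nodes, with fast softmax on prefill only, without fast softmax)
results = [
    ("Llama 2-70B 31744/1024 Tokens", 12, 8, 105.27, 97.08),
    ("Llama 2-70B 24576/8192 Tokens", 16, 8, 418.95, 405.55),
    ("Llama 2-70B 16384/16384 Tokens", 24, 8, 673.99, 665.84),
    ("Llama 2-70B 4096/4096 Tokens", 16, 2, 304.17, 303.75),
    ("Llama 2-70B 4096/4096 Tokens", 59, 4, 1149.75, 1147.10),
]

# Relative gain in percent for each row, in table order.
gains = [(with_fs / without_fs - 1.0) * 100.0 for _, _, _, with_fs, without_fs in results]

for (config, batch, nodes, _, _), gain in zip(results, gains):
    print(f"{config} (batch {batch}, {nodes} nodes): {gain:+.2f}%")
```

The gain is largest for the configuration with the longest prompt relative to the generated length and shrinks toward zero for the 4096/4096 rows.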

wszczurekhabana · 2024-08-01T14:03:31Z

@libinta could you add 1.17_dependency label to this PR?

yafshar · 2024-08-02T12:04:27Z

@wszczurekhabana please close this PR; it is the same as #1159.

yafshar · 2024-08-02T12:09:57Z

@libinta #1159 should have the correct label

wszczurekhabana · 2024-08-05T07:44:59Z

Thanks, I did not see #1159, closing.

Use fast softmax only on prefill

aa8752e

wszczurekhabana requested review from mandy-li, libinta and dvarshney-habana as code owners August 1, 2024 13:55

wszczurekhabana mentioned this pull request Aug 1, 2024

Use fast softmax only on prefill HabanaAI/optimum-habana-fork#244

Merged

libinta added the synapse 1.17_dependency label (PR not backward compatible; can be merged only when synapse 1.17 is available) Aug 1, 2024

vidyasiv added a commit to emascarenhas/optimum-habana that referenced this pull request Aug 2, 2024

Merge branch '1180' into syn1.17tr4.43

657f52d

Use fast softmax only on prefill huggingface#1180

wszczurekhabana closed this Aug 5, 2024
