Added prompt based testing of text generation models #452

MohitIntel · 2023-10-06T22:16:23Z

What does this PR do?

Enables prompt-based tests for text generation models and compares their performance against baseline numbers.

The DeepSpeed prompt case is disabled for now.

HuggingFaceDocBuilderDev · 2023-10-06T22:23:04Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

regisss

@MohitIntel Thanks for opening this PR as this is truly needed! I think we should measure the throughput and the accuracy in the same test (kind of the same as it is done for training). How long does it take to generate 1024 tokens with this prompt?
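To make the suggestion above concrete, here is a minimal sketch of a single check that covers both throughput and accuracy against baseline values. It is not the code from this PR: `GenerationResult`, `check_against_baseline`, the 5% tolerance, and the exact-match comparison of generated text are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GenerationResult:
    """Hypothetical container for what one generation run reports."""

    throughput: float  # tokens per second
    output: str  # text generated for the fixed prompt


def check_against_baseline(result, baseline_throughput, baseline_output, tolerance=0.05):
    """Return a list of failure messages; an empty list means the run passed."""
    failures = []
    # Performance: allow the run to be at most `tolerance` slower than the baseline.
    min_throughput = (1 - tolerance) * baseline_throughput
    if result.throughput < min_throughput:
        failures.append(f"throughput {result.throughput:.2f} < {min_throughput:.2f}")
    # Accuracy: assumed here to mean an exact match with the stored baseline text.
    if result.output != baseline_output:
        failures.append("generated text differs from the baseline output")
    return failures


ok = GenerationResult(throughput=100.0, output="hello world")
print(check_against_baseline(ok, baseline_throughput=102.0, baseline_output="hello world"))
```

Returning a list of failures instead of asserting on the first problem lets one run report a throughput regression and an output mismatch together.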

MohitIntel · 2023-10-10T19:07:37Z

> @MohitIntel Thanks for opening this PR as this is truly needed! I think we should measure the throughput and the accuracy in the same test (kind of the same as it is done for training). How long does it take to generate 1024 tokens with this prompt?

@regisss, I enabled the accuracy check in the latest commit. The total time for running the prompt-based perf and accuracy tests for both bloomz-7b1 and llama-2-13b-hf is nearly 26 minutes.

Here is the timing breakdown for generating 1024 tokens:
- bloomz-7b1, bf16: 5m:36s
- bloomz-7b1, fp32: 8m:19s
- llama-2-13b-hf, bf16: 4m:39s
- llama-2-13b-hf, fp32: 7m:12s
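As a quick sanity check on the "nearly 26 minutes" figure, the four timings above can be summed. The helper `to_seconds` is only an illustrative name, and the `Xm:YYs` parsing assumes the format used in this comment.

```python
# Timings copied from the breakdown above (model, dtype) -> duration.
timings = {
    ("bloomz-7b1", "bf16"): "5m:36s",
    ("bloomz-7b1", "fp32"): "8m:19s",
    ("llama-2-13b-hf", "bf16"): "4m:39s",
    ("llama-2-13b-hf", "fp32"): "7m:12s",
}


def to_seconds(text):
    """Convert a string such as '5m:36s' to a number of seconds."""
    minutes, seconds = text.split(":")
    return int(minutes.rstrip("m")) * 60 + int(seconds.rstrip("s"))


total = sum(to_seconds(t) for t in timings.values())
print(f"{total // 60}m:{total % 60:02d}s")  # prints 25m:46s
```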

regisss · 2023-10-11T10:33:42Z

@MohitIntel Could we reduce the size of the prompt and the number of tokens to generate? 128 for both could be good enough, no?
I'm asking because I will duplicate the CI at some point so that it runs on main branches and on stable releases.

MohitIntel added 4 commits October 4, 2023 10:35

Test text generation using prompts

22ad463

Update model path

f18646d

Update baseline numbers

a0f6f96

Update test_text_generation_example.py

1732836

Disabling deepspeed prompt case for now

MohitIntel requested a review from regisss as a code owner October 6, 2023 22:16

Fixed styling

a0280b0

MohitIntel requested review from ssarkar2 and libinta October 6, 2023 22:25

regisss reviewed Oct 8, 2023

Enabled accuracy checks

077be84

Added gaudi2 config in the baseline json

30afc0a

MohitIntel closed this Apr 23, 2024
