Automatically apply chat template in non-chat scenarios #1533
base: master
Conversation
Force-pushed from f1ece12 to e5fa889
If applying a chat template is supposed to be the default behavior of the .generate() method, it is not aligned with the HF Transformers lib. We should turn it off in the tools at least (both WWB and LLM-Bench).
What about the HF e2e pipeline?
What if it's an instruction model?
Double-checked, and it seems like HF changed the behaviour at some point for the text-generation pipeline. Details. But the input must be formatted appropriately to trigger chat template usage: if the user just passes string data, no chat template is applied.
Do you think it's better to add an explicit flag, then? `pipe.generate(prompt, apply_chat_template=True, max_new_tokens=40)`
This option looks good to me, but for a drop-in replacement of the HF API with OV GenAI it is better to follow the HF approach with the message format. In any case, they should have more experience and user feedback.
Should both ways be added - the possibility to put
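The two ideas discussed above (an explicit flag and the HF message-format convention) could coexist. A minimal sketch in plain Python of the dispatch logic, assuming a hypothetical helper name (`should_apply_chat_template`) that is not part of the actual OV GenAI or HF API:

```python
def should_apply_chat_template(inputs, apply_chat_template=None):
    """Decide whether generation should run the tokenizer's chat template.

    Mirrors the HF text-generation pipeline convention: a plain string is
    generated as-is, while a list of {"role": ..., "content": ...} messages
    triggers chat formatting. An explicit flag, when given, overrides the
    format-based detection. (Hypothetical helper for illustration only.)
    """
    if apply_chat_template is not None:
        return apply_chat_template
    return (
        isinstance(inputs, list)
        and len(inputs) > 0
        and all(
            isinstance(m, dict) and "role" in m and "content" in m
            for m in inputs
        )
    )


# A plain string never triggers the template; a message list does;
# the explicit flag wins in either case.
print(should_apply_chat_template("What is OpenVINO?"))
print(should_apply_chat_template([{"role": "user", "content": "Hi"}]))
print(should_apply_chat_template("Hi", apply_chat_template=True))
```

With this shape, `pipe.generate(prompt, apply_chat_template=True, ...)` remains available for string inputs, while message-list inputs behave like the HF pipeline by default.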
Force-pushed from 34e5dfd to 4d3783f
Force-pushed from 8f82c7d to 75f7e8e
Force-pushed from 418a10c to a4f4158
Force-pushed from 82b4ab6 to 50cd916
CVS-157276