Replies: 2 comments
---
Proper CLI prompting is tricky, but once you get it, it's like riding a bike. Here's an example to ensure the prompt stays in the format the model expects.
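A minimal sketch for your Starling setup, assuming `./main` from llama.cpp (the model path and context size are just placeholders, adjust to taste):

```sh
# stop generating when the model closes its turn, then wrap each user
# input in the "GPT4 Correct" template
./main -m Starling-LM-7B-beta-Q5_K_M.gguf -c 4096 \
  --interactive-first \
  --reverse-prompt "<|end_of_turn|>" \
  --in-prefix "GPT4 Correct User: " \
  --in-suffix "<|end_of_turn|>GPT4 Correct Assistant:"
```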
You can also add the assistant's first reply to the initial prompt (`-p`), so the conversation template is already established before your first turn:
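For example (the seeded greeting text is just an illustration):

```sh
# -p pre-loads one complete exchange; --interactive-first then hands
# control to you before any new text is generated
./main -m Starling-LM-7B-beta-Q5_K_M.gguf -c 4096 \
  -p "GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi, how can I help you today?<|end_of_turn|>" \
  --interactive-first \
  --reverse-prompt "<|end_of_turn|>" \
  --in-prefix "GPT4 Correct User: " \
  --in-suffix "<|end_of_turn|>GPT4 Correct Assistant:"
```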
And here's an example for Mistral Instruct, which wraps user turns in `[INST] ... [/INST]`:
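Again a sketch, with an assumed model filename. Mistral Instruct ends its turns with EOS, which already returns control to you in interactive mode, so no reverse prompt is needed:

```sh
# the leading spaces in the prefix/suffix match the template's spacing;
# BOS is added automatically by main
./main -m mistral-7b-instruct-v0.2.Q5_K_M.gguf -c 4096 \
  --interactive-first \
  --in-prefix " [INST] " \
  --in-suffix " [/INST]"
```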
---
I really appreciate your help @Jeximo, I will definitely be trying those. Seems like in some of the examples you basically provide the first "Hello" from the assistant too. That's clever. It reminds me of those online web UIs where the assistant says "How can I help you?" before the chat has even started. Thank you.
---
Hello,
Thank you all for your hard work!
I've run into this issue with several models where it's not completely clear how to prompt them properly in llama.cpp.
It feels a bit hacky, and I was wondering if there is a better way to go about this?
For example, ChatML models now work properly with the `--chatml` flag. But as far as I'm aware there aren't any flags for other models (or maybe there are some?), for the CLI that is.
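For a ChatML model that one flag is all it takes, e.g. (model name just for illustration):

```sh
# --chatml wraps input in the <|im_start|>/<|im_end|> template
./main -m openhermes-2.5-mistral-7b.Q5_K_M.gguf --chatml
```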
While looking at main/README.md it becomes clear that we can use `--interactive`, `--reverse-prompt`, `--in-prefix`, and `--in-suffix`.
I'll give an example where this feels a bit hacky:
For Starling-LM-7B-beta-Q5_K_M.gguf the prompt should be as follows for multiturn conversations:
"GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:"
And what I'm using is this:
--interactive --interactive-first --reverse-prompt "<|end_of_turn|>" --in-prefix "GPT4 Correct User: " --in-suffix "<|end_of_turn|>GPT4 Correct Assistant:"
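Written out as a full command (model path added for completeness):

```sh
./main -m Starling-LM-7B-beta-Q5_K_M.gguf \
  --interactive --interactive-first \
  --reverse-prompt "<|end_of_turn|>" \
  --in-prefix "GPT4 Correct User: " \
  --in-suffix "<|end_of_turn|>GPT4 Correct Assistant:"
```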
And it kinda works but is this really the best way to go about it?
Because:
1. `<|end_of_turn|>` is needed in the suffix for the first prompt, but after that it's automatically provided by the output of the model. So I can either remove it, in which case the first prompt isn't correct, or keep it, in which case it's incorrectly printed twice from the second turn on.
2. According to the readme, `--in-prefix` is primarily used to "insert a space after the reverse prompt", so putting "GPT4 Correct User: " there seems less than ideal. However, there is no other place to put it.
3. After entering a prompt, a new line is added, but that new line is not present in the prompt example from the model card. This could potentially impact the output.
4. Instruct mode can't be used here because, according to the readme, it applies the Alpaca template.
I've also looked at `llama_chat_apply_template`, but this seems to apply to the server only.
Keep in mind, this is just an example with 1 model. The issues are somewhat different for other models.
Any help or clarity here would be greatly appreciated.
Thank you.