Replies: 2 comments
---
Proper CLI prompting is tricky, but once you get it, it's like riding a bike. Here's an example to ensure the prompt stays in the format the model expects.
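A minimal sketch for your Starling setup, assuming `./main` from llama.cpp (the model path and context size are just placeholders, adjust to taste):

```sh
# stop generating when the model closes its turn, then wrap each user
# input in the "GPT4 Correct" template
./main -m Starling-LM-7B-beta-Q5_K_M.gguf -c 4096 \
  --interactive-first \
  --reverse-prompt "<|end_of_turn|>" \
  --in-prefix "GPT4 Correct User: " \
  --in-suffix "<|end_of_turn|>GPT4 Correct Assistant:"
```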
You can also add the assistant's first reply to the initial prompt (`-p`), so the conversation template is already established before your first turn:
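For example (the seeded greeting text is just an illustration):

```sh
# -p pre-loads one complete exchange; --interactive-first then hands
# control to you before any new text is generated
./main -m Starling-LM-7B-beta-Q5_K_M.gguf -c 4096 \
  -p "GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi, how can I help you today?<|end_of_turn|>" \
  --interactive-first \
  --reverse-prompt "<|end_of_turn|>" \
  --in-prefix "GPT4 Correct User: " \
  --in-suffix "<|end_of_turn|>GPT4 Correct Assistant:"
```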
And here's an example for Mistral Instruct, which wraps user turns in `[INST] ... [/INST]`:
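Again a sketch, with an assumed model filename. Mistral Instruct ends its turns with EOS, which already returns control to you in interactive mode, so no reverse prompt is needed:

```sh
# the leading spaces in the prefix/suffix match the template's spacing;
# BOS is added automatically by main
./main -m mistral-7b-instruct-v0.2.Q5_K_M.gguf -c 4096 \
  --interactive-first \
  --in-prefix " [INST] " \
  --in-suffix " [/INST]"
```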
---
I really appreciate your help @Jeximo, I will definitely be trying those. Seems like in some of the examples you basically provide the first "Hello" from the assistant too. That's clever. It reminds me of those online web UIs where the assistant says "How can I help you?" before the chat has even started. Thank you.
---
Hello,
Thank you all for your hard work!
I've run into this issue with several models where it's not completely clear how to prompt them properly in llama.cpp.
It feels a bit hacky, and I was wondering if there is a better way to go about this?
For example, ChatML models now work properly with the `--chatml` flag. But as far as I'm aware there aren't any flags for other models (or maybe there are some?), for the CLI that is.
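For a ChatML model that one flag is all it takes, e.g. (model name just for illustration):

```sh
# --chatml wraps input in the <|im_start|>/<|im_end|> template
./main -m openhermes-2.5-mistral-7b.Q5_K_M.gguf --chatml
```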
While looking at main/README.md it becomes clear that we can use `--interactive`, `--reverse-prompt`, `--in-prefix`, and `--in-suffix`.
I'll give an example where this feels a bit hacky:
For Starling-LM-7B-beta-Q5_K_M.gguf the prompt should be as follows for multiturn conversations:
"GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:"
And what I'm using is this:
--interactive --interactive-first --reverse-prompt "<|end_of_turn|>" --in-prefix "GPT4 Correct User: " --in-suffix "<|end_of_turn|>GPT4 Correct Assistant:"
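Written out as a full command (model path added for completeness):

```sh
./main -m Starling-LM-7B-beta-Q5_K_M.gguf \
  --interactive --interactive-first \
  --reverse-prompt "<|end_of_turn|>" \
  --in-prefix "GPT4 Correct User: " \
  --in-suffix "<|end_of_turn|>GPT4 Correct Assistant:"
```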
And it kinda works but is this really the best way to go about it?
Because:
1. `<|end_of_turn|>` is needed in the suffix for the first prompt, but after that it's automatically provided by the output of the model. So I can either remove it, in which case the first prompt isn't correct, or keep it, in which case it's incorrectly printed twice from the second turn on.
2. According to the readme, `--in-prefix` is primarily used to "insert a space after the reverse prompt", so putting "GPT4 Correct User: " there seems less than ideal. However, there is no other place to put it.
3. After entering a prompt, a new line is added, but that new line is not present in the prompt example from the model card. This could potentially impact the output.
4. Instruct mode can't be used here because, according to the readme, it applies the Alpaca template.
I've also looked at `llama_chat_apply_template`, but this seems to apply to the server only.
Keep in mind, this is just an example with 1 model. The issues are somewhat different for other models.
Any help or clarity here would be greatly appreciated.
Thank you.