-
Notifications
You must be signed in to change notification settings - Fork 311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrate KBNF grammar #815
base: master
Are you sure you want to change the base?
Conversation
Code Metrics Report=============================================================================== Language Files Lines Code Comments Blanks =============================================================================== C Header 2 35 28 0 7 Dockerfile 1 34 25 0 9 Happy 1 442 369 0 73 JSON 12 105 104 0 1 Python 52 2268 1930 69 269 TOML 20 625 559 2 64 YAML 2 21 19 2 0 ------------------------------------------------------------------------------- Jupyter Notebooks 4 0 0 0 0 |- Markdown 2 77 32 31 14 |- Python 2 196 169 1 26 (Total) 273 201 32 40 ------------------------------------------------------------------------------- Markdown 38 2759 0 2093 666 |- BASH 6 103 100 0 3 |- JSON 1 12 12 0 0 |- Python 5 92 82 0 10 |- Rust 9 322 274 0 48 |- TOML 2 75 63 0 12 (Total) 3363 531 2093 739 ------------------------------------------------------------------------------- Rust 260 75643 68177 1547 5919 |- Markdown 123 1217 25 1117 75 (Total) 76860 68202 2664 5994 =============================================================================== Total 393 81932 71211 3713 7008 =============================================================================== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dan-wanna-M I was wondering if you could perhaps review some of this code and take a look at some of the review comments I made?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the grammars here correct? Perhaps we can have a simpler example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They look mostly correct to me; I have added some comments that can make the grammar more familiar and/or simpler to general users. In terms of simpler example, I think we can use something within regular grammar's capacity but can be more clearly written in kbnf. I think markdown list and the kbnf for a concrete json schema both qualify. Which one do you prefer, or you would like some examples in other areas?
Ok(second_sampled) | ||
} | ||
KbnfGrammarBias::FinishedGeneration => { | ||
todo!() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Need to mark the sequence as done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if this file generally looks correct? It is paired with the sampling routines in sampling.rs
and initialized in engine/mod.rs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EricLBuehler I have reviewed the PR and suggested some changes.
(* JSON Grammar *) | ||
|
||
(* JSON text must contain a single JSON value *) | ||
start = value ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be start ::= value;
| "null" ; | ||
|
||
(* A JSON object is a collection of key/value pairs enclosed in curly braces *) | ||
object ::= "{" [ members ] "}" ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use both [members]
and members?
to express optional nonterminal
|
||
(* A JSON object is a collection of key/value pairs enclosed in curly braces *) | ||
object ::= "{" [ members ] "}" ; | ||
members ::= pair { "," pair } ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use both {members}
and members*
to express 0 or infinite repetition.
elements ::= value { "," value } ; | ||
|
||
(* A JSON string is a sequence of Unicode characters enclosed in double quotes *) | ||
string ::= "\"" { character } "\"" ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use regex in kbnf, see https://docs.rs/kbnf/latest/kbnf/#regular-expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They look mostly correct to me; I have added some comments that can make the grammar more familiar and/or simpler to general users. In terms of simpler example, I think we can use something within regular grammar's capacity but can be more clearly written in kbnf. I think markdown list and the kbnf for a concrete json schema both qualify. Which one do you prefer, or you would like some examples in other areas?
|
||
client = OpenAI(api_key="foobar", base_url="http://localhost:1234/v1/") | ||
|
||
JSON_KBNF = ''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may also want to take a look at https://github.com/Dan-wanna-M/formatron/blob/master/src/formatron/formats/json.py where I put a json grammar definition. It also contains whitespace pattern that makes llm generate better json.
|
||
impl KbnfGrammar { | ||
pub fn new(grammar: &str, tokenizer: &Tokenizer) -> Result<Self> { | ||
let tokenizer_vocab = tokenizer.get_vocab(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we still need to "unpreprocess" the vocabulary or the control ASCII characters(like \n
) and all non-ASCII characters will not be recognized correctly.
let mut bias = vec![0f32; self.vocab_size]; | ||
match self.engine.mask_logits(&mut bias) { | ||
Ok(()) => { | ||
let new_logits = (logits.to_device(&Device::Cpu)?.to_dtype(DType::F32)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might not be a good idea to move logits back to CPU due to potential CUDA synchronization issues. I think we can benchmark this to check latency. If it does bring too much latency, we can follow a strategy like this file which essentially only moves the indices tensor(tends to be smaller than the whole logits) to GPU and operate there.
} | ||
} | ||
|
||
/// Add a token, also to the trie. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically speaking the cache is a hashmap, not a trie.
No description provided.