
Sampler Expansion - GBNF Grammar #54

Closed
ianmarmour opened this issue Mar 3, 2024 · 4 comments · Fixed by #57
Labels
enhancement New feature or request

Comments

@ianmarmour
Contributor

Currently the Sampler does not actually support sampling with a grammar; what's currently in place seems to be a placeholder and is not actually used during sampling. I'm not sure if this fits directly in with #38, but it would be another nice improvement. There are a few different options for implementing this: we could take a dependency on the llama.cpp parser, or we could roll a custom GBNF parser. I'm not sure what would be ideal here.
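For context, GBNF is llama.cpp's BNF-like grammar format for constraining generation. A minimal grammar (illustrative only, not tied to this crate) looks something like:

```
# Constrain output to {"answer": "yes"} or {"answer": "no"}
root   ::= "{" ws "\"answer\":" ws answer ws "}"
answer ::= "\"yes\"" | "\"no\""
ws     ::= [ \t\n]*
```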

@pedro-devv
Contributor

It's probably better to use a custom parser, as the llama.cpp parser does not have a C API, only C++. I don't think it fits #38, but it should be easy to further expand sampling after that is solved.

@pedro-devv pedro-devv added the enhancement New feature or request label Mar 4, 2024
@nkoppel
Contributor

nkoppel commented Mar 4, 2024

@pedro-devv I think that this will fit #50, and this functionality can be implemented as a new SamplerStage. I would probably design it to hold an opaque object that stores the parsed grammar and the current state.
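A rough sketch of what that stage could look like. All names here (`SamplerStage`, `Grammar`, `GrammarStage`, `allowed`) are illustrative stand-ins, not the crate's actual API, and the grammar check is stubbed out; a real implementation would wrap llama.cpp's opaque grammar handle.

```rust
/// Stand-in for an opaque parsed-grammar handle. In practice this would
/// wrap the parsed GBNF rules plus the grammar's current parse state.
struct Grammar {
    state: usize,
}

/// Stand-in for the crate's sampler-stage trait: each stage mutates the
/// logits in place before token selection.
trait SamplerStage {
    fn apply(&mut self, logits: &mut [f32]);
}

/// A stage that masks logits for tokens the grammar cannot accept.
struct GrammarStage {
    grammar: Grammar,
}

impl SamplerStage for GrammarStage {
    fn apply(&mut self, logits: &mut [f32]) {
        // A real implementation would call into llama.cpp here; this
        // sketch just sets disallowed tokens to -inf so they are never picked.
        for (token_id, logit) in logits.iter_mut().enumerate() {
            if !allowed(&self.grammar, token_id) {
                *logit = f32::NEG_INFINITY;
            }
        }
    }
}

/// Placeholder legality check; the real check would walk the grammar state.
fn allowed(g: &Grammar, token_id: usize) -> bool {
    token_id % 2 == g.state % 2
}
```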

@ianmarmour
Contributor Author

For what it's worth, I got a fairly rough implementation of this working yesterday. That being said, I'm fairly certain a new SamplerStage alone isn't sufficient, as there's an accept_token function that needs to be called with the result of token selection to feed the selected token back into the grammar. Anecdotally, if this doesn't happen, we just loop and select the same tokens repeatedly. So to get this right we need the following:

  1. During sampling, call llama_sample_grammar; this could probably be done in a SamplerStage.
  2. After token selection, call llama_grammar_accept_token.

I'm assuming it makes the most sense to store the opaque grammar object in an Option attached to the Sampler, pass it through to the sampler stages when applicable, and call accept_token after token selection within the sample method when applicable.
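The two-step flow above could be sketched roughly as follows. All names (`Sampler`, `Grammar`, `accept_token`, `sample`) are hypothetical stand-ins for the crate's real types, the grammar masking step is elided, and arg-max stands in for the actual token-selection logic.

```rust
/// Stand-in for the opaque grammar object; `accepted` stands in for the
/// grammar's internal parse state.
struct Grammar {
    accepted: Vec<u32>,
}

impl Grammar {
    /// Step 2: advance the grammar with the selected token (corresponds
    /// to llama_grammar_accept_token in llama.cpp).
    fn accept_token(&mut self, token: u32) {
        self.accepted.push(token);
    }
}

/// Sampler holding the grammar in an Option, as suggested above.
struct Sampler {
    grammar: Option<Grammar>,
}

impl Sampler {
    fn sample(&mut self, logits: &[f32]) -> u32 {
        // Step 1 (elided): constrain logits with the grammar when one is
        // attached (corresponds to llama_sample_grammar).
        // Trivial selection rule for illustration: pick the arg-max token.
        let token = logits
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
            .map(|(i, _)| i as u32)
            .unwrap();
        // Step 2: feed the selection back into the grammar, if present.
        if let Some(g) = self.grammar.as_mut() {
            g.accept_token(token);
        }
        token
    }
}
```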

@nkoppel
Contributor

nkoppel commented Mar 6, 2024

Well, sampler stages have access to the tokens in the current context. We should be able to handle this entirely within a SamplerStage if we call llama_grammar_accept_token with the last token in context if it exists. There may still be some complexities with this implementation, such as how to initialize the grammar's state, how to clone that SamplerStage, etc.
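A sketch of that stage-only approach, under the assumption that exactly one token is appended to the context per sampling step. Types and names are illustrative stand-ins; the actual grammar calls are stubbed out.

```rust
/// Grammar-constrained stage that derives the accept_token call from the
/// context, so the Sampler itself needs no changes. `seen` stands in for
/// how far the grammar's parse state has advanced.
struct GrammarStage {
    seen: Vec<u32>,
}

impl GrammarStage {
    /// Called with the full token context each sampling step. If a token
    /// was appended since the last call, feed it to the grammar first,
    /// then apply the grammar constraint to the logits.
    fn apply(&mut self, context: &[u32], logits: &mut [f32]) {
        if let Some(&last) = context.last() {
            if self.seen.len() < context.len() {
                // Stands in for llama_grammar_accept_token(last); assumes
                // at most one new token arrived since the previous call.
                self.seen.push(last);
            }
        }
        // Grammar masking (llama_sample_grammar) would go here.
        let _ = logits;
    }
}
```

This keeps the grammar state inside the stage, which raises exactly the complexities mentioned above: the stage becomes stateful, so initialization and cloning need care.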

@pedro-devv pedro-devv linked a pull request Mar 12, 2024 that will close this issue