Refactor the llama.cpp interface #1298
Conversation
Force-pushed from cd70a83 to c3951ae
Force-pushed from dec4bc0 to ea4454c
@torymur this is ready for review
outlines/generate/__init__.py (Outdated)
```python
        regex_string, model.tokenizer
    )
else:
    self.output_description = None
```
You're only manipulating `self.output_description` here and don't assign `self.model = model`. Fully redefining the class attributes in `__init__` could be okay, but for this kind of partial redefinition in a `@dataclass` I see people suggest `__post_init__`, which is called after the generated `__init__`. But I'm not sure here.
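Something like this, roughly (a sketch only: the class and field names are made up, and the `RegexLogitsProcessor` import path is my assumption):

```python
from dataclasses import dataclass, field
from typing import Any, Optional

from outlines.processors import RegexLogitsProcessor  # import path assumed


@dataclass
class SequenceGenerator:  # class and field names are hypothetical
    model: Any
    output_type: Optional[Any] = None
    # Derived attribute: not passed by the caller, filled in below.
    output_description: Optional[Any] = field(init=False, default=None)

    def __post_init__(self):
        # Runs after the generated __init__ has assigned model and
        # output_type, so only the derived attribute is computed here.
        if self.output_type is not None:
            regex_string = self.output_type.to_regex()
            self.output_description = RegexLogitsProcessor(
                regex_string, self.model.tokenizer
            )
```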
Overall, it took me a second to understand this fork: `APIModel` <-> output type, `Model` <-> processor. I can't quite put my finger on a proper design idea, but the fact that we have two neatly matching pairs suggests good potential for separating them? Or not, if unifying them is what this is for.
There is also a possible simplification of the code, but it doesn't help convey what's happening (your example is much clearer on the why), so I'm leaving it here just in case:
```python
if isinstance(model, get_args(APIModel)) or output_type is None:
    self.output_description = output_type
else:
    regex_string = output_type.to_regex()
    self.output_description = RegexLogitsProcessor(regex_string, model.tokenizer)
```
Yeah, I agree, maintaining a single interface for API-based models and local models is annoying.

One possible way out of this design issue is to have the models build and hold the `LogitsProcessor`, but that does feel unintuitive. The other solution is to add a `Generator` building function that dispatches to an `APIGenerator` and a `LogitsProcessorGenerator`, but I feel the code duplication is not necessarily justified. It might be more a case of documenting the behavior better?

I'll see if I can come up with something better.
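Roughly what I have in mind (a sketch only; the stub classes below stand in for the real ones, and in outlines `APIModel` would be the existing `Union` alias):

```python
from typing import Any, Optional, Union, get_args


# Hypothetical stand-ins for the real model classes.
class OpenAI: ...
class AzureOpenAI: ...

APIModel = Union[OpenAI, AzureOpenAI]


class APIGenerator:
    """Generator for API-based models; keeps the output type as-is."""
    def __init__(self, model: Any, output_type: Optional[Any] = None):
        self.model = model
        self.output_description = output_type


class LogitsProcessorGenerator:
    """Generator for local models; the real implementation would compile
    output_type into a RegexLogitsProcessor here."""
    def __init__(self, model: Any, output_type: Optional[Any] = None):
        self.model = model
        self.output_type = output_type


def Generator(model: Any, output_type: Optional[Any] = None):
    # Dispatch on the model type instead of branching inside one class.
    if isinstance(model, get_args(APIModel)):
        return APIGenerator(model, output_type)
    return LogitsProcessorGenerator(model, output_type)
```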
> The other solution is to add a `Generator` building function that dispatches to an `APIGenerator` and a `LogitsProcessorGenerator`, but I feel the duplication of code is not necessarily justified.
Actually, this might be enough. I'd vote for duplication like this that makes the design more self-descriptive, rather than pushing it up a level into clarifications in the docs.
I made the change. It actually doesn't amount to that much duplication, and it makes the code much easier to follow.
It was worth it, crystal clear now 🔥
Sorry for the delay!
Nice to see the `Generator` in the docs & in the code 🔥 Left some minor comments, but overall LGTM! 🚀
Force-pushed from 1eca31e to d6d56dc
In this PR I refactor the interface to the models provided by `llama-cpp-python` for the release of Outlines' v1.0. A few notable changes:

- Removed the `lora` method, as it has been deprecated in `llama-cpp-python`.
- Added a `from_pretrained` classmethod that takes a repo name and filename.

TODO:

- See if the class can inherit from `Llama`; this would make `llama-cpp-python` a hard dependency.
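As a rough illustration of the new classmethod (the wrapper class name and its constructor are placeholders, not the actual implementation; `Llama.from_pretrained` is `llama-cpp-python`'s own classmethod):

```python
from llama_cpp import Llama


class LlamaCpp:  # wrapper class name is a placeholder
    def __init__(self, model: Llama):
        self.model = model

    @classmethod
    def from_pretrained(cls, repo_id: str, filename: str, **llama_kwargs):
        # Delegates to llama-cpp-python's Llama.from_pretrained, which
        # downloads the GGUF file from the Hugging Face Hub.
        model = Llama.from_pretrained(
            repo_id=repo_id, filename=filename, **llama_kwargs
        )
        return cls(model)


# Illustrative usage (repo and filename are examples only):
# model = LlamaCpp.from_pretrained(
#     "TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
#     "mistral-7b-instruct-v0.2.Q4_K_M.gguf",
# )
```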