
Refactor the llama.cpp interface #1298

Merged 10 commits into dottxt-ai:v1.0 on Dec 16, 2024

Conversation

@rlouf rlouf (Member) commented Nov 29, 2024

In this PR I refactor the interface to the models provided by llama-cpp-python for the release of Outlines' v1.0. A few notable changes:

  • I removed the lora method as it has been deprecated in llama-cpp-python
  • We follow the init API of the library, with the default requiring a model path, and a from_pretrained classmethod that takes a repo name and filename.
  • We don't try to normalize the arguments, either at initialization or at text generation. Users can pass the same arguments they would pass to the original model.
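To illustrate the two entry points described above, here is a minimal, hypothetical sketch; the class name and attribute handling are assumptions for illustration, not the actual Outlines implementation (the real class would wrap `llama_cpp.Llama` and `llama_cpp.Llama.from_pretrained`):

```python
class LlamaCpp:
    """Hypothetical sketch of the interface described above."""

    def __init__(self, model_path: str, **llama_kwargs):
        # Keyword arguments are forwarded verbatim, mirroring the
        # "no argument normalization" point above.
        self.model_path = model_path
        self.llama_kwargs = llama_kwargs

    @classmethod
    def from_pretrained(cls, repo_id: str, filename: str, **llama_kwargs):
        # The real integration would delegate to
        # llama_cpp.Llama.from_pretrained(repo_id=..., filename=...);
        # here we only record the identifiers to stay self-contained.
        return cls(model_path=f"{repo_id}:{filename}", **llama_kwargs)
```

Users would then write `LlamaCpp("./model.gguf", n_ctx=4096)` or `LlamaCpp.from_pretrained("org/repo", "model.gguf")`, exactly as with the underlying library.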

TODO

  • Update the tests
  • See whether making the class inherit from Llama would turn llama-cpp-python into a hard dependency
  • Update the LlamaCpp documentation
  • Update the OpenAI documentation
  • Update the Gemini documentation
  • Update the Anthropic documentation

@rlouf rlouf added this to the 1.0 milestone Nov 29, 2024
@rlouf rlouf added the enhancement and llama.cpp (Related to the `llama.cpp` integration) labels Nov 29, 2024
@rlouf rlouf force-pushed the refactor-llamacpp branch 5 times, most recently from cd70a83 to c3951ae on November 29, 2024 21:34
@rlouf rlouf requested a review from torymur November 30, 2024 11:31
@rlouf rlouf linked an issue Nov 30, 2024 that may be closed by this pull request
@rlouf rlouf force-pushed the refactor-llamacpp branch 2 times, most recently from dec4bc0 to ea4454c on December 2, 2024 12:58
@rlouf rlouf (Member Author) commented Dec 2, 2024

@torymur this is ready for review

@rlouf rlouf marked this pull request as ready for review December 2, 2024 17:31
        regex_string, model.tokenizer
    )
else:
    self.output_description = None
@torymur torymur (Contributor) commented Dec 6, 2024
You only manipulate self.output_description and don't assign self.model = model, which would fully redefine the class attributes in __init__ and could be okay. But for this kind of partial redefinition in the @dataclass case, I see people suggest __post_init__, which is called after __init__? I'm not sure here, though.
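For context, here is a generic illustration of the __post_init__ hook mentioned above (all names are placeholders, not taken from this PR): dataclasses call __post_init__ automatically after the generated __init__, which is where derived attributes can be computed.

```python
from dataclasses import dataclass, field

@dataclass
class Example:
    model: object
    output_type: object = None
    # Derived attribute, excluded from the generated __init__ signature.
    output_description: object = field(init=False, default=None)

    def __post_init__(self):
        # Runs automatically after the dataclass-generated __init__.
        if self.output_type is not None:
            self.output_description = f"derived from {self.output_type}"
```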

Overall, it took me a second to understand this fork: APIModel <-> output type, Model <-> processor. I can't quite put my finger on the proper design idea, but the fact that we have two well-matched pairs suggests good potential for separation? Or not, if unifying them is what this is for.

There is also a possible simplification of the code, but it doesn't help convey what's happening (your example is much clearer on the why), so I'm leaving it here just in case:

if isinstance(model, get_args(APIModel)) or output_type is None:
    self.output_description = output_type
else:
    regex_string = output_type.to_regex()
    self.output_description = RegexLogitsProcessor(regex_string, model.tokenizer)

rlouf (Member Author) replied:

Yeah, I agree: maintaining a single interface for API-based models and local models is annoying.

One possible way out of this design issue is to have the models build and hold the LogitsProcessor, but that feels unintuitive. The other solution is to add a Generator building function that dispatches to an APIGenerator and a LogitsProcessorGenerator, but I feel the resulting code duplication is not necessarily justified. It might be more a case of documenting the behavior better?

I'll see if I can come up with something better.
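As a sketch of the dispatch idea floated above (the names Generator, APIGenerator, and LogitsProcessorGenerator come from the comment; the bodies and the API_MODELS placeholder are assumptions, not the PR's actual code):

```python
class APIGenerator:
    """For API-based models: pass the output type through unchanged."""
    def __init__(self, model, output_type=None):
        self.model = model
        self.output_description = output_type


class LogitsProcessorGenerator:
    """For local models: compile the output type into a logits processor."""
    def __init__(self, model, output_type=None):
        self.model = model
        if output_type is None:
            self.output_description = None
        else:
            # The real code would build a RegexLogitsProcessor from
            # output_type.to_regex() and model.tokenizer.
            self.output_description = ("processor", output_type)


API_MODELS = (str,)  # placeholder for the APIModel union used in the PR


def Generator(model, output_type=None):
    """Dispatch to the generator matching the model's kind."""
    if isinstance(model, API_MODELS):
        return APIGenerator(model, output_type)
    return LogitsProcessorGenerator(model, output_type)
```

Each branch then owns its own, simpler constructor instead of one __init__ handling both cases.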

torymur (Contributor) replied:

The other solution is to add a Generator building function that dispatches to an APIGenerator and a LogitsProcessorGenerator, but I feel the resulting code duplication is not necessarily justified.

Actually, this might be enough. I'd vote for duplication like this if it makes the design more self-descriptive, rather than pushing the clarification up into the docs.

rlouf (Member Author) replied:

I made the change. It actually doesn't look like much duplication, and it makes the code much easier to follow.

torymur (Contributor) replied:

It was worth it, crystal clear now 🔥

@torymur torymur (Contributor) left a review:

Sorry for the delay!
Nice to see The Generator in the docs & in the code 🔥 Left some minor comments, but overall LGTM! 🚀

@rlouf rlouf force-pushed the refactor-llamacpp branch from 1eca31e to d6d56dc on December 16, 2024 12:38
@rlouf rlouf requested a review from torymur December 16, 2024 12:45
@rlouf rlouf merged commit b352c62 into dottxt-ai:v1.0 Dec 16, 2024
4 of 5 checks passed
@rlouf rlouf deleted the refactor-llamacpp branch December 16, 2024 15:09
Labels
enhancement, llama.cpp (Related to the `llama.cpp` integration)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update the llama.cpp integration
2 participants