[FEAT] Define model for Embedding #276
Comments
We're refactoring how LLMs work and separating generation/completion from embeddings, which will address the above. We'll be releasing this in the new year.
Other inference providers are now supported via a proxy such as LiteLLM.
Would it be possible to let users define the model name they'd like to use, rather than being limited to "gpt-4o-mini" and OpenAI's embeddings, without needing a proxy server like LiteLLM? Many inference backends, such as Ollama, are compatible with OpenAI's chat and embeddings endpoints, but because the model name cannot be changed, we can't take advantage of that compatibility. As you mentioned, we could use LiteLLM as a proxy server to reroute requests and overwrite the model name, but this approach adds significant complexity. I'm currently exploring long-term memory integration for my project, Open-LLM-VTuber, and setting up LiteLLM with rerouting for both the LLM and embeddings would be a serious challenge for many of my users. Given how beneficial this would be, I strongly recommend allowing users to set a custom model name for both the LLM and embeddings.
There are data privacy, regulatory, and sovereignty issues associated with using OpenAI embeddings.
Problem
I am using LocalAI with Zep.
I can define the model for the LLM itself, but I also need to define the model for embeddings, because it currently appears to be hardcoded to `text-embedding-ada-002`.
Possible solution
Add a `model` key to the embeddings options and use it when requesting embeddings.
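To illustrate the idea, here is a minimal Python sketch of an embeddings client whose model name comes from configuration instead of being hardcoded. It is not Zep's code or configuration schema: `EmbeddingsOptions`, `build_embeddings_request`, and `embed` are hypothetical names, and the local base URL and model name in the usage note are placeholders. The only thing taken as given is the OpenAI-style `/embeddings` request shape (`model` plus `input`), which OpenAI-compatible backends are expected to accept.

```python
import json
from dataclasses import dataclass
from urllib import request


@dataclass
class EmbeddingsOptions:
    """Hypothetical options object; field names are illustrative, not Zep's real keys."""
    base_url: str = "https://api.openai.com/v1"
    model: str = "text-embedding-ada-002"  # default mirrors the currently hardcoded value
    api_key: str = ""


def build_embeddings_request(opts: EmbeddingsOptions, texts: list[str]) -> request.Request:
    """Build an OpenAI-style embeddings request using the configured model name."""
    payload = json.dumps({"model": opts.model, "input": texts}).encode("utf-8")
    return request.Request(
        url=opts.base_url.rstrip("/") + "/embeddings",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {opts.api_key}",
        },
        method="POST",
    )


def embed(opts: EmbeddingsOptions, texts: list[str]) -> list[list[float]]:
    """Send the request and return one embedding vector per input text."""
    with request.urlopen(build_embeddings_request(opts, texts)) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]
```

With something like this, pointing at a local OpenAI-compatible server would only require different options, for example `EmbeddingsOptions(base_url="http://localhost:8080/v1", model="my-local-model")`, where both values are assumptions that depend on the local setup.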