Changed default local model to nomic #1943
Conversation
backend/Dockerfile.model_server
@@ -21,10 +21,12 @@ RUN apt-get remove -y --allow-remove-essential perl-base && \
RUN python -c "from transformers import AutoModel, AutoTokenizer, TFDistilBertForSequenceClassification; \
from huggingface_hub import snapshot_download; \
AutoTokenizer.from_pretrained('danswer/intent-model'); \
AutoTokenizer.from_pretrained('intfloat/e5-base-v2'); \
AutoTokenizer.from_pretrained('nomic-ai/nomic-embed-text-v1'); \
AutoTokenizer.from_pretrained('nomic-ai/nomic-bert-2048'); \
To make this work while airgapped, you need to `.from_pretrained` and `snapshot_download` not only nomic-ai/nomic-embed-text-v1 but also nomic-ai/nomic-bert-2048.
It was hard to find the exact reasoning for this, but I'm fairly sure it's because nomic-embed-text-v1 is built on top of nomic-bert-2048 and needs to run .py scripts located only in the nomic-bert-2048 repo here
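The pre-fetch described above can be sketched as follows. This is a minimal illustration, not the PR's exact Dockerfile code; the `AIRGAP_MODELS` list and `prefetch` helper are names introduced here, and it assumes `huggingface_hub` is installed:

```python
# Sketch: pre-fetch both Hugging Face repos at build time so the model server
# can run fully airgapped. nomic-embed-text-v1 pulls custom modeling code that
# lives in the nomic-bert-2048 repo, so both repos must be in the local cache.

AIRGAP_MODELS = [
    "nomic-ai/nomic-embed-text-v1",  # the embedding model itself
    "nomic-ai/nomic-bert-2048",      # backbone repo holding the custom .py modeling files
]

def prefetch(models: list[str]) -> list[str]:
    """Download each repo into the local HF cache; return the cache paths."""
    # Lazy import so this sketch only needs huggingface_hub when actually run.
    from huggingface_hub import snapshot_download
    return [snapshot_download(m) for m in models]
```

Once both snapshots are cached, subsequent `from_pretrained` calls resolve locally without network access.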
ASYM_QUERY_PREFIX = os.environ.get("ASYM_QUERY_PREFIX", "query: ")
ASYM_PASSAGE_PREFIX = os.environ.get("ASYM_PASSAGE_PREFIX", "passage: ")
ASYM_QUERY_PREFIX = os.environ.get("ASYM_QUERY_PREFIX", "search_query: ")
ASYM_PASSAGE_PREFIX = os.environ.get("ASYM_PASSAGE_PREFIX", "search_document: ")
# Purely an optimization, memory limitation consideration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are the defaults for nomic-ai/nomic-embed-text-v1
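To illustrate how these prefixes are used: nomic-embed-text-v1 is trained with task-specific prefixes, so queries and passages get different prefixes before embedding. The helper function names below are illustrative, not from the PR:

```python
import os

# Defaults match the nomic-ai/nomic-embed-text-v1 prefixes from the diff above;
# both remain overridable via environment variables.
ASYM_QUERY_PREFIX = os.environ.get("ASYM_QUERY_PREFIX", "search_query: ")
ASYM_PASSAGE_PREFIX = os.environ.get("ASYM_PASSAGE_PREFIX", "search_document: ")

def prefix_query(text: str) -> str:
    """Prepend the asymmetric query prefix before embedding a search query."""
    return ASYM_QUERY_PREFIX + text

def prefix_passage(text: str) -> str:
    """Prepend the asymmetric passage prefix before embedding a document."""
    return ASYM_PASSAGE_PREFIX + text
```

Using the wrong prefix (or none) with an asymmetric model typically degrades retrieval quality, since the model saw these prefixes during training.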
model = SentenceTransformer(model_name)
model = SentenceTransformer(
    model_name_or_path=model_name, trust_remote_code=True
)
This is related to also needing to install nomic-bert-2048.
There is a script that has to be executed to use the model (unsure exactly when) that is located in nomic-bert-2048 and not in nomic-embed-text-v1 (a couple of .py scripts you can see here).
Not 100% sure though.
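A minimal sketch of the load path with `trust_remote_code` enabled (the function name is made up here; it assumes `sentence-transformers` is installed):

```python
def load_embedding_model(model_name: str):
    """Load a SentenceTransformer whose architecture ships as repo-hosted code.

    trust_remote_code=True lets Transformers execute the custom modeling .py
    files from the model repo (for nomic-embed-text-v1, those files live in
    nomic-ai/nomic-bert-2048). The code runs locally; no inference data is
    sent to remote servers, but the downloaded code is arbitrary Python, so
    only enable this for trusted models.
    """
    # Lazy import so the sketch only needs the library when actually called.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_name_or_path=model_name, trust_remote_code=True)
```

Without `trust_remote_code=True`, loading such a model raises an error asking you to opt in to executing the repo's custom code.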
@@ -116,8 +116,9 @@ def get_tokenizer(model_name: str | None, provider_type: str | None) -> BaseToke
    if provider_type.lower() == "openai":
        # Used across ada and text-embedding-3 models
        return _check_tokenizer_cache("openai")
    # If we are given a cloud provider_type that isn't OpenAI, we default to trying to use the model_name
    # this means we are approximating the token count which may leave some performance on the table
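The fallback described in the diff's comment can be sketched like this. This is a simplified stand-in for the repo's `get_tokenizer`, not its actual code, and it assumes `transformers` is installed:

```python
def get_fallback_tokenizer(model_name: str):
    """Approximate a cloud provider's tokenizer by loading model_name locally.

    For cloud providers other than OpenAI we fall back to whatever tokenizer
    Hugging Face resolves for model_name. The resulting token counts are an
    approximation of the provider's true tokenization, which may leave some
    performance on the table (e.g. slightly conservative chunk sizing).
    """
    # Lazy import so the sketch only needs transformers when actually called.
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(model_name)
```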
general note
snapshot_download('danswer/intent-model'); \
snapshot_download('intfloat/e5-base-v2'); \
snapshot_download('mixedbread-ai/mxbai-rerank-xsmall-v1')"
RUN python -c "from transformers import AutoTokenizer; \
It's better to combine these into a single RUN instruction: a single RUN produces a single image layer that can be cached, instead of two.
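For illustration, a consolidated version might look like the fragment below. This is a sketch of the suggestion, not the PR's final Dockerfile; the model list is taken from the diffs above:

```dockerfile
# One RUN = one cacheable layer. Tokenizer loads and snapshot downloads for
# all models happen in a single python -c invocation.
RUN python -c "from transformers import AutoTokenizer; \
    from huggingface_hub import snapshot_download; \
    AutoTokenizer.from_pretrained('danswer/intent-model'); \
    AutoTokenizer.from_pretrained('nomic-ai/nomic-embed-text-v1'); \
    AutoTokenizer.from_pretrained('nomic-ai/nomic-bert-2048'); \
    snapshot_download('danswer/intent-model'); \
    snapshot_download('nomic-ai/nomic-embed-text-v1'); \
    snapshot_download('nomic-ai/nomic-bert-2048'); \
    snapshot_download('mixedbread-ai/mxbai-rerank-xsmall-v1')"
```

If any line changes, the whole layer is rebuilt, so this trades finer-grained caching for fewer layers.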
model = SentenceTransformer(model_name)
model = SentenceTransformer(
    model_name_or_path=model_name,
    trust_remote_code=True,
I would add a comment here:
"Some model architectures that aren't built into Transformers or Sentence Transformers need their code downloaded to be loaded locally. This does not mean data is sent to remote servers for inference; however, the remote code can be fairly arbitrary, so only use trusted models."