🚀 The feature, motivation and pitch

torchchat currently uses the HF Hub for downloads, which has its own model cache, and then copies the model into its own model directory, so you end up with two copies of the same model.

We should leverage the HF Hub cache, but not force users to use that location if they're using their own models.
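A minimal sketch of the behavior being asked for, built on `huggingface_hub.snapshot_download` (the helper name `resolve_model_dir` and the `user_model_dir` argument are hypothetical, not torchchat's actual API): the snapshot is resolved through the default HF Hub cache and only an explicitly user-supplied directory bypasses it, so no second copy is created.

```python
from pathlib import Path
from typing import Optional

from huggingface_hub import snapshot_download


def resolve_model_dir(repo_id: str, user_model_dir: Optional[str] = None) -> Path:
    """Return a directory containing the model weights without duplicating them."""
    if user_model_dir is not None:
        # The user manages their own models; don't force them into the HF cache.
        return Path(user_model_dir)
    # snapshot_download reuses the default HF Hub cache (~/.cache/huggingface/hub)
    # and returns the path of the cached snapshot, so nothing is copied.
    return Path(snapshot_download(repo_id=repo_id))


# e.g. model_dir = resolve_model_dir("some-org/some-model")  # placeholder repo id
```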
Alternatives
No response
Additional context
From r/localllama:

"One annoying thing is that it uses huggingface_hub for downloading but doesn't use the HF cache - it uses its own .torchtune folder to store models so you just end up having double of full models (grr). Just use the default HF cache location."
RFC (Optional)
No response
One option is to use a Rust-based downloader. HF claims it's production-ready, but it has a slightly worse UX. The 80/20 here is that the majority of people should get much faster download speeds (super useful for LLMs), and those who run into errors could fall back to the Python implementation with a quick pip install.
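A minimal sketch, assuming the Rust-based downloader referred to is `hf_transfer`: it is opt-in via an environment variable that `huggingface_hub` reads at import time, and the plain Python downloader remains the fallback when the package isn't installed.

```python
import importlib.util
import os

# HF_HUB_ENABLE_HF_TRANSFER is read once when huggingface_hub is imported,
# so set it before the import.
if importlib.util.find_spec("hf_transfer") is not None:
    # `pip install hf_transfer` makes the Rust transfer backend available.
    os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
# Otherwise huggingface_hub falls back to its pure-Python downloader.

from huggingface_hub import snapshot_download

model_dir = snapshot_download("some-org/some-model")  # placeholder repo id
```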