
Leverage the HF cache for models #992

Open
byjlw opened this issue Aug 1, 2024 · 2 comments · May be fixed by #1285
Labels
actionable: Items in the backlog waiting for an appropriate impl/fix
enhancement: New feature or request

Comments

@byjlw
Contributor

byjlw commented Aug 1, 2024

🚀 The feature, motivation and pitch

torchchat currently uses the HF Hub, which has its own model cache, but torchchat then copies models into its own model directory, so you end up with two copies of the same model.

We should leverage the hf hub cache but not force users to use that location if they're using their own models.
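A minimal sketch of what this could look like, assuming a hypothetical `fetch_model` helper (the function name and `custom_dir` parameter are illustrative, not torchchat's actual API): when the user supplies their own model directory we leave the HF cache alone, otherwise we resolve through `huggingface_hub.snapshot_download`, which stores files under the shared HF cache (`~/.cache/huggingface/hub` by default) and reuses anything other HF tools have already downloaded.

```python
from pathlib import Path
from typing import Optional


def fetch_model(repo_id: str, custom_dir: Optional[str] = None) -> Path:
    """Resolve a model from a user-supplied directory or the shared HF cache.

    Hypothetical helper sketching the proposal; not torchchat's real API.
    """
    if custom_dir is not None:
        # User manages their own models; don't touch the HF cache.
        return Path(custom_dir)
    # Deferred import so the helper works without huggingface_hub
    # when a custom directory is supplied.
    from huggingface_hub import snapshot_download

    # snapshot_download returns the local snapshot path inside the
    # shared HF cache, reusing files already downloaded by other tools.
    return Path(snapshot_download(repo_id=repo_id))
```

With this shape, the second copy disappears for the default path, while power users keep full control via their own directory.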

Alternatives

No response

Additional context

From r/localllama
"One annoying thing is that it uses huggingface_hub for downloading but doesn't use the HF cache - it uses it's own .torchtune folder to store models so you just end up having double of full models (grr). Just use the defaul HF cache location." [quoted verbatim]

RFC (Optional)

No response

@Jack-Khuu added the enhancement (New feature or request) and actionable (Items in the backlog waiting for an appropriate impl/fix) labels on Aug 1, 2024
@orionr
Contributor

orionr commented Aug 1, 2024

Great job bringing these back as issues! Is this also a problem with torchtune given that we're using .torchtune for this? cc @kartikayk ?

@vmpuri
Contributor

vmpuri commented Oct 9, 2024

Going to add that we can use hf_transfer to "potentially double the download speed"
https://huggingface.co/docs/hub/models-downloading
https://huggingface.co/docs/huggingface_hub/v0.25.1/package_reference/environment_variables#hfhubenablehftransfer

This is an option to use a Rust-based downloader. HF claims it's production-ready, just with a slightly worse UX. The 80/20 here is that a majority of people should get much faster download speeds (super useful for LLMs), and those who run into errors can fall back to the Python implementation with a quick pip install.
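A minimal sketch of the opt-in, per the environment-variable docs linked above: install the package (`pip install hf_transfer`) and set `HF_HUB_ENABLE_HF_TRANSFER=1` before `huggingface_hub` is imported. The repo id below is a placeholder, and the download call is left commented so the snippet runs without network access.

```python
import os

# Opt in to the Rust-based hf_transfer downloader. This must be set
# before huggingface_hub is imported; it also requires
# `pip install hf_transfer`. If downloads error out, unset the
# variable to fall back to the pure-Python downloader.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

# from huggingface_hub import snapshot_download
# snapshot_download("org/model")  # transfers now go through hf_transfer
```

The fallback story is what makes this low-risk: nothing changes for users who never set the variable.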

@vmpuri vmpuri linked a pull request Oct 9, 2024 that will close this issue