
Provide a method of caching for the large wheels. #6591

Open
jfmherokiller opened this issue Dec 19, 2024 · 0 comments
Labels
enhancement New feature or request

Comments


jfmherokiller commented Dec 19, 2024

Description

Set up a download scheme (if possible) wherein the big wheels:
llama_cpp_python_cuda_tensorcores
exllamav2
llama_cpp_python_cuda
are downloaded once and then hash-checked locally instead of being redownloaded every single time (a rough sketch of one possible approach is at the end of this comment).

This should be done because, combined, these wheels are about 1 GB and pip fails to cache them, seemingly because they are installed directly from URLs hosted on GitHub.

I will admit this seems to be a general issue with pip, as discussed here: https://discuss.python.org/t/what-are-the-caching-rules-for-wheels-installed-from-urls/21594/2
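A minimal sketch of what such a local cache could look like, assuming a hypothetical cache directory and a placeholder wheel URL/SHA-256 pair; the real URLs and hashes would have to come from the project's requirements files:

```python
import hashlib
import urllib.request
from pathlib import Path

# Hypothetical cache location and wheel entry (placeholders, not real values).
CACHE_DIR = Path.home() / ".cache" / "large-wheels"
WHEEL_URL = "https://github.com/example/releases/download/v0.0.0/example_wheel.whl"
WHEEL_SHA256 = "0" * 64

def _sha256_of(path: Path) -> str:
    """Hash the file in chunks so a ~1 GB wheel is not read into memory at once."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def fetch_wheel(url: str, sha256: str) -> Path:
    """Download a wheel once; reuse the local copy while its hash still matches."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    dest = CACHE_DIR / url.rsplit("/", 1)[-1]
    if dest.exists() and _sha256_of(dest) == sha256:
        return dest  # cached copy is valid, skip the large download
    urllib.request.urlretrieve(url, dest)
    if _sha256_of(dest) != sha256:
        dest.unlink()
        raise RuntimeError(f"Hash mismatch for {url}")
    return dest
```

The cached directory could then be passed to pip via `--find-links`, or the returned file path installed directly with `pip install <path>`, so repeated installs reuse the local copy; how this would fit into the project's installer flow is an open question.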

@jfmherokiller jfmherokiller added the enhancement New feature or request label Dec 19, 2024