
🤗 A Python script for efficiently downloading and reconstructing large Hugging Face model files by splitting them into manageable chunks


sioaeko/HFModelDownloader

🤗 Hugging Face Model Downloader

A Python script for efficiently downloading and reconstructing large Hugging Face model files by splitting them into manageable chunks.


✨ Features

  • 🚀 Downloads large model files from Hugging Face in multiple parts simultaneously
  • 🔗 Automatically extracts download links from the model page
  • 🔧 Allows customization of the number of parts for splitting files
  • 🧩 Combines downloaded parts back into the original file
  • 📊 Displays download progress for each file
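
As a rough illustration of the link-extraction feature, the sketch below pulls candidate download URLs out of a page's HTML using only the standard library. It is not the script's actual code. The class name `ResolveLinkParser`, the function `extract_download_links`, and the sample HTML are hypothetical, and the filter assumes that file links on the model page contain a `/resolve/` path segment.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResolveLinkParser(HTMLParser):
    """Collect anchor hrefs that look like file download links."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Assumption: downloadable files are linked through a "/resolve/" path.
        if href and "/resolve/" in href:
            url = urljoin(self.base_url, href)
            if url not in self.links:  # keep first occurrence, skip duplicates
                self.links.append(url)


def extract_download_links(html, base_url="https://huggingface.co"):
    """Return absolute download URLs found in the given HTML text."""
    parser = ResolveLinkParser(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page fragment used only to exercise the parser.
sample_html = """
<a href="/gpt2/resolve/main/config.json">config.json</a>
<a href="/gpt2/blob/main/README.md">README.md</a>
<a href="/gpt2/resolve/main/config.json">config.json (again)</a>
"""
print(extract_download_links(sample_html))
```

Because the filter depends on how the page happens to be built, any change to the site's markup or URL scheme would require adjusting it.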

🛠 Requirements

  • Python 3.6+
  • requests library
  • tqdm library

📥 Installation

  1. Clone this repository:
git clone https://github.com/sioaeko/huggingface-split-downloader.git
cd huggingface-split-downloader
  2. Install the required dependencies:
pip install requests tqdm

🚀 Usage

Run the script with the following command:

python huggingface_split_downloader.py <model_url> <output_dir> --parts <number_of_parts>

Arguments:

  • <model_url>: The URL of the Hugging Face model page
  • <output_dir>: The directory where you want to save the downloaded files
  • --parts: (Optional) The number of parts to split each file into (default is 5)

Example:

python huggingface_split_downloader.py https://huggingface.co/gpt2 ./downloaded_model --parts 10

This command will download the GPT-2 model files, splitting each file into 10 parts, and save them in the ./downloaded_model directory.
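
For readers curious how such a command line might be parsed, here is a minimal `argparse` sketch that mirrors the documented arguments and the default of 5 parts. The function name `build_parser` and the help strings are illustrative assumptions; the script's real parser may differ.

```python
import argparse


def build_parser():
    """Build a parser matching the documented command-line interface."""
    parser = argparse.ArgumentParser(
        description="Download Hugging Face model files in parts."
    )
    parser.add_argument("model_url", help="URL of the Hugging Face model page")
    parser.add_argument("output_dir", help="Directory for the downloaded files")
    parser.add_argument(
        "--parts",
        type=int,
        default=5,
        help="Number of parts to split each file into (default: 5)",
    )
    return parser


# Parse the example command from above without touching sys.argv.
args = build_parser().parse_args(
    ["https://huggingface.co/gpt2", "./downloaded_model", "--parts", "10"]
)
print(args.model_url, args.output_dir, args.parts)
```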

🔧 How it works

  1. 🔍 Access the provided Hugging Face model page and extract download links
  2. For each file:
    • 📏 Determine the file size
    • ✂️ Split the download into the specified number of parts
    • 📥 Download each part concurrently
    • 📊 Show a progress bar for the download
  3. 🧩 Combine parts back into the original file
  4. 🗑️ Delete partial files after successful combination

πŸ“ Notes

  • 💾 Ensure you have sufficient storage space for the model files
  • 🌐 Download speed may vary depending on your internet connection
  • 🔄 The script may need adjustments if the structure of the Hugging Face website changes
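
One way to act on the storage note is to check free space before starting. The helper below is a hypothetical sketch, not part of the script. Its default `headroom` of 2.0 rests on the assumption that the partial files and the combined file exist side by side until the parts are deleted, so peak usage is about twice the file size.

```python
import shutil


def has_enough_space(output_dir, required_bytes, headroom=2.0):
    """Return True if output_dir's filesystem can hold the download.

    headroom is a multiplier on required_bytes; 2.0 assumes the parts
    and the combined file coexist briefly during reconstruction.
    """
    free = shutil.disk_usage(output_dir).free
    return free >= required_bytes * headroom


# Example: check the current directory for a 500 MB file.
print(has_enough_space(".", 500 * 1024 * 1024))
```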

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

⚠️ Disclaimer

This tool is for educational and research purposes. Always ensure you have the right to download and use the models as per Hugging Face's terms of service.
