Simple Docker image for running BitNet large language models, using the CPU for inference. Includes Llama3-8B-1.58-100B-tokens-TQ2_0.gguf by default. Requires a CPU with AVX2 support, i.e. Intel Haswell / AMD Excavator or later.
docker run --rm ghcr.io/kth8/bitnet
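Since inference is CPU-bound, you can optionally cap how many cores the container may use with Docker's standard --cpus flag (a generic Docker option, not specific to this image):
docker run --rm --cpus 4 ghcr.io/kth8/bitnet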
To pass your own arguments, enter the container with a shell:
docker run --rm -it ghcr.io/kth8/bitnet sh
python3 run_inference.py --help
python3 run_inference.py -m Llama3-8B-1.58-100B-tokens-TQ2_0.gguf <your arguments> -p "<your prompt>"
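As a sketch of a typical invocation: -n (number of tokens to generate) and -t (CPU threads) follow llama.cpp conventions and are assumptions here; verify them against the --help output above:
python3 run_inference.py -m Llama3-8B-1.58-100B-tokens-TQ2_0.gguf -n 128 -t 4 -p "What is BitNet?"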
To use your own model, mount a host directory into the container:
docker run --rm -it -v /some/host/path:/BitNet/models ghcr.io/kth8/bitnet sh
python3 run_inference.py -m models/<your model>.gguf <your arguments> -p "<your prompt>"
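For example, you might first fetch a compatible 1.58-bit GGUF into the mounted directory on the host (the repository and filename below are placeholders; substitute a real download link):
wget -P /some/host/path https://huggingface.co/<repo>/resolve/main/<your model>.gguf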
Check if your CPU supports AVX2 on Linux:
grep -o 'avx2' /proc/cpuinfo
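The command above prints one match per CPU core; for a single clear answer, a quiet-mode variant works too:
grep -q avx2 /proc/cpuinfo && echo "AVX2 supported" || echo "AVX2 not supported"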