kth8/bitnet

A simple Docker image for running BitNet large language models on the CPU. It includes Llama3-8B-1.58-100B-tokens-TQ2_0.gguf by default and requires a CPU with AVX2 support (Intel Haswell / AMD Excavator or newer). Run the default model with:

docker run --rm ghcr.io/kth8/bitnet

To pass your own arguments, enter the container with a shell:

docker run --rm -it ghcr.io/kth8/bitnet sh
python3 run_inference.py --help
python3 run_inference.py -m Llama3-8B-1.58-100B-tokens-TQ2_0.gguf <your arguments> -p "<your prompt>"

To use your own model, mount a host directory as a volume:

docker run --rm -it -v /some/host/path:/BitNet/models ghcr.io/kth8/bitnet sh
python3 run_inference.py -m models/<your model>.gguf <your arguments> -p "<your prompt>"
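For a single non-interactive run, the two steps above can be combined by overriding the image's command in one `docker run` invocation. This is a sketch: it assumes the image allows command override with `/BitNet` as the working directory, and `example.gguf` is a placeholder file name, not a model shipped with the image.

```shell
# One-shot inference with a host-mounted model (example.gguf is hypothetical);
# the container exits and is removed (--rm) when generation finishes.
docker run --rm -v /some/host/path:/BitNet/models ghcr.io/kth8/bitnet \
  python3 run_inference.py -m models/example.gguf -p "<your prompt>"
```

Run `python3 run_inference.py --help` inside the container first to confirm which flags your image's script accepts.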

Check whether your CPU supports AVX2 on Linux (no output means it is not supported):

grep -o 'avx2' /proc/cpuinfo
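The same check can be scripted, e.g. before deciding whether to pull the image. This is a minimal sketch mirroring the grep above; `has_avx2` is a hypothetical helper, not part of this repository.

```python
def has_avx2(cpuinfo_path="/proc/cpuinfo"):
    """Return True if the CPU flags in cpuinfo_path list AVX2 support.

    Hypothetical helper equivalent to `grep -o 'avx2' /proc/cpuinfo`:
    reads the file and looks for the 'avx2' flag, returning False if the
    file is missing (e.g. on non-Linux systems).
    """
    try:
        with open(cpuinfo_path) as f:
            return "avx2" in f.read()
    except OSError:
        return False
```

On an unsupported (or non-Linux) machine this simply returns `False`, so it is safe to call unconditionally.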
