This is the repo for the article *Set up a local LLM on CPU with chat UI in 15 minutes*.
The process consists of these simple steps:
- Select a model on Hugging Face, e.g. `RJuro/munin-neuralbeagle-7b`
- Quantize the model by running `quantize.py` (a sketch of what this step involves is shown after the list)
- Wrap the quantized model in an Ollama image (see the Modelfile sketch below)
- Build and run a Docker container that wraps the chat UI, e.g. Chatbot Ollama (see the example commands below)
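
For reference, a minimal sketch of what the quantization step typically involves under the hood, assuming [llama.cpp](https://github.com/ggerganov/llama.cpp) is cloned and built locally; `quantize.py` in this repo automates an equivalent pipeline. The directory names, output file names, and the `Q4_K_M` quantization type are illustrative assumptions, and the exact llama.cpp script names vary between versions:

```sh
# Download the model weights from Hugging Face
huggingface-cli download RJuro/munin-neuralbeagle-7b --local-dir munin-neuralbeagle-7b

# Convert the checkpoint to GGUF at 16-bit precision
python llama.cpp/convert_hf_to_gguf.py munin-neuralbeagle-7b \
    --outtype f16 --outfile munin-7b-f16.gguf

# Quantize down to 4-bit so the model fits comfortably in CPU RAM
llama.cpp/llama-quantize munin-7b-f16.gguf munin-7b-q4_k_m.gguf Q4_K_M
```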
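
Wrapping the result in Ollama usually comes down to a short Modelfile that points at the quantized GGUF file. A minimal sketch, reusing the file name from the previous step (the repo's actual Modelfile may differ):

```sh
# Create a Modelfile that tells Ollama which weights to load
cat > Modelfile <<'EOF'
FROM ./munin-7b-q4_k_m.gguf
EOF

# Register the model with the local Ollama server and smoke-test it
ollama create munin-7b -f Modelfile
ollama run munin-7b "Say hello in Danish."
```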
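
Finally, the chat UI. A sketch of building and running [Chatbot Ollama](https://github.com/ivanfioravanti/chatbot-ollama) in Docker; the image tag, port, and `OLLAMA_HOST` value are assumptions and may need adjusting so the container can reach the Ollama server on your host:

```sh
# Build the UI image from the Chatbot Ollama sources
git clone https://github.com/ivanfioravanti/chatbot-ollama.git
cd chatbot-ollama
docker build -t chatbot-ollama .

# Run the UI and point it at the Ollama API on the host
docker run -p 3000:3000 -e OLLAMA_HOST=http://host.docker.internal:11434 chatbot-ollama
```

The chat UI should then be reachable at http://localhost:3000.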