This is an experimental PyTorch model that can load from llama.cpp GGUF files, run eagerly, and compile to Turbine/IREE. We are using it to evaluate different approaches for interfacing with the llama.cpp ecosystem.
Set up SHARK-Turbine and its required Python environment.
```bash
python python/turbine_llamacpp/model_downloader.py --hf_model_name="openlm-research/open_llama_3b"
```
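Conceptually, the download step amounts to something like the sketch below. The `snapshot_download` call is from the `huggingface_hub` package; aside from the `downloaded_<model>` naming convention, everything here is an illustrative assumption about the script, not its actual code.

```python
# Illustrative sketch of what the download step amounts to; use
# model_downloader.py in practice. Assumes the huggingface_hub package.
from huggingface_hub import snapshot_download

hf_model_name = "openlm-research/open_llama_3b"
# Mirrors the downloader's downloaded_<name_of_your_model> convention.
local_dir = "downloaded_" + hf_model_name.split("/")[-1]

snapshot_download(repo_id=hf_model_name, local_dir=local_dir)
```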
At this point you should see a directory named `downloaded_open_llama_3b` in your working directory. In general, the downloader creates `downloaded_<name_of_your_model>`.
```bash
git clone https://github.com/ggerganov/llama.cpp
pip install gguf

# Convert HF to GGUF and quantize to q8_0.
python llama.cpp/convert.py downloaded_open_llama_3b --outfile ggml-model-q8_0.gguf --outtype q8_0
```
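You can sanity-check the converted file with the `gguf` package installed above. This snippet is a quick inspection sketch of our own, not part of the project:

```python
# Inspect the converted file with the gguf package (illustrative only).
from gguf import GGUFReader

reader = GGUFReader("ggml-model-q8_0.gguf")

# GGUF stores hyperparameters as key/value metadata fields alongside
# the tensors themselves.
print(f"{len(reader.fields)} metadata fields, {len(reader.tensors)} tensors")

# Each tensor records its name, shape, and quantization type (e.g. Q8_0).
for tensor in reader.tensors[:5]:
    print(tensor.name, tensor.shape, tensor.tensor_type.name)
```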
Running on PyTorch
```bash
python python/turbine_llamacpp/model.py --gguf_path=/path/to/ggml-model-q8_0.gguf
```
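This runs the model eagerly in PyTorch, dequantizing GGUF tensors as needed. For intuition, here is a sketch of how Q8_0 data is laid out and dequantized: each block stores one float16 scale followed by 32 int8 quants, and a weight is simply `scale * quant`. The block layout matches llama.cpp's Q8_0 format; the helper itself and its use of numpy are our own illustration, not the project's implementation.

```python
# Illustrative Q8_0 dequantization. Per llama.cpp's format, each block is
# a 2-byte float16 scale followed by 32 int8 quantized values.
import numpy as np

QK8_0 = 32  # weights per Q8_0 block

def dequantize_q8_0(raw: bytes, n_elements: int) -> np.ndarray:
    n_blocks = n_elements // QK8_0
    block_size = 2 + QK8_0  # fp16 scale + 32 int8 quants
    blocks = np.frombuffer(raw, dtype=np.uint8).reshape(n_blocks, block_size)
    scales = blocks[:, :2].copy().view(np.float16).astype(np.float32)  # (n_blocks, 1)
    quants = blocks[:, 2:].copy().view(np.int8).astype(np.float32)     # (n_blocks, 32)
    return (scales * quants).reshape(-1)  # weight = scale * quant
```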
Generating MLIR
```bash
python python/turbine_llamacpp/compile.py --gguf_path=python/turbine_llamacpp/ggml-model-q8_0.gguf
```
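From there, the usual IREE flow is to compile the generated MLIR into a deployable VM FlatBuffer. The sketch below uses the standard `iree-compiler` Python API; the file names and the CPU target choice are assumptions, not part of this project's scripts.

```python
# Illustrative follow-on step using the iree-compiler Python package.
# Input/output file names and the llvm-cpu target are assumptions.
import iree.compiler as ireec

ireec.compile_file(
    "model.mlir",                  # MLIR emitted by compile.py (assumed name)
    target_backends=["llvm-cpu"],  # compile for the host CPU
    output_file="model.vmfb",      # deployable IREE module
)
```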