MLIR LLM runner

Prerequisite

Set up SHARK-Turbine and its required Python environment.
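
Before starting, it can help to confirm that the command-line tools used in the steps below are on PATH. This is a minimal sketch, assuming a POSIX shell; `missing_tools` is a hypothetical helper name, not part of SHARK-Turbine or IREE.

```shell
# Hypothetical helper: print each named tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools invoked by the commands in this README.
missing="$(missing_tools python huggingface-cli iree-convert-parameters iree-compile)"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing >&2
fi
```

If anything is reported missing, revisit the SHARK-Turbine environment setup before continuing.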

Convert Model to IRPA

model="BEE-spoke-data/verysmol_llama-v11-KIx2"
model_name="$(basename "$model" | sed 's:-:_:g')"
iree-convert-parameters --parameters="$(huggingface-cli download  "$model")"/model.safetensors --output="$model_name".irpa
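
To see what the name derivation produces: `basename` drops the organization prefix and `sed` replaces hyphens with underscores. The sketch below uses the model ID from this README and assumes the name is meant to be derived from `$model`.

```shell
model="BEE-spoke-data/verysmol_llama-v11-KIx2"
# Strip the "BEE-spoke-data/" prefix, then replace every "-" with "_".
model_name="$(basename "$model" | sed 's:-:_:g')"
echo "$model_name"        # verysmol_llama_v11_KIx2
echo "$model_name.irpa"   # name of the parameter archive written above
```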

Generate MLIR with SHARK-Turbine

python models/turbine_models/custom_models/stateless_llama.py --compile_to=linalg  --precision="f32" --hf_model_name="$model" --external_weight_file "$model_name".irpa --external_weights="safetensors" --streaming_llm

Compile MLIR

iree-compile --iree-opt-const-eval=false --iree-hal-target-backends=llvm-cpu --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-llvmcpu-enable-ukernels=mmt4d --iree-llvmcpu-narrow-matmul-tile-bytes=16777216 --iree-global-opt-enable-quantized-matmul-reassociation=true --iree-global-opt-propagate-transposes "$model_name".mlir -o "$model_name".vmfb
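
Each step so far leaves a file that a later step consumes (`.irpa`, `.mlir`, `.vmfb`), so a quick existence check can catch a failed step before running the model. This is a minimal sketch; `require_file` is a hypothetical helper, and the file names are assumed to follow the `$model_name` pattern used in the commands above.

```shell
# Hypothetical helper: succeed only if the file exists and is non-empty.
require_file() {
    if [ ! -s "$1" ]; then
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Assumed artifact names, following the commands in this README.
model_name="${model_name:-verysmol_llama_v11_KIx2}"
for ext in irpa mlir vmfb; do
    # Stop at the first missing artifact; its name is printed to stderr.
    require_file "$model_name.$ext" || break
done
```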

Run compiled model with parameters

python mlir_llm_runner.py --model="$model_name".vmfb --parameters="$model_name".irpa --prompt 'MLIR is '

persimmonsai/mlir-llm-runner
