v0.0.16
mistral.rs integration
Handle:
mistralrs
URL: http://localhost:33951
Mistral.rs is a fast LLM inference platform supporting inference on a variety of devices, quantization, and easy-to-use application with an Open-AI API compatible HTTP server and Python bindings.
Defaults
# Spin up harbor with mistral.rs service
harbor up mistralrs
# Launch the default UI, pre-configured with mistral.rs
harbor open
Plain Models
# For "plain" models:
# Download the model to the global HF cache
harbor hf download IlyaGusev/gemma-2-2b-it-abliterated
# Set model/type/arch
harbor mistralrs model IlyaGusev/gemma-2-2b-it-abliterated
harbor mistralrs type plain
harbor mistralrs arch gemma2
# Gemma 2 doesnt't support paged attention
harbor mistralrs args --no-paged-attn
# Run with the default settings
harbor up mistralrs
GGUF Models
# Set the model type to GGUF
harbor mistralrs type gguf
# - Unset ISQ off, as it's not supported
# for GGUF models
# - For GGUFs, architecture is inferred from the file
harbor mistralrs isq ""
harbor mistralrs arch ""
# Use "folder" specifier to point to the model
# "hf/full/path" - mounted HF cache. Note that you need
# a full path to the folder with .gguf
# "-f Model.gguf" - the model file
harbor mistralrs model "hf/hub/models--microsoft--Phi-3-mini-4k-instruct-gguf/snapshots/999f761fe19e26cf1a339a5ec5f9f201301cbb83/ -f Phi-3-mini-4k-instruct-q4.gguf"
# When configured, launch
harbor up mistralrs
CLI
# Show service API docs (local, when running)
harbor mistralrs docs
# Check if the service is health
harbor mistralrs health
# Call original mistralrs-server CLI
harbor mistralrs --help
# Debug the installation
harbor shell mistralrs
Better linking and aliases
harbor link
is now configurable - choose link names and location.
# Assuming it's not linked yet
# See the defaults
./harbor.sh config get cli.path
./harbor.sh config get cli.name
./harbor.sh config get cli.short
# Customize
./harbor.sh config set cli.path ~/bin
./harbor.sh config set cli.name ai
./harbor.sh config set cli.short ai
# Link
./harbor.sh ln
# Include the short link
./harbor.sh ln --short
# Unlink
harbor unlink
Misc
.nvidia.
files are now implemented with cross-service file behavior- multiple new aliases, see the Wiki and
harbor -h
for reference
Full Changelog: v0.0.15...v0.0.16