Skip to content

v0.0.16

Compare
Choose a tag to compare
@av av released this 04 Aug 15:39
· 291 commits to main since this release

mistral.rs integration

Handle: mistralrs
URL: http://localhost:33951

Mistral.rs is a fast LLM inference platform supporting inference on a variety of devices, quantization, and easy-to-use application with an Open-AI API compatible HTTP server and Python bindings.

Defaults

# Spin up harbor with mistral.rs service
harbor up mistralrs

# Launch the default UI, pre-configured with mistral.rs
harbor open

Plain Models

# For "plain" models:
# Download the model to the global HF cache
harbor hf download IlyaGusev/gemma-2-2b-it-abliterated

# Set model/type/arch
harbor mistralrs model IlyaGusev/gemma-2-2b-it-abliterated
harbor mistralrs type plain
harbor mistralrs arch gemma2

# Gemma 2 doesnt't support paged attention
harbor mistralrs args --no-paged-attn

# Run with the default settings
harbor up mistralrs

GGUF Models

# Set the model type to GGUF
harbor mistralrs type gguf

# - Unset ISQ off, as it's not supported
# for GGUF models
# - For GGUFs, architecture is inferred from the file
harbor mistralrs isq ""
harbor mistralrs arch ""

# Use "folder" specifier to point to the model
# "hf/full/path"  - mounted HF cache. Note that you need
#                   a full path to the folder with .gguf
# "-f Model.gguf" - the model file
harbor mistralrs model "hf/hub/models--microsoft--Phi-3-mini-4k-instruct-gguf/snapshots/999f761fe19e26cf1a339a5ec5f9f201301cbb83/ -f Phi-3-mini-4k-instruct-q4.gguf"

# When configured, launch
harbor up mistralrs

CLI

# Show service API docs (local, when running)
harbor mistralrs docs
# Check if the service is health
harbor mistralrs health
# Call original mistralrs-server CLI
harbor mistralrs --help
# Debug the installation
harbor shell mistralrs

Better linking and aliases

harbor link is now configurable - choose link names and location.

# Assuming it's not linked yet

# See the defaults
./harbor.sh config get cli.path
./harbor.sh config get cli.name
./harbor.sh config get cli.short

# Customize
./harbor.sh config set cli.path ~/bin
./harbor.sh config set cli.name ai
./harbor.sh config set cli.short ai

# Link
./harbor.sh ln
# Include the short link
./harbor.sh ln --short
# Unlink
harbor unlink

Misc

  • .nvidia. files are now implemented with cross-service file behavior
  • multiple new aliases, see the Wiki and harbor -h for reference

Full Changelog: v0.0.15...v0.0.16