Releases: av/harbor

v0.1.29

23 Sep 12:26
@av

Misc

Full Changelog: v0.1.28...v0.1.29

v0.1.28

23 Sep 11:02
@av

STT - faster-whisper-service integration

Harbor now has a dedicated stt backend, in addition to the already present tts. When running together, Open WebUI will be configured to use it automatically in place of its "local" Whisper. The server will use the GPU automatically where the platform supports it, and fall back to the CPU otherwise.

# Start the service
harbor up stt

# Configure model/version
harbor stt model Systran/faster-distil-whisper-large-v3
harbor stt version latest
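
Once running, the service exposes a transcription endpoint compatible with the OpenAI audio API. A minimal sketch of calling it directly; the endpoint path and the harbor url usage are assumptions based on patterns shown elsewhere in these notes:

# Resolve the service URL and transcribe a local audio file
# (endpoint path follows the OpenAI audio API convention)
curl -s "$(harbor url stt)/v1/audio/transcriptions" \
  -F "file=@sample.wav" \
  -F "model=Systran/faster-distil-whisper-large-v3"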

Misc

  • OpenHands integration; the service is not very configurable at the moment, with only basic support for the Ollama URL. File an issue if you need more!
  • CLI linter

Full Changelog: v0.1.27...v0.1.28

v0.1.27 - Harbor Boost

22 Sep 15:54
@av

Screenshot: RCN Llama 3.1 8B + Web RAG in Open WebUI

Harbor can now boost small llamas to be better at creative and reasoning tasks. I'm happy to present Harbor Boost: an optimizing LLM proxy with an OpenAI-compatible API.

It allows implementing workflows like below:

  • When "random" is mentioned in the message, klmbr will rewrite 35% of the message characters to increase entropy and produce a more diverse completion
  • Launch a self-reflection reasoning chain when the message ends with a question mark
  • Expand the conversation context with the "inner monologue" of the model, where it can iterate over your question a few times before giving the final answer
  • Count the "r"s in "strawberry": this problem is solved
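
Because Boost is an OpenAI-compatible proxy, any OpenAI client can point at it directly. A minimal sketch; the boosted model ID is illustrative, and the harbor url usage follows the pattern shown in other releases:

# List models exposed by the Boost proxy
curl -s "$(harbor url boost)/v1/models"

# Route a completion through the proxy
# ("klmbr-llama3.1" is an illustrative model ID, not a guaranteed one)
curl -s "$(harbor url boost)/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "klmbr-llama3.1", "messages": [{"role": "user", "content": "Tell me a random story"}]}'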

See how Harbor can boost the creativity and randomness of a small llama beyond the infinite "Turquoise", using klmbr:

Screencast.from.22-09-24.17.41.52.webm

klmbr will process your inputs to inject some randomness into them, so even at temperature 0 the LLM output will vary (sometimes in very unexpected ways). Harbor lets you configure the various parameters of klmbr via both the CLI and .env.
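
A sketch of what that configuration could look like; only the harbor config get/set pattern is confirmed in these notes, the klmbr key names below are assumptions:

# Inspect and adjust klmbr parameters
# (key names are illustrative; check your config for the real ones)
harbor config get boost.klmbr.percentage
harbor config set boost.klmbr.percentage 35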

You can also use rcn (a brand new technique) and g1 CoT to make your llama more reasonable.

Essentially, this works by giving the LLM more time to "think" about its answer; it improves reasoning in many cases at the expense of a larger number of tokens consumed.

Misc

  • harbor size - shows the size of caches from Harbor services on your system (we don't recommend running it, it hurts)
  • harbor bench - better logs with ETA and service pointers, fixed issue with parameter propagation for reproducible results, added BBH256/32 examples
  • harbor update should now allow updating past 0.1.9 on macOS (granted you'll manage to update past it in the first place 🙃)

Full Changelog: v0.1.26...v0.1.27

v0.1.26

17 Sep 12:50
@av

v0.1.26 - Run Harbor with external Ollama

It's now possible to configure Harbor to use an external Ollama installation. The URL is relative to the container's internal network.

# URL is internal to the container network
harbor config get ollama.internal_url

# Suitable default, when running built-in Ollama
harbor url -i ollama # http://ollama:11434

# Linux
# 172.17.0.1 is the IP of your host within the container network
harbor config set ollama.internal_url http://172.17.0.1:33821

# Windows, macOS
# Docker Desktop provides this host alias out of the box
harbor config set ollama.internal_url http://host.docker.internal:33821
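
A quick sanity check that the external instance is actually reachable; a sketch assuming Ollama listens on port 33821 as in the examples above (/api/tags is Ollama's standard model-listing endpoint):

# From the host: the external Ollama should answer with its model list
curl -s http://localhost:33821/api/tags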

Full Changelog: v0.1.25...v0.1.26

v0.1.25

17 Sep 12:25
@av

v0.1.25 - KTransformers integration

KTransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Starting

# [Optional] Pre-build the image
# This is very large, as it's based on pytorch+cuda
# go grab a coffee!
harbor build ktransformers

# Start the service
harbor up ktransformers

Harbor's version was monkey-patched to be compatible with Open WebUI and will appear as ktransformers in the model selector upon a successful start.
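
You can also query the patched service directly. A minimal sketch, assuming the harbor url helper resolves the ktransformers endpoint and that it serves the usual OpenAI-style model listing:

# "ktransformers" should appear in the model list
curl -s "$(harbor url ktransformers)/v1/models"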

Screenshot: https://github.com/av/harbor/wiki/ktransformers-webui.png


Full Changelog: v0.1.24...v0.1.25

v0.1.24

16 Sep 11:51
@av

v0.1.24 - "But we have o1 at home!"

ol1 screenshot

Based on existing reference work, ol1 is a minimal Streamlit-based service with Ollama as a backend that implements o1-like reasoning chains.

Starting

# Start the service
harbor up ol1
# Open ol1 in the browser
harbor open ol1

Configuration

# Get/set desired Ollama model for ol1
harbor ol1 model
# Set the temperature
harbor ol1 args set temperature 0.5

Full Changelog: v0.1.23...v0.1.24

v0.1.23

15 Sep 14:42
@av

v0.1.23 - harbor history

Harbor remembers a number of the most recently executed CLI commands. You can search and re-run them via the harbor history command.

This is in addition to your shell's native history; it persists longer and is specific to the Harbor CLI.

asciinema recording of the history command

Use the history.size config option to adjust the number of commands stored in the history.

# Set current history size
harbor history size 50

History is stored in the .history file in the Harbor workspace; you can also access or edit it manually.

# Using a built-in helper
harbor history ls | grep ollama
# Manually, using the file
cat $(harbor home)/.history | grep ollama

You can clear the history with the harbor history clear command.

# Clear the history
harbor history clear
# Empty
harbor history

Full Changelog: v0.1.22...v0.1.23

v0.1.22

14 Sep 21:30
@av

v0.1.22 - JupyterLab integration

# [Optional] pre-build the image
harbor build jupyter

# Start the service
harbor up jupyter

# Open JupyterLab in the browser
harbor open jupyter

Your notebooks are stored in the Harbor workspace, under the jupyter directory.

# Opens the workspace folder in the file manager
harbor jupyter workspace

# See workspace location,
# relative to $(harbor home)
harbor config get jupyter.workspace

Additionally, you can configure the service to install extra packages.

# See deps help
# It's a manager for the underlying array
harbor jupyter deps -h

# Add packages to install; supports the same
# specifier syntax as pip
harbor jupyter deps add numpy
harbor jupyter deps add SomeProject@git+https://git.repo/[email protected]
harbor jupyter deps add SomePackage[PDF,EPUB]==3.1.4
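
Dependency changes are applied to the service image, so a rebuild and restart are likely needed after editing the list; a sketch assuming the build/restart flow shown for other services:

# Rebuild so the new deps are installed, then restart
harbor build jupyter
harbor restart jupyter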

Full Changelog: v0.1.21...v0.1.22

v0.1.21

14 Sep 12:02
@av

v0.1.21 - Harbor profiles

Profiles are a way to save and load a complete configuration for a specific task, for example to quickly switch between models that take a few commands to configure. Profiles include all options that can be set via harbor config (which most of the CLI helpers alias).

Usage
harbor
  profile|profiles|p [ls|rm|add] - Manage Harbor profiles
    profile ls|list             - List all profiles
    profile rm|remove <name>    - Remove a profile
    profile add|save <name>     - Add current config as a profile
    profile set|use|load <name> - Use a profile

There are a few considerations when using profiles:

  • When a profile is loaded, modifications are not saved by default and will be lost when switching to another profile (or reloading the current one). Use harbor profile save <name> to persist your changes
  • Profiles are stored in the Harbor workspace and can be shared between different Harbor instances
  • Profiles are not versioned and are not guaranteed to work across different Harbor versions
  • You can also edit profiles as .env files in the workspace; it's not necessary to use the CLI
Example
# 1. Switch to the default for a "clean" state
harbor profile use default

# 2. Configure services as needed
harbor defaults remove ollama
harbor defaults add llamacpp
harbor llamacpp model https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf
harbor llamacpp args -ngl 99 --ctx-size 8192 -np 4 -ctk q8_0 -ctv q8_0 -fa

# 3. Save profile for future use
harbor profile add cpp8b

# 4. Up - runs in the background
harbor up

# 5. Adjust args - no parallelism, no kv quantization, no flash attention
# These changes are not saved in "cpp8b"
harbor llamacpp args -ngl 99 --ctx-size 2048

# 6. Save another profile
harbor profile add cpp8b-smart

# 7. Restart with "smart" settings
harbor profile use cpp8b-smart
harbor restart llamacpp

# 8. Switch between created profiles
harbor profile use default
harbor profile use cpp8b-smart
harbor profile use cpp8b

Full Changelog: v0.1.20...v0.1.21

v0.1.20

13 Sep 14:11
@av

v0.1.20 - SGLang integration

SGLang is a fast serving framework for large language models and vision language models.

Starting

# [Optional] Pre-pull the image
harbor pull sglang

# Download with HF CLI
harbor hf download google/gemma-2-2b-it

# Set the model to run using HF specifier
harbor sglang model google/gemma-2-2b-it

# See original CLI help for available options
harbor run sglang --help

# Set the extra arguments via "harbor args"
harbor sglang args --context-length 2048 --disable-cuda-graph
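
Once configured, start the service and query it; a minimal sketch, where the /v1/models path assumes SGLang's OpenAI-compatible server:

# Start the service
harbor up sglang

# Query the OpenAI-compatible endpoint
curl -s "$(harbor url sglang)/v1/models"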

Full Changelog: v0.1.19...v0.1.20