v0.9.3

OlivierDehaene released this 18 Jul 16:53
5e6ddfd

Highlights

  • server: add support for Flash Attention v2
  • server: add support for Llama v2

Features

  • launcher: add debug logs
  • server: rework quantization to support all models

Full Changelog: v0.9.2...v0.9.3