v0.4.5

@github-actions github-actions released this 19 Dec 15:42
· 665 commits to main since this release

What's Changed

Important

Version 0.4.4 was skipped.

Quite a few changes this time around, most notably:

  • Implement DeciLM by @AlpinDale in #158
  • Support prompt logprobs by @AlpinDale in #162
  • Support safetensors for Mixtral, along with expert parallelism for better multi-GPU support, by @AlpinDale in #167
  • Implement CUDA graphs to improve multi-GPU setups and optimize smaller models by @AlpinDale in #172
  • Fix peak memory profiling to allow higher gmu (GPU memory utilization) values by @AlpinDale in #166
  • Restore compatibility with Python 3.8 and 3.9 by @g4rg in #170
  • Lazily import model classes to avoid import overhead by @AlpinDale in #165
  • Add RoPE scaling support for Mixtral models by @g4rg in #174
  • Make OpenAI API keys optional by @AlpinDale in #176
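
For readers unfamiliar with the lazy-import change listed above, here is a minimal sketch of the general technique: a registry maps an architecture name to a module path, and the module is imported only when that architecture is first requested. This is not the code from #165. The registry, the `load_model_cls` helper, and the entry names are all hypothetical, and standard-library classes stand in for model classes so the sketch runs without any dependencies.

```python
import importlib
from typing import Dict, Tuple, Type

# Hypothetical registry: architecture name -> (module path, class name).
# Standard-library classes stand in for real model classes here.
_REGISTRY: Dict[str, Tuple[str, str]] = {
    "OrderedDictModel": ("collections", "OrderedDict"),
    "FractionModel": ("fractions", "Fraction"),
}


def load_model_cls(arch: str) -> Type:
    """Import the module backing `arch` only when it is requested."""
    if arch not in _REGISTRY:
        raise ValueError(f"Unsupported architecture: {arch}")
    module_name, cls_name = _REGISTRY[arch]
    # The import happens here, not at program start-up.
    module = importlib.import_module(module_name)
    return getattr(module, cls_name)


if __name__ == "__main__":
    cls = load_model_cls("FractionModel")
    print(cls.__name__)
```

With this layout, starting the program costs only the registry definition; each model module is paid for on first use, and Python's module cache makes repeated lookups cheap.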

Full Changelog: v0.4.4...v0.4.5