v1.13.2: Patch release

@regisss released this 06 Sep 20:17
· 98 commits to main since this release

Llava(-next) improvements

This patch release adds multi-card support for Llava(-next) and lets users enable or disable recomputation for flash attention.

  • Llava: Added flash_attention_recompute arg to provide an option to enable/disable recompute #1278 @tthakkal
  • Add the deepspeed injection_policy of mistral #1309 @yuanwu2017
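
As a rough illustration of the on/off switch described in #1278, the sketch below wires a boolean `flash_attention_recompute` flag from the command line into a dictionary of generation options. Only the argument name comes from these notes; the parser, the `build_generation_options` helper, and the idea of forwarding the value as a keyword option are assumptions made for illustration, not the library's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI for illustration; only the flag name is taken from the release notes.
    parser = argparse.ArgumentParser(description="Sketch of a recompute on/off switch")
    parser.add_argument(
        "--flash_attention_recompute",
        action="store_true",  # absent -> False (recompute off), present -> True (recompute on)
        help="Enable recompute for flash attention (off by default in this sketch).",
    )
    return parser


def build_generation_options(args: argparse.Namespace) -> dict:
    # Hypothetical helper: collect the flag into options that a caller could forward
    # to a generation call. How the real library consumes it is not shown here.
    return {"flash_attention_recompute": args.flash_attention_recompute}


if __name__ == "__main__":
    options = build_generation_options(build_parser().parse_args())
    print(options)
```

Using `store_true` keeps recompute disabled unless the user asks for it explicitly, which is one simple way to expose an enable/disable option.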

Full Changelog: v1.13.1...v1.13.2