Intel® auto-round v0.3 Release
Highlights:
- Broader Device Support:
- Expanded support for CPU, HPU, and CUDA inference in the AutoRound format, resolving the 2-bit accuracy issue.
- New Recipes and Model Releases:
- Published numerous recipes on the Low Bit Open LLM Leaderboard, showcasing impressive results on Llama 3.1 and other leading models.
- Experimental Features:
- Introduced several experimental features, including activation quantization and mx_fp, both showing promising results with AutoRound.
- Multimodal Model Support:
- Extended capabilities for tuning and inference across several multimodal models.
Lowlights:
- Implemented support for low_cpu_mem_usage, the auto_awq format, calibration dataset concatenation, and calibration datasets with chat templates.
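For readers unfamiliar with calibration dataset concatenation, the idea can be sketched as packing short tokenized samples into fixed-length sequences so that no calibration tokens are wasted on padding. The sketch below is a minimal illustration of that general technique, not auto-round's actual implementation: the function name `concat_calibration_samples` and the parameters `seqlen` and `eos_id` are hypothetical and chosen only for this example.

```python
from typing import Iterable, List


def concat_calibration_samples(
    samples: Iterable[List[int]], seqlen: int, eos_id: int
) -> List[List[int]]:
    """Pack tokenized samples into blocks of exactly `seqlen` tokens.

    Hypothetical helper for illustration only; it does not reproduce
    auto-round's internal logic.
    """
    if seqlen <= 0:
        raise ValueError("seqlen must be positive")

    buffer: List[int] = []        # tokens waiting to fill the next block
    blocks: List[List[int]] = []  # completed fixed-length blocks

    for token_ids in samples:
        buffer.extend(token_ids)
        buffer.append(eos_id)  # mark the boundary between samples
        # Emit as many full blocks as the buffer currently holds.
        while len(buffer) >= seqlen:
            blocks.append(buffer[:seqlen])
            buffer = buffer[seqlen:]

    # Any trailing partial block is dropped so every block has equal length.
    return blocks


if __name__ == "__main__":
    toy_samples = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
    print(concat_calibration_samples(toy_samples, seqlen=4, eos_id=0))
```

Dropping the trailing partial block keeps every calibration sequence the same length; padding it instead would be an equally valid choice when the calibration set is small.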