v1.11: SDXL fine-tuning, Whisper, Phi, ControlNet
SynapseAI v1.15
The codebase is fully validated for the latest Habana SDK release, SynapseAI v1.15.0.
SDXL fine-tuning
Whisper
- Support speech recognition with Whisper models and seq2seq #704 @emascarenhas
Phi
- Enable Phi series models #732 @lkk12014402
ControlNet
Transformers v4.38
The codebase is fully validated for Transformers v4.38.
Model optimizations
- Add optimization for blip text model generation #653 @sywangyi
- Enable internal KV cache bucketing in Llama #720 @xt574chen
- Enable Mixtral-8x7B #739 @jychen-habana
- Update Mixtral-8x7B FP8 HQT example #756 @jychen-habana
- Further fixes for performance with internal bucketing #781 @puneeshkhanna
- SpeechT5 optimization #722 @sywangyi
- Move img_mask in get_attn_mask() to HPU #795 @hsubramony
- Mistral optimizations #804 @ssarkar2
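The bucketing changes above reduce graph recompilations on Gaudi by padding sequences to a small set of fixed lengths, so only a handful of graph shapes ever exist. A minimal, framework-free sketch of the idea (the bucket sizes and helper names below are illustrative, not the actual implementation):

```python
# Illustrative sketch of KV-cache bucketing: sequence lengths are rounded up
# to the nearest bucket so a few graph shapes are reused, instead of one
# compiled graph per distinct sequence length.

BUCKETS = [128, 256, 512, 1024, 2048]  # hypothetical bucket sizes

def bucket_size(seq_len: int) -> int:
    """Return the smallest bucket that fits seq_len."""
    for b in BUCKETS:
        if seq_len <= b:
            return b
    raise ValueError(f"sequence length {seq_len} exceeds largest bucket")

def pad_to_bucket(tokens: list[int], pad_id: int = 0) -> list[int]:
    """Right-pad a token sequence to its bucket size."""
    target = bucket_size(len(tokens))
    return tokens + [pad_id] * (target - len(tokens))

# Lengths 100 and 120 land in the same bucket (128), so they share one
# graph shape; length 300 lands in the 512 bucket.
print(bucket_size(100))   # 128
print(bucket_size(300))   # 512
print(len(pad_to_bucket(list(range(100)))))  # 128
```

The trade-off is some wasted compute on padding tokens in exchange for far fewer recompilations as generated sequences grow.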
Image-to-text and VQA examples
torch.compile
- Enable torch_compile mode for distributed #659 @kalyanjk
- Fix graph breaks in torch.compile mode #806 @hlahkar
- Fix torch.compile for text generation #811 @regisss
- Add Llama7b FSDP test for torch.compile mode #818 @pankd
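The PRs above enable and fix torch.compile mode for distributed runs and text generation. For reference, a minimal CPU-only sketch of the torch.compile API itself (plain PyTorch, not Gaudi-specific; the `eager` backend is used here only to keep the example dependency-free, whereas on Gaudi the HPU backend would be selected):

```python
import torch

# A tiny module to demonstrate torch.compile; graph breaks (addressed by the
# fixes above) occur when untraceable Python control flow interrupts capture.
class TinyMLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.fc(x))

model = TinyMLP()
# backend="eager" skips backend codegen, so this runs anywhere torch>=2.0
# is installed; compiled and eager outputs must match.
compiled = torch.compile(model, backend="eager")

x = torch.randn(3, 4)
assert torch.allclose(compiled(x), model(x))
```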
Bug fixes
- Fix beam search crash and incorrect output in decoder-only and encoder-decoder models #627 @sywangyi
- Fix translation models #710 @vidyasiv
- Fix throughput calculation for diffusion models #715 @skavulya
- Fix crash in Llama model in LLaVA image-to-text generation #755 @sywangyi
- Fix backward error in DDP when running reward model fine-tuning in RLHF #507 @sywangyi
- Fix get_dtype and convert_into_dtypes #769 @regisss
- Override SDPA option in Gaudi #771 @jiminha
- Fix Llama-70B-FSDP model loading issue #752 @hlahkar
- Fix FSDP in Transformers 4.38 #812 @libinta
- Delay importing DeepSpeed comm for perf #810 @jiminha
- Fix Llama rotary position embedding issue for Transformers 4.38 #813 @libinta
- Fix torch.full issue when running DeepSpeed ZeRO-3 for Llama #820 @libinta
- Fix profiling issue with the first step #837 @libinta
- Fix Mistral after SynapseAI 1.15 update #858 @ssarkar2
Others
- Small test_text_generation_example.py refactoring #725 @regisss
- Update README, add PPO support #721 @sywangyi
- Update the Mistral model naming #726 @yafshar
- Changing backend name #708 @vivekgoe
- Update ppo_trainer.py #718 @skaulintel
- Add seed in SFT example to make SFT results reproducible #735 @sywangyi
- Add a flag to control whether to save checkpoints in run_lora_clm.py #736 @yeonsily
- Refactor and update CI for encoder-decoders #742 @regisss
- Expose Llama Fused OPs control from run_lora_clm.py #751 @hlahkar
- Fix tests by making static_shapes False #778 @bhargaveede
- Fix ControlNet README #785 @regisss
- Workaround for RoPE computed in bf16 for GPT-NeoX #746 @regisss
- Add Whisper and SpeechT5 to model table #790 @regisss
- Update summarization example README #791 @srajabos
- Block TorchScript pytest because of a segfault issue #793 @yeonsily
- Fix test_encoder_decoder.py for opus-mt-zh-en #798 @regisss
- Replace obsolete API for MediaPipe #796 @MohitIntel
- Add --distribution_strategy fast_ddp in contrastive-image-text README and BridgeTower test #799 @regisss
- Fix redundant internal bucketing and HPU graph settings #797 @puneeshkhanna
- Add Llama test for fsdp #761 @hlahkar
- Enable dynamic shapes for ESMFold #803 @hsubramony
- Add Llama/Llama2 support in Question-Answering #745 @kplau1128
- Update MLM example #830 @regisss
- Revert Wav2Vec2 TDNNLayer forward function to match Transformers v4.37.2 #827 @yeonsily
- Save CI test output image #835 @MohitIntel
- Update ckpt loading #773 @schoi-habana
- Skip SDXL test in CI #840 @regisss
- Fix FSDP test on Gaudi1 #841 @regisss
- Remove installation from source for Diffusers in CI #846 @regisss
- Fix FP8 CI #852 @regisss
- Fix PR #848 #853 @regisss
- Disable safe loading tests in CI #854 @regisss
- Add warmup for eval #855 @libinta
Known issue
- A crash may occur with unify_measurements.py