zipformer and SSL feature adaptation #1799

alanshaoTT · 2024-11-12T14:55:51Z

Does zipformer have any modules or functions designed specifically for fbank features? I’m using pretrained wav2vec2.0 representations as input for zipformer training, but the model’s loss is not converging well. I’m following the librispeech/zipformer recipe. When I used the same representations with librispeech/pruned_transducer_stateless7, training converged just fine. The main difference I noticed between these recipes is the zipformer encoder. Is there something in librispeech/zipformer that is designed specifically for fbank features and could be causing this?

marcoyang1998 · 2024-11-12T15:03:30Z

What do you mean by not converging well? Are the WERs poor?

alanshaoTT · 2024-11-12T16:36:26Z

My model's loss is quite high, fluctuating around 2.0, and it hasn’t decreased much. The WER is also high.

alanshaoTT · 2024-11-13T07:53:20Z

This is my model’s loss:
