Does Zipformer have any modules or functions designed specifically for fbank features? I’m using pretrained wav2vec 2.0 representations as input for Zipformer training, but the model’s loss is not converging well. I’m following the librispeech/zipformer recipe. When I used the same representations with librispeech/pruned_transducer_stateless7, training converged just fine. The main difference I noticed between these recipes is the Zipformer encoder. Is there something in librispeech/zipformer designed specifically for fbank features that could be causing this?