zipformer and SSL feature adaptation #1799

Open
alanshaoTT opened this issue Nov 12, 2024 · 3 comments

Comments

@alanshaoTT

Does zipformer have any modules or functions designed specifically for fbank features? I'm using pretrained wav2vec 2.0 representations as input for zipformer training, following the librispeech/zipformer recipe, but the loss is not converging well. When I used the same representations with librispeech/pruned_transducer_stateless7, training converged just fine. The main difference I can see between the two recipes is the zipformer encoder. Is there something in librispeech/zipformer that is designed specifically for fbank features and could be causing this?
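To make the input mismatch concrete, here is a minimal sketch comparing the two front ends. It is not taken from the icefall recipes. It assumes 80-dimensional fbank at a 10 ms frame shift, wav2vec 2.0 base features (768-dimensional, roughly 20 ms per frame), and a hypothetical overall encoder time reduction of 4. `FrontEnd`, `encoder_frames`, and `SUBSAMPLING` are illustrative names, and the real subsampling arithmetic in the recipe may differ.

```python
from dataclasses import dataclass


@dataclass
class FrontEnd:
    name: str
    feat_dim: int          # feature dimension per frame
    frame_shift_ms: float  # time between consecutive frames


def encoder_frames(duration_s: float, fe: FrontEnd, subsampling: int) -> int:
    """Frames left after an overall time reduction (plain floor division)."""
    input_frames = int(duration_s * 1000 / fe.frame_shift_ms)
    return input_frames // subsampling


# Assumed front-end settings, not read from any recipe config.
FBANK = FrontEnd("fbank", feat_dim=80, frame_shift_ms=10.0)
WAV2VEC2 = FrontEnd("wav2vec2-base", feat_dim=768, frame_shift_ms=20.0)

# Hypothetical overall encoder subsampling factor; check the recipe's value.
SUBSAMPLING = 4

for fe in (FBANK, WAV2VEC2):
    frames = encoder_frames(10.0, fe, SUBSAMPLING)
    print(f"{fe.name}: dim={fe.feat_dim}, encoder frames for 10 s = {frames}")
```

Under these assumptions the SSL features reach the encoder with a different dimension and half as many frames as fbank, so the input feature dimension and the effective output frame rate are two settings worth checking before suspecting the encoder itself.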

@marcoyang1998
Collaborator

What do you mean by not converging well? Are the WERs poor?

@alanshaoTT
Author

My model's loss is quite high, fluctuating around 2.0, and it hasn’t decreased much. The WER is also high.

@alanshaoTT
Author

[image]
This is my model's loss.
