Question: Decoding with Zamia Speech's German wav2letter model using wav2letter Decoder executable #104
Comments
Check out the script I used to run the decoder; it was based on this template. Please be aware that it is quite likely w2l has moved on from the state it was in back when I trained that model and used it, so the command line and/or file formats may have changed since then.
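For orientation, template-style decoder runs of that era typically passed a flags file to the Decoder executable. A minimal sketch of such a flags file follows; all paths are placeholders, the numeric values are illustrative (not the settings behind the reported result), and flag names may have been renamed in later wav2letter versions:

```
# decode.cfg -- hypothetical flags file for the wav2letter Decoder executable
--am=/path/to/acoustic_model.bin
--lm=/path/to/german_order6.lm.bin
--lexicon=/path/to/lexicon.txt
--test=/path/to/test.lst
--lmweight=2.5
--wordscore=1.0
--beamsize=2500
--beamthreshold=25
--nthread_decoder=4
```

The `lmweight`, `wordscore`, and beam settings are exactly the parameters whose optimal values can shift when the codebase changes, which would be consistent with the WER drift discussed below.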
Thank you for your fast answer! Do you remember which commit of wav2letter you used when you trained and tested the model? One more thing: do you remember which language model you used? I am assuming you used the larger model of order 6 with less pruning. I am trying to test your model on my audio files with the exact configuration you used to achieve the result reported here.
@realbaker1967 I used the same decoder configuration as in the template file, as well as the order-6 LM. Unfortunately, I cannot reproduce the reported WER of 3.97%. It's probably due to the update of w2l, I guess...
In my case, the model decoded well except at the beginnings and endings of the audio files. For example, with the annotation "Sie pflegten die Kranken und verbanden die Verwundeten.", the word "Sie" is omitted and "en" is added at the end. I observe these two problems very frequently, especially the addition of non-existent words at the ends. Did you observe similar problems @lagidigu?
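Omissions and insertions like these feed straight into the WER being compared here. A minimal, self-contained sketch of word-level WER via Levenshtein distance (the hypothesis string below is illustrative, reconstructed from the omission/insertion described above, not actual decoder output):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + insertions + deletions)
    divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all remaining reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # match / substitution
    return dp[len(ref)][len(hyp)] / len(ref)

ref = "Sie pflegten die Kranken und verbanden die Verwundeten"
hyp = "pflegten die Kranken und verbanden die Verwundeten en"
# one deletion ("Sie") + one insertion ("en") over 8 reference words -> 0.25
print(wer(ref, hyp))
```

A sentence-initial deletion and a sentence-final insertion already cost 25% WER on a short utterance, which is why errors at utterance boundaries dominate benchmarks on short audio files.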
@realbaker1967 I get the same results after applying the template. This is strange. I will have to look into how the beam search decoder works exactly and will report back on whether I make any progress.
@realbaker1967 unfortunately I couldn't troubleshoot the issue. @gooofy do you know what might have changed with the decoder? The WER is a lot higher than 3.97%, unfortunately :/
@lagidigu no idea what exactly has changed, but as I mentioned earlier I am not surprised wav2letter has moved on from the state it was in when I made my experiments. Actually, I think it is good news that wav2letter continues to be developed and improved. If you're serious about wav2letter, I would suggest you train your own model from scratch using their current codebase - all training material from zamia speech is freely available, as are the scripts used to train the model, so that should give you a head start.
@gooofy I am only interested in running a benchmark, so there is no need to train from scratch. For that, it would be really helpful if you could tell us which commit of wav2letter you used, if possible. In that case, I could safely run the decoder with your given template. Thanks
First of all, really nice work!
I am interested in your German acoustic model for a benchmark.
I assume, based on here, that you suggest using wav2letter's Decoder executable to decode audio with your German acoustic model.
If we use that executable from wav2letter, we would need to tune a certain set of parameters, which they mention in their Decoder executable documentation.
If possible, could you please share with us your tuned parameters for decoding?
Or, do we need to use the parameters in w2l_run_decode.sh.template?
Regards
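If the original tuned values cannot be recovered, the usual fallback is a small grid sweep over the decoder's LM-related parameters on a held-out set. A hedged sketch of generating such a sweep (the parameter names `lmweight`/`wordscore` match wav2letter decoder flags, but the value ranges are assumptions, not the settings behind the reported result):

```python
import itertools

# Hypothetical sweep grid; pick the combination with the lowest dev-set WER.
lmweights = [1.5, 2.0, 2.5, 3.0]
wordscores = [0.0, 0.5, 1.0]

def flag_lines(lmweight, wordscore):
    """Render the flags-file lines for one sweep point."""
    return [f"--lmweight={lmweight}", f"--wordscore={wordscore}"]

combos = list(itertools.product(lmweights, wordscores))
for lmw, ws in combos:
    # each sweep point would be written into a flags file and decoded;
    # the decode-and-score step is omitted here
    print(flag_lines(lmw, ws))
```

Each sweep point is decoded once and scored against the reference transcripts; with 4 × 3 = 12 combinations this stays cheap, and the best point can then be refined with a finer grid.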