Alignment problems with German text? #38

Closed
imdatceleste opened this issue Feb 5, 2018 · 8 comments

Comments

@imdatceleste

Hi @r9y9, I'm training on German audio. I have added the German characters (Ä, Ö, Ü, ß, ä, ö, ü) to the symbol set and am using basic_cleaners.
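For readers following along, extending a character-level symbol set might look roughly like the sketch below. It is only an illustration: the names `symbols`, `symbol_to_id` and `text_to_sequence`, and the pad/end-of-sequence markers, are assumptions for this example and are not taken from the repository.

```python
import string

# Hypothetical special symbols; the real project may use different ones.
_pad = "_"
_eos = "~"

# Base ASCII letters plus the German characters mentioned above.
_letters = string.ascii_letters
_german = "ÄÖÜßäöü"
_punctuation = "!'(),-.:;? "

symbols = [_pad, _eos] + list(_letters + _german + _punctuation)
symbol_to_id = {s: i for i, s in enumerate(symbols)}


def text_to_sequence(text):
    """Map text to symbol ids, silently dropping unknown characters."""
    ids = [symbol_to_id[c] for c in text if c in symbol_to_id]
    ids.append(symbol_to_id[_eos])  # terminate with the end-of-sequence id
    return ids


# Every character of "Grüße" is covered, so nothing is dropped.
sequence = text_to_sequence("Grüße")
```

Characters missing from the symbol set are dropped here, so a forgotten umlaut would shorten the input silently; checking a few German sentences against the table is a cheap sanity test.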

The problem is the alignment on test-audio. Look at some of the samples. And, of course, the audio is horrible too. I have tested with up to 500k steps. Always the same results. When I generate audio with synthesis, I have similar results. Any hints where I'd need to add more info?

[Image: step000180000_text4_single_alignment]

Thanks for any recommendations... (I converted the German training data to LJSpeech format...)

@r9y9
Owner

r9y9 commented Feb 6, 2018

Does your training data contain beginning / ending silences? It's better to trim silences before training.
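As a rough illustration of what trimming means here, the sketch below strips leading and trailing samples whose amplitude stays under a threshold. The function name and the threshold value are assumptions made for this example; in practice a library routine such as `librosa.effects.trim` would be the more usual choice.

```python
def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples whose magnitude is below threshold.

    `samples` is a list of floats in [-1, 1]; silence inside the
    utterance is kept, only the two ends are trimmed.
    """
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []  # the whole clip is below the threshold
    return samples[loud[0]:loud[-1] + 1]


clip = [0.0, 0.001, 0.5, -0.3, 0.0, 0.2, 0.0]
trimmed = trim_silence(clip)
```

The internal pause (the `0.0` between `-0.3` and `0.2`) survives, which is usually what you want for speech data.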

I sometimes got bad results with long audio (for example, see #24). What's the output of the following command?

python compute_timestamp_ratio.py --hparams="your hyper params" ${your_data_path}

print(input_timestamps, output_timestamps, output_timestamps / input_timestamps)

If output_timestamps / input_timestamps is larger than 2, I'd try increasing outputs_per_step or decreasing downsample_step to balance the input/output lengths, and change the model architecture accordingly.
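The rule of thumb above can be written down as a small check. This is only a sketch: the helper names are hypothetical, the two counts are assumed to come from the script's printed output, and the threshold of 2 is the one mentioned in this comment.

```python
def timestamp_ratio(input_timestamps, output_timestamps):
    """Return output_timestamps / input_timestamps."""
    if input_timestamps <= 0:
        raise ValueError("input_timestamps must be positive")
    return output_timestamps / input_timestamps


def needs_rebalancing(ratio, threshold=2.0):
    """True when the output side is more than `threshold` times longer."""
    return ratio > threshold


# Example counts chosen only for illustration.
ratio = timestamp_ratio(100, 143)
flag = needs_rebalancing(ratio)
```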

@imdatceleste
Author

imdatceleste commented Feb 6, 2018

The output_timestamps/input_timestamps is 1.43. I'll check whether I have pauses at the beginning and end and let you know.
EDIT: yes, there was silence at the beginning and at the end. I removed it and will try training again and let you know. Thank you very much.

@imdatceleste
Author

@r9y9, it seems this is a general problem, see #27 -- When I use a test.txt with more than one line, the first entry is generated correctly, the others are just inaudible. Everything is fine until

mel_outputs, linear_outputs, alignments, done = model(...

in synthesis.py:tts(.... But then the call to model(... returns super-fast and the result is inaudible or actually just no audio at all...

When I start synthesis.py for each entry directly, all of them are generated ok...

I'm investigating what could be going on...

@r9y9
Owner

r9y9 commented Feb 6, 2018

Oh, thank you for the report. I can reproduce.... I found a really stupid bug. Fix with tests coming shortly.

r9y9 added a commit that referenced this issue Feb 6, 2018
forgot to clear buffer property

ref #38
@r9y9
Owner

r9y9 commented Feb 6, 2018

@imdatsolak I think I fixed the bug. Could you confirm that it works? The fix only touches incremental inference, so you don't need to re-train your model.
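For context, the symptom reported earlier (first entry fine, later entries broken) is what you would expect when a layer keeps state between incremental-inference calls and that state is not reset. The toy sketch below illustrates the general pattern only; the class and function names are hypothetical and this is not the repository's actual code.

```python
class IncrementalSmoother:
    """Toy stand-in for a layer that caches past inputs while decoding step by step."""

    def __init__(self, kernel_size=3):
        self.kernel_size = kernel_size
        self.buffer = []

    def clear_buffer(self):
        # Forget everything cached from a previous utterance.
        self.buffer = []

    def step(self, x):
        # Keep the last `kernel_size` inputs and return their mean.
        self.buffer.append(x)
        self.buffer = self.buffer[-self.kernel_size:]
        return sum(self.buffer) / len(self.buffer)


def synthesize(layer, inputs, clear=True):
    """Run one utterance; clear=False reproduces the stale-state problem."""
    if clear:
        layer.clear_buffer()
    return [layer.step(x) for x in inputs]


layer = IncrementalSmoother()
first = synthesize(layer, [1.0, 1.0, 1.0])
second = synthesize(layer, [4.0, 4.0, 4.0])             # buffer cleared first
stale = synthesize(layer, [7.0, 7.0, 7.0], clear=False)  # leftovers leak in
```

With the buffer cleared, `second` depends only on its own inputs; without it, the first outputs of `stale` are contaminated by the previous utterance.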

@imdatceleste
Author

I'll check and let you know within the next 10 minutes :-)

@imdatceleste
Author

imdatceleste commented Feb 6, 2018

It seems to work, the alignments look a lot better. Unfortunately, I re-started training :-( and am only at step 30k, so I'll need to continue training overnight and the final result should be available tomorrow. Thank you very much. I'll let you know once I have more results...
EDIT:
It works now. When it reaches the save_checkpoint, it generates (correctly) the following alignment-files (before the fix, the first looked good, the others looked like the one at top of this issue):
[Image: step000040000_text0_single_alignment]

[Image: step000040000_text1_single_alignment]

[Image: step000040000_text2_single_alignment]

[Image: step000040000_text3_single_alignment]

[Image: step000040000_text4_single_alignment]

Thanks again!! 👍

@imdatceleste
Author

@r9y9, your fix works. Thanks again. Closing this issue.
