Get Audio file via Command #156

Open
miguelgh65 opened this issue Nov 17, 2022 · 2 comments


@miguelgh65

I use the command below to generate an audio file from the CLI, but the resulting file contains no speech at all. The same synthesis works perfectly via the GUI.
I'm using an alphabet that is more extended than the English one.
python synthesis/synthesize.py -m data/models/..... -vm "data/hifigan/vocoder/model.pt" -hc "data/hifigan/vocoder/config.json" -t "test this is a test" -a audio.wav
Any ideas?
Thanks in advance.

@scheissegalo

I have the same issue and am looking for an answer. I think the problem is that synthesize.py has no option to specify the language model. The default is English, and there are no docs explaining how to load the language model you made with the GUI.

synthesize.py:
from training import DEFAULT_ALPHABET ......... symbols=DEFAULT_ALPHABET

@scheissegalo

Fixed by adding the following code to the import section of synthesize.py:

import os  # used by get_symbols below; skip if synthesize.py already imports it
from dataset.transcribe import Silero, DeepSpeech, SILERO_LANGUAGES
from main import app, paths
from training.utils import (
    get_available_memory,
    get_gpu_memory,
    get_batch_size,
    load_symbols,
    generate_timelapse_gif,
    create_trainlist_vallist_files,
)

TRANSCRIPTION_MODEL = "model.pbmm"
ALPHABET_FOLDER = "alphabets"
ALPHABET_FILE = "alphabet.txt"

def get_symbols(language):
    if language in SILERO_LANGUAGES:
        return load_symbols(os.path.join(ALPHABET_FOLDER, f"{language}.txt"))
    else:
        return load_symbols(os.path.join(paths["languages"], language, ALPHABET_FILE))
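To check which alphabet file get_symbols would end up loading for a given language name, the same branch can be reproduced on its own. This is only a sketch: SILERO_LANGUAGES_STANDIN and LANGUAGES_DIR are placeholder values made up for illustration, not the app's real data. In the app they come from dataset.transcribe and from paths["languages"] in main.

```python
import os

ALPHABET_FOLDER = "alphabets"
ALPHABET_FILE = "alphabet.txt"

# Placeholder values for illustration only (assumptions, not the app's real data).
SILERO_LANGUAGES_STANDIN = {"English"}
LANGUAGES_DIR = os.path.join("data", "languages")


def alphabet_path(language):
    """Return the alphabet file the get_symbols branch would hand to load_symbols."""
    if language in SILERO_LANGUAGES_STANDIN:
        # Built-in language: alphabet stored as alphabets/<language>.txt
        return os.path.join(ALPHABET_FOLDER, f"{language}.txt")
    # Custom language: alphabet stored in that language's own folder
    return os.path.join(LANGUAGES_DIR, language, ALPHABET_FILE)


if __name__ == "__main__":
    print(alphabet_path("English"))
    print(alphabet_path("MyLanguage"))
```

If the printed path for your language does not exist on disk, load_symbols will not find the alphabet, so it is worth checking that the name matches the one shown in the GUI.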

Then you can set the language you created in the GUI as the default for symbols in "def synthesize":

def synthesize(
    model,
    text,
    # symbols=DEFAULT_ALPHABET,  # original default (English alphabet)
    symbols=get_symbols("<name of your language>"),  # alphabet of the language created in the GUI
    graph_path=None,
    audio_path=None,
    vocoder=None,
    silence_padding=0.15,
    sample_rate=22050,
    max_decoder_steps=1000,
    split_text=False,
):
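Hardcoding the language in the default argument works, but it means editing the file for every language. A possible alternative is to choose the language on the command line. The sketch below is an assumption, not something synthesize.py provides: the -l/--language flag and the pick_symbols helper are hypothetical names, and get_symbols is passed in as a parameter so the example runs on its own.

```python
import argparse


def build_parser():
    """Build a minimal parser with a hypothetical language option."""
    parser = argparse.ArgumentParser(description="Sketch: pick the alphabet language from the CLI")
    parser.add_argument("-t", dest="text", required=True, help="Text to synthesize")
    # Hypothetical flag: synthesize.py does not define this option.
    parser.add_argument("-l", "--language", default=None, help="Name of the language created in the GUI")
    return parser


def pick_symbols(language, default_symbols, get_symbols):
    """Return the default alphabet unless a language name was given."""
    if language is None:
        return default_symbols
    return get_symbols(language)


if __name__ == "__main__":
    args = build_parser().parse_args()
    # In synthesize.py this would be DEFAULT_ALPHABET and the get_symbols helper above.
    symbols = pick_symbols(args.language, "<default alphabet>", lambda name: f"<alphabet for {name}>")
    print(symbols)
```

The result of pick_symbols would then be passed as the symbols argument when calling synthesize, leaving DEFAULT_ALPHABET as the default in the signature.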
