Replies: 2 comments
Yeah, that would be a great use case for Whisper.
+1, it would be great.
Would it be possible to support generating SSML (Speech Synthesis Markup Language) tags that indicate the prosody of every word or sentence? That is, whether the person talks fast or slow, loudly or quietly, at a high or low pitch. With SSML integration, that data would be recorded alongside the transcript.
This would be very useful when translating speech from one language to another with the goal of retaining the audio context, so that the subsequent TTS output doesn't sound like a soulless wall of text read by a machine. Is this possible with Whisper?
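
To make the request concrete, here is a minimal sketch of the kind of output I have in mind. It assumes some upstream step has already attached prosody labels to each word; Whisper does not produce these labels today, so the input format and the `words_to_ssml` helper are purely hypothetical. The `rate`, `pitch`, and `volume` attributes are the ones defined for the SSML `<prosody>` element.

```python
from xml.sax.saxutils import escape, quoteattr


def words_to_ssml(words):
    """Wrap each word in an SSML <prosody> element.

    `words` is a hypothetical list of dicts such as
    {"text": "Hello", "rate": "fast", "pitch": "high", "volume": "loud"}.
    The prosody keys are optional; these labels are an assumption and are
    not something Whisper outputs.
    """
    parts = []
    for w in words:
        # Build only the attributes that are present for this word.
        attrs = " ".join(
            f"{name}={quoteattr(w[name])}"
            for name in ("rate", "pitch", "volume")
            if name in w
        )
        text = escape(w["text"])  # escape &, <, > so the markup stays valid
        parts.append(f"<prosody {attrs}>{text}</prosody>" if attrs else text)
    return "<speak>" + " ".join(parts) + "</speak>"


if __name__ == "__main__":
    sample = [
        {"text": "Hello", "rate": "fast", "pitch": "high"},
        {"text": "world"},
    ]
    print(words_to_ssml(sample))
```

A TTS engine that accepts SSML could then take the translated text with the same tags and reproduce roughly the same pacing and emphasis.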