Adding columns to understand Model Performance #24

Open
trivikramak opened this issue Jan 30, 2024 · 5 comments

Comments

@trivikramak

Can we add some metrics or a relative scoring mechanism or something like that to understand how good the models are?

@Vaibhavs10
Owner

@Pendrokar is doing that here: #19

@Pendrokar
Contributor

Pendrokar commented Jan 30, 2024

The "Real-time factor below threshold" column serves as that relative score for CPU. For GPU acceleration it would be too hard to tell. It is best to visit the repo of the specific TTS, or test the 🤗 Space, and see whether they provide the data.

@Pendrokar
Contributor

@trivikramak The pull request adding the capability table has been accepted. Feel free to correct any information in it with a PR. I judged the capabilities from the HF Spaces. If a processor cell is empty, the RTF was low with CUDA, but I could not determine whether it would have been as fast on the CPU alone.
https://github.com/Vaibhavs10/open-tts-tracker/blob/main/README.md#capability-specifics

@Pendrokar
Contributor

@fakerybakery I noticed you've added CPU as the processor for StyleTTS in the capabilities table, but when I cloned its Space on HF, I got an RTF of around 7.0. 🤔 Admittedly, I chose the "RTF below 2.0" condition arbitrarily, since any TTS is capable of processing the audio on a CPU. The question is whether there is any point in including those whose RTF is 5, 10, or 20 on CPU. If so, every cell would have CPU as the processor... @Vaibhavs10 thoughts?

I see an RTF of 2.0 as a good target, since a TTS with streaming support would be able to start playing back the audio once it gets past the halfway point of processing it.
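
To make the halfway-point reasoning concrete, here is a minimal sketch. It assumes RTF means processing time divided by audio duration (so lower is faster, consistent with the numbers in this thread) and that a streaming TTS produces audio at a constant rate. The function names are hypothetical and are not part of the tracker or of any TTS library.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF under the assumed convention: processing time / audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def min_start_delay(rtf: float, audio_seconds: float) -> float:
    """Earliest time (seconds after synthesis starts) at which playback can
    begin without ever running out of generated audio.

    Assumes audio is generated at a constant rate of 1/rtf seconds of audio
    per second of wall-clock time.
    """
    # Generation finishes at rtf * audio_seconds; playback needs audio_seconds.
    return max(0.0, (rtf - 1.0) * audio_seconds)


def start_fraction(rtf: float) -> float:
    """The same delay, expressed as a fraction of total processing time."""
    if rtf <= 1.0:
        return 0.0  # faster than real time: playback can start immediately
    return (rtf - 1.0) / rtf


if __name__ == "__main__":
    for rtf in (0.5, 2.0, 7.0):
        print(f"RTF {rtf}: playback can start after "
              f"{start_fraction(rtf):.0%} of processing")
```

Under these assumptions an RTF of 2.0 gives a start fraction of 0.5, which matches the halfway point mentioned above, while an RTF of 7.0 would require waiting for roughly 86% of the processing to finish.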

@fakerybakery
Contributor

@Pendrokar I think it also depends on which CPU you're using. On a MacBook I get much higher speeds; free HF Spaces have quite a basic CPU.
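
Since the result depends on the hardware, one way to compare machines is to time the same synthesis call on each. The following is a rough sketch only: `synthesize` is a hypothetical callable that takes text and returns a sequence of audio samples, standing in for whatever inference function a given TTS exposes, and the RTF convention (processing time divided by audio duration) is assumed.

```python
import time
from typing import Callable, Sequence


def measure_rtf(
    synthesize: Callable[[str], Sequence[float]],  # hypothetical TTS call
    text: str,
    sample_rate: int,
    runs: int = 3,
) -> float:
    """Time `synthesize` several times and return the middle RTF value."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    rtfs = []
    for _ in range(runs):
        start = time.perf_counter()
        samples = synthesize(text)
        elapsed = time.perf_counter() - start
        audio_seconds = len(samples) / sample_rate
        if audio_seconds <= 0:
            raise ValueError("synthesize returned no audio")
        rtfs.append(elapsed / audio_seconds)
    rtfs.sort()
    # Middle element (the upper median when `runs` is even).
    return rtfs[len(rtfs) // 2]
```

Reporting the CPU model alongside the measured value would make entries in the table easier to compare.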


4 participants