Bug: Transcribing media ends with exclamation marks #365

Open
csPinKie opened this issue Nov 10, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@csPinKie

csPinKie commented Nov 10, 2024

What happened?

Transcribing a 1-hour multi-speaker file produces the following output:
00:00 --> 01:20
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:20 --> 01:28
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:28 --> 01:39
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:40 --> 01:41
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:43 --> 01:44
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:44 --> 01:54
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:54 --> 01:57
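
To check how much of a transcript is affected, the segments whose text consists only of exclamation marks can be counted with a small script. This is a minimal sketch, assuming the plain-text layout shown above (a timestamp line, a `Speaker N:` line, then the text); the function name `find_degenerate_segments` is hypothetical and not part of vibe.

```python
import re

# Timestamp line as it appears in the transcript above, e.g. "00:00 --> 01:20".
# An optional hours/seconds field is allowed as an assumption for longer files.
TIMESTAMP = re.compile(r"^\d{2}:\d{2}(?::\d{2})? --> \d{2}:\d{2}(?::\d{2})?$")
SPEAKER = re.compile(r"^Speaker \d+:$")


def find_degenerate_segments(transcript: str) -> list[str]:
    """Return the timestamp lines of segments whose text is only '!' characters."""
    bad = []
    current = None    # timestamp line of the segment being read
    text_parts = []   # text lines collected for that segment

    def flush():
        # Flag the finished segment if it has text and that text is all '!'.
        text = "".join(text_parts).strip()
        if current is not None and text and set(text) == {"!"}:
            bad.append(current)

    for line in transcript.splitlines():
        line = line.strip()
        if TIMESTAMP.match(line):
            flush()
            current, text_parts = line, []
        elif line and not SPEAKER.match(line):
            text_parts.append(line)
    flush()
    return bad


if __name__ == "__main__":
    sample = "00:00 --> 01:20\nSpeaker 1:\n!!!!!!!!\n01:20 --> 01:28\nSpeaker 1:\nHallo!\n"
    for stamp in find_degenerate_segments(sample):
        print("degenerate segment:", stamp)
```

If every segment is reported, the whole run is affected rather than individual passages, which matches the output above.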

Steps to reproduce

  1. Load a file longer than 1h into the app
  2. Set the number of speakers to 8 and the language to German
  3. Start transcription
    I use an AMD 7700XT; maybe that's the reason

What OS are you seeing the problem on?

Windows

Relevant log output

App Version: vibe 2.6.3
Commit Hash: d24ffccb0d05ea822ff1a3a6edb3b9871be9f368
Arch: x86_64
Platform: windows
Kernel Version: 10.0.19045
OS: windows
OS Version: 10.0.19045
Cuda Version: n/a
Models: ggml-medium.bin
Default Model: "C:\\Users\\Me\\AppData\\Local\\github.com.thewh1teagle.vibe\\ggml-medium.bin"
Cargo features: vulkan


{
    "avx": {
        "enabled": true,
        "support": true
    },
    "avx2": {
        "enabled": true,
        "support": true
    },
    "f16c": {
        "enabled": true,
        "support": true
    },
    "fma": {
        "enabled": true,
        "support": true
    }
}
@thewh1teagle
Owner

Please share an example YouTube video where this happens, or upload the audio, and tell me which language to choose so I can reproduce it.

@csPinKie
Author

Hi, the language doesn't really matter: whether I choose "auto detect language", "german" or "english", the output is all exclamation marks.

The audio or video doesn't matter in my case either; different files and formats all resulted in the same problem.
I even switched from the AMD Pro drivers to the Gaming drivers, and nothing changed.
I am sure you will be able to transcribe anything fine, just as I can on the CPU model (except that it's really slow).
Is there anything else I can provide to help?

@thewh1teagle
Owner

Maybe related to ggerganov/whisper.cpp#2400

@dusanpol

I have the same issue when transcribing audio clips longer than ~8 seconds. Vulkan build, 7900XTX, Windows 10.
