Transcript Speaker Detection isn't perfect #1193
Comments
still an issue: https://twitter.com/KrisTemmerman/status/1716507884656427469 |
If we have a way to populate the transcript data locally, I could track down these issues. |
Some more details: #1562 (comment) @themisterholliday I think I can get you a DB dump if you are still interested? |
Yep I'll take a look if you can grab that 👍 |
Emailed ya. Some details: Here is where we actually append the speaker names: website/src/server/transcripts/utils.ts Line 112 in 14ecf7d
And here we filter the flaggings out (less of an issue)
|
Got it 👍 |
So, I'll break this into three issues:

The flags for speaker detection are sticking around in the transcript view
This can be seen here: https://syntax.fm/show/683/spooky-coding-horror-stories-2023-part-1/transcript
To fix this:
I see the first two as still a little "hacky," but getting this right for all occasions seems complicated.

Wes or Scott is missing in the entire transcript
This issue is because speakers are mislabeled (probably while saving the transcript) with "99" as their speaker id.
If we don't filter, the speakers still have names, so they show up just fine in the recent shows. But I'm assuming this was causing an issue on some other shows, so if we have those, I can double-check the filter.
Examples:

Scott is mislabeled as Announcer
Can you provide the show number we were seeing this? I can't find one, but I'm checking a limited subset. |
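For the speaker id "99" case, one option might look roughly like the sketch below. This is not the code in the repo: the `LabeledUtterance` shape, the `PLACEHOLDER_SPEAKER_ID` constant, and the `filterUnlabeled` helper are all hypothetical names, and it assumes a resolved name is available on each utterance at filter time. The idea is to drop a "99" utterance only when it also has no name, so correctly named speakers survive the filter.

```typescript
// Hypothetical utterance shape; the real schema in the repo may differ.
interface LabeledUtterance {
  speakerId: number;
  speakerName?: string; // resolved speaker name, if one was found
  text: string;
}

// Assumption: 99 is the placeholder speaker id discussed in this thread.
const PLACEHOLDER_SPEAKER_ID = 99;

// Keep every utterance unless it has the placeholder id AND no usable name.
function filterUnlabeled(utterances: LabeledUtterance[]): LabeledUtterance[] {
  return utterances.filter(
    (u) =>
      u.speakerId !== PLACEHOLDER_SPEAKER_ID ||
      Boolean(u.speakerName?.trim())
  );
}
```

Since it is still unclear why 99 matters elsewhere, this only changes which rows are hidden, not what is stored.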
Sweet, thanks. The speaker ID of 99 is important, though I forget why. I'll check tomorrow. I think all of these issues are due to the regex either being too relaxed, or not relaxed enough. I'd have to check, but I don't think I'm saving the speaker's name in the DB, just the speaker's number. The problem with our transcript provider is they don't tell you who is 1 or 2, so we have to do that ourselves. |
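To make the "too relaxed vs. not relaxed enough" tradeoff concrete, here is a minimal sketch of name detection from speaker numbers. It is an illustration only, not the regex in `utils.ts`: the `Utterance` shape, `INTRO_PATTERN`, and `detectSpeakerNames` are hypothetical, and it assumes speakers introduce themselves with a phrase like "my name is Wes Bos".

```typescript
// Hypothetical shape: the provider gives a speaker number but no name.
interface Utterance {
  speakerId: number;
  text: string;
}

// Assumption: intros look like "my name is Wes Bos".
// Requiring capitalized words makes the pattern stricter (fewer false hits),
// at the cost of missing names the transcript left lowercase.
const INTRO_PATTERN = /\b[Mm]y name is ([A-Z][a-z]+(?: [A-Z][a-z]+)?)/;

// Map each speaker number to the first name that speaker introduces.
function detectSpeakerNames(utterances: Utterance[]): Map<number, string> {
  const names = new Map<number, string>();
  for (const u of utterances) {
    if (names.has(u.speakerId)) continue; // first intro wins
    const name = INTRO_PATTERN.exec(u.text)?.[1];
    if (name) names.set(u.speakerId, name);
  }
  return names;
}
```

Loosening the capture group (for example, allowing lowercase words) would catch more intros but also start matching ordinary phrases, which is the kind of failure described above.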
If I'm following correctly the speaker name is correctly found here (and above): website/src/server/transcripts/utils.ts Line 49 in 14ecf7d
Which accounts for any speaker id in conjunction with
Ah yea I remember y'all saying the transcript provider doesn't give the speaker, which is why this code is required. |
The incorrect marking is fixed. I'd like to figure out a way to map the speaker numbers to show guests now. |
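One possible starting point for the guest mapping, as a rough sketch under stated assumptions: it presumes we already have a map of speaker number to a detected spoken name, plus a guest list from the show's metadata. The `Guest` shape and `matchGuests` helper are hypothetical, and matching on first name only is a simplification that would break if two guests share a first name.

```typescript
// Hypothetical guest record; the real show metadata may look different.
interface Guest {
  name: string;
}

// Lowercased first word of a name, or "" if the name is empty.
function firstName(name: string): string {
  return (name.trim().split(/\s+/)[0] || "").toLowerCase();
}

// Match each detected speaker to a guest whose first name is the same.
// Speakers with no matching guest are simply left out of the result.
function matchGuests(
  detected: Map<number, string>,
  guests: Guest[]
): Map<number, Guest> {
  const result = new Map<number, Guest>();
  detected.forEach((spokenName, speakerId) => {
    const first = firstName(spokenName);
    if (!first) return;
    const guest = guests.find((g) => firstName(g.name) === first);
    if (guest) result.set(speakerId, guest);
  });
  return result;
}
```

Anything left unmatched could fall back to the plain speaker number until a better signal is available.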