diarization multiprocessing on CPUs #1124
MrEdwards007 asked this question in Q&A
I am very interested in speaker diarization, but I do not have a supported GPU, so the pipeline runs on my CPU.
I would like to run diarization on chunks of a single input source in parallel and stitch the results back together.
Can this be done using the methodology below, or is there another approach that would work?
All testing was done on a MacBook running macOS Big Sur with a 2.3 GHz Intel Core i9, 16 cores, and 16 GB of RAM (GPU not supported).
My first target was to learn how long diarization would take under normal circumstances. I ran the process overnight.
My target (testing) file, DonQuixote_OneHour, is 3706.393 seconds long (01:01:46 H:M:S).
I guessed the process would take at most 1-1.5x the length of the audio on a CPU.
The elapsed completion time for DonQuixote_OneHour_audio.rttm was 7:41:12 (H:M:S), i.e. 27,672 seconds, which is more than 7x the length of the audio and closer to 8x (27,672 s / 3,706 s ≈ 7.5x real time).
Once I established how long a typical run takes, I tried to find out whether there is a signature/hash for each identified speaker, as opposed to only local labels like 'speaker 1, 2, 3, etc.'. If there is a signature, the process can be broken into chunks (I have already done this for transcription, including fixing the concatenated timelines) and run in parallel to bring the completion time down. The results of each parallel process could then be stitched back together using the speaker's signature/hash (assuming it would be the same across processes).
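Here is a rough sketch of what I have in mind for the signature idea, assuming pyannote.audio 2.x and its pyannote/embedding model; the cosine-distance threshold, the minimum turn length, and the helper names (speaker_signatures, relabel) are my own placeholders, not anything provided by the library.

```python
# Rough sketch: derive a per-speaker embedding ("signature") from one chunk's
# diarization and match chunk-local labels to global speakers by cosine
# distance.  Assumes pyannote.audio 2.x; threshold and helper names are mine.
import numpy as np
from scipy.spatial.distance import cosine

from pyannote.audio import Inference
from pyannote.core import Segment

# "whole" window -> one embedding vector per cropped excerpt
# (may require use_auth_token=... depending on model access).
inference = Inference("pyannote/embedding", window="whole")


def speaker_signatures(audio_path, diarization, min_duration=1.0):
    """Average one embedding per chunk-local speaker label."""
    per_speaker = {}
    for segment, _, label in diarization.itertracks(yield_label=True):
        if segment.duration < min_duration:   # skip very short turns
            continue
        emb = inference.crop(audio_path, Segment(segment.start, segment.end))
        per_speaker.setdefault(label, []).append(np.asarray(emb))
    return {label: np.mean(embs, axis=0) for label, embs in per_speaker.items()}


def relabel(chunk_signatures, global_signatures, threshold=0.5):
    """Map chunk-local labels onto global speakers; add new speakers as needed."""
    mapping = {}
    for local_label, signature in chunk_signatures.items():
        best, best_dist = None, threshold
        for global_label, global_signature in global_signatures.items():
            dist = cosine(signature, global_signature)
            if dist < best_dist:
                best, best_dist = global_label, dist
        if best is None:   # no close match: treat as a speaker not seen before
            best = f"SPEAKER_{len(global_signatures):02d}"
            global_signatures[best] = signature
        mapping[local_label] = best
    return mapping
```

The open question is whether embeddings computed in separate processes on separate chunks are stable enough for this matching to be reliable.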
Below is the approach I took to accelerate Whisper tasks (specifically transcription); a sketch of the same chunked, multi-process idea applied to diarization follows the timing results.
openai/whisper#432
Using 11 of 16 CPUs with the "tiny.en" model: a transcription speed (over real time) of 32.713x
Using 7 of 16 CPUs with the "base.en" model: a transcription speed (over real time) of 16.416x
Using 9 of 16 CPUs with the "small.en" model: a transcription speed (over real time) of 5.595x
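Applied to diarization, the same chunk-and-process-pool idea might look roughly like the sketch below; the 10-minute chunk length, the worker count, the ffmpeg splitting step, and the file names are assumptions on my part, and each worker loads its own copy of the pretrained pipeline, so memory may limit the usable number of workers well below 16.

```python
# Rough sketch: split the audio with ffmpeg, diarize each chunk in its own
# process, and shift timestamps back onto the original timeline.
# Assumptions: ffmpeg on PATH, pyannote.audio 2.x, illustrative chunk length,
# worker count, and file names.
import glob
import subprocess
from concurrent.futures import ProcessPoolExecutor

CHUNK_SECONDS = 600   # 10-minute chunks (assumption, not a tuned value)
N_WORKERS = 4         # each worker holds a full pipeline in RAM
AUDIO = "DonQuixote_OneHour_audio.wav"


def split_audio(path, chunk_seconds=CHUNK_SECONDS):
    """Cut the input into fixed-length chunks; return (chunk_path, offset_seconds) pairs."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", path, "-f", "segment",
         "-segment_time", str(chunk_seconds), "-c", "copy", "chunk_%03d.wav"],
        check=True,
    )
    chunks = sorted(glob.glob("chunk_*.wav"))
    return [(chunk, i * chunk_seconds) for i, chunk in enumerate(chunks)]


def diarize_chunk(args):
    """Diarize one chunk and shift its timestamps by the chunk's offset."""
    chunk_path, offset = args
    # Imported and loaded inside the worker; reloading per chunk is wasteful
    # but keeps the sketch simple (may require use_auth_token=...).
    from pyannote.audio import Pipeline
    pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization")
    diarization = pipeline(chunk_path)
    return [
        (segment.start + offset, segment.end + offset, label)
        for segment, _, label in diarization.itertracks(yield_label=True)
    ]


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=N_WORKERS) as pool:
        results = list(pool.map(diarize_chunk, split_audio(AUDIO)))
    # Labels are still chunk-local ("SPEAKER_00" in chunk 0 is not necessarily
    # "SPEAKER_00" in chunk 1); they would be reconciled with the embedding
    # matching sketched above before writing one combined RTTM.
    for chunk_turns in results:
        for start, end, label in chunk_turns:
            print(f"{start:8.2f} {end:8.2f} {label}")
```

The printed turns still carry chunk-local labels; reconciling them via the signature/hash (or embeddings) is exactly the piece I am asking about.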
Thank you for your time and assistance.