Ask HN: What do you use for speaker diarization?

2 points by mjbale116 a month ago · 1 comment · 1 min read

Hi,

I am looking for a fire-and-forget solution akin to Whisper, where I can give it a WAV file with around 12 speakers and it gives me a diarization in the format (speaker_1, speaker_2, etc.).

whispercpp gives labels like speaker_turn, which is not what I am looking for. I need to know who said what.

NVIDIA NeMo only works with 4 speakers, which unfortunately is not good enough for me.

Do you have an open source solution that you can suggest? Or a potential pipeline?

Much appreciated!

AlexeyBrin a month ago

WhisperX with pyannote, but it is not perfect: sometimes you will get multiple labels for the same speaker.

There is no open source fire and forget solution as far as I know.
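
For the "who said what" part, a pipeline of this kind needs a merge step that attaches a speaker label to each piece of transcript. Below is a minimal sketch of that step in plain Python, assuming you already have diarization turns and transcript segments with timestamps in seconds. The `Turn` and `Segment` tuples, the function names, and the sample data are all hypothetical and not the output format or API of WhisperX, pyannote, or any other library. It picks the speaker with the largest time overlap and then renames labels to speaker_1, speaker_2, and so on.

```python
from collections import namedtuple

# Hypothetical containers, not any library's real output format.
Turn = namedtuple("Turn", "start end speaker")      # one diarization turn
Segment = namedtuple("Segment", "start end text")   # one transcript segment


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each transcript segment with the speaker whose turns overlap it most."""
    labelled = []
    for seg in segments:
        totals = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        # Segments that no turn covers are marked "unknown" rather than guessed.
        speaker = max(totals, key=totals.get) if totals else "unknown"
        labelled.append((speaker, seg.text))
    return labelled


def rename_speakers(labelled):
    """Map raw labels to speaker_1, speaker_2, ... in order of first appearance."""
    names = {}
    out = []
    for speaker, text in labelled:
        if speaker != "unknown" and speaker not in names:
            names[speaker] = f"speaker_{len(names) + 1}"
        out.append((names.get(speaker, speaker), text))
    return out


# Made-up example data for illustration only.
turns = [
    Turn(0.0, 4.0, "SPEAKER_07"),
    Turn(4.0, 9.0, "SPEAKER_02"),
    Turn(9.0, 12.0, "SPEAKER_07"),
]
segments = [
    Segment(0.2, 3.8, "Shall we start?"),
    Segment(3.9, 8.5, "Yes, go ahead."),
    Segment(9.1, 11.0, "Great."),
]

result = rename_speakers(assign_speakers(segments, turns))
for speaker, text in result:
    print(f"{speaker}: {text}")
```

This only handles the labelling. It does nothing about the same person being split across several diarization labels, which would need a separate step such as comparing speaker embeddings.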
