Ask HN: What do you use for speaker diarization?

2 points by mjbale116 7 months ago · 1 comment · 1 min read

Hi,

I am looking for a fire-and-forget solution akin to whisper, where I can give it a wav of around 12 people and it gives me a diarization in the format (speaker_1, speaker_2, etc.)

whispercpp gives labels like speaker_turn, which is not what I am looking for; I need to know who said what

NVIDIA NeMo only works with 4 speakers, which unfortunately is not good enough for me

Do you have an open source solution that you can suggest? Or a potential pipeline?

Much appreciated!

AlexeyBrin 7 months ago

WhisperX with pyannote, but it is not perfect: sometimes you will get multiple labels for the same speaker.

There is no open source fire and forget solution as far as I know.
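
For the "who said what" part of a pipeline like this, the usual idea is to take the timestamped transcript segments from the ASR step and the speaker turns from the diarization step, then give each segment the speaker whose turns overlap it the most. Below is a minimal sketch of that merge step in plain Python. It is not WhisperX's or pyannote's actual code; `Turn`, `Segment`, `assign_speakers`, and `normalize_labels` are hypothetical names, and the sample data is made up for illustration. It also renames the diarizer's labels to speaker_1, speaker_2, etc. in order of first appearance.

```python
from typing import NamedTuple


class Turn(NamedTuple):
    """One diarization turn (hypothetical structure): who spoke from start to end."""
    start: float
    end: float
    speaker: str


class Segment(NamedTuple):
    """One transcript segment (hypothetical structure) with timestamps in seconds."""
    start: float
    end: float
    text: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each segment with the speaker that overlaps it the longest."""
    labeled = []
    for seg in segments:
        totals = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        # Segments that no turn covers are marked "unknown" rather than guessed.
        speaker = max(totals, key=totals.get) if totals else "unknown"
        labeled.append((speaker, seg.text))
    return labeled


def normalize_labels(labeled):
    """Rename raw labels to speaker_1, speaker_2, ... in order of first appearance."""
    mapping = {}
    result = []
    for speaker, text in labeled:
        if speaker != "unknown" and speaker not in mapping:
            mapping[speaker] = f"speaker_{len(mapping) + 1}"
        result.append((mapping.get(speaker, speaker), text))
    return result


# Made-up sample data standing in for real diarization and ASR output.
turns = [
    Turn(0.0, 4.0, "SPEAKER_00"),
    Turn(4.0, 9.0, "SPEAKER_01"),
    Turn(9.0, 12.0, "SPEAKER_00"),
]
segments = [
    Segment(0.5, 3.5, "Hello everyone."),
    Segment(3.8, 8.0, "Thanks, glad to be here."),
    Segment(9.2, 11.0, "Let's get started."),
    Segment(20.0, 21.0, "(inaudible)"),
]

result = normalize_labels(assign_speakers(segments, turns))
for speaker, text in result:
    print(f"{speaker}: {text}")
```

This does nothing about the duplicate-label problem mentioned above: if the diarizer splits one person into two labels, both labels pass straight through to the output.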
