Show HN: Local audio transcription and speaker ID for Apple Silicon

github.com

2 points by vadiml 5 months ago · 1 comment · 1 min read

  Built a tool combining MLX Whisper + pyannote for fast local audio transcription with speaker diarization on Apple Silicon.
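
  To illustrate how transcription and diarization results can be combined, here is a minimal sketch that labels each transcript segment with the speaker whose turn overlaps it the most. It is not taken from the project: the dictionary layout (`start`, `end`, `text`, `speaker`) and the function names are assumptions made for the example.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each transcript segment with the best-overlapping speaker turn.

    segments: list of {"start", "end", "text"} dicts (hypothetical layout)
    turns:    list of {"start", "end", "speaker"} dicts (hypothetical layout)
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn in turns:
            ov = overlap(seg["start"], seg["end"], turn["start"], turn["end"])
            if ov > best_overlap:
                best_speaker, best_overlap = turn["speaker"], ov
        # Copy the segment so the caller's input is left untouched.
        labeled.append({**seg, "speaker": best_speaker})
    return labeled


segments = [
    {"start": 0.0, "end": 2.0, "text": "Hi"},
    {"start": 2.0, "end": 5.0, "text": "Hello"},
]
turns = [
    {"start": 0.0, "end": 2.5, "speaker": "SPEAKER_00"},
    {"start": 2.5, "end": 6.0, "speaker": "SPEAKER_01"},
]
labeled = assign_speakers(segments, turns)
```

  Segments that overlap no turn at all keep the placeholder label "UNKNOWN" rather than being assigned a guess.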

  Key benefits: privacy-first (fully local), hardware-accelerated, automatic speaker identification, multiple output formats (TXT/SRT/JSON).
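
  As a rough idea of what the SRT output involves, the sketch below renders speaker-labeled segments as SRT cues. The segment layout and the `[SPEAKER]` prefix are assumptions for illustration; the tool's real output may be formatted differently.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render segments (hypothetical dicts with start/end/speaker/text) as SRT."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"[{seg['speaker']}] {seg['text'].strip()}\n"
        )
    # Cues are separated by a blank line, as the SRT format expects.
    return "\n".join(cues)


example = to_srt(
    [{"start": 0.0, "end": 1.5, "speaker": "SPEAKER_00", "text": " Hello "}]
)
```

  The same list of segments could be written as plain text or dumped with `json.dumps` for the other two formats.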

  The main technical challenge was getting MLX Whisper and pyannote to work together despite their different audio processing; I solved it with a preprocessing pipeline.
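
  The post does not describe the preprocessing steps, so the following is only one common approach, not the project's pipeline: convert the input once to 16 kHz mono 16-bit PCM WAV with ffmpeg and feed the same file to both models. The function names are hypothetical, and the sketch assumes `ffmpeg` is installed and on the PATH.

```python
import subprocess
from pathlib import Path


def ffmpeg_normalize_cmd(src, dst, sample_rate=16000):
    """Build an ffmpeg command that writes mono 16-bit PCM at sample_rate Hz."""
    return [
        "ffmpeg",
        "-y",                    # overwrite dst if it already exists
        "-i", str(src),          # input file in any format ffmpeg can read
        "-ac", "1",              # downmix to a single channel
        "-ar", str(sample_rate), # resample
        "-c:a", "pcm_s16le",     # 16-bit little-endian PCM
        str(dst),
    ]


def normalize_audio(src, dst, sample_rate=16000):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(
        ffmpeg_normalize_cmd(src, dst, sample_rate),
        check=True,
        capture_output=True,
    )
    return Path(dst)


cmd = ffmpeg_normalize_cmd("interview.m4a", "interview_16k.wav")
```

  Building the command separately from running it keeps the conversion settings easy to inspect and test without invoking ffmpeg.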

  Perfect for interviews, meetings, and podcasts. Handles Hugging Face gated models with proper error handling.
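
  For gated models, one plausible shape for the token handling is sketched below: prefer an explicitly passed token, fall back to the `HF_TOKEN` environment variable, and fail early with an actionable message. The `--hf-token` flag name, the error class, and the message text are assumptions for illustration, not the tool's actual interface.

```python
import argparse
import os


class MissingTokenError(RuntimeError):
    """Raised when no Hugging Face token can be found."""


def resolve_hf_token(cli_token=None, env=None):
    """Return the token from the CLI argument, else from HF_TOKEN."""
    env = os.environ if env is None else env
    token = cli_token or env.get("HF_TOKEN")
    if not token:
        raise MissingTokenError(
            "No Hugging Face token found. Accept the model's terms on its "
            "model page, then set HF_TOKEN or pass --hf-token."
        )
    return token


def build_parser():
    parser = argparse.ArgumentParser(description="Transcribe with speaker labels")
    parser.add_argument("audio", help="path to the input audio file")
    # Hypothetical flag; optional so the environment variable alone suffices.
    parser.add_argument("--hf-token", default=None, help="Hugging Face access token")
    return parser


args = build_parser().parse_args(["meeting.wav", "--hf-token", "example-token"])
token = resolve_hf_token(args.hf_token, env={})
```

  With this arrangement either source is sufficient on its own, and the command-line value wins when both are present.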

torstenvl 4 months ago

Surprised this didn't get more traction, as it's really interesting.

Is there a reason it's ASi-only? I don't know the technical details of MLX, whether it runs or can be run on other hardware, etc.

Also, why does the HF token need to be in an environment variable and passed on the command line?
