Show HN: AlgoMommy — Organize video clips by talking while recording (macOS)

How it works

While recording, you speak natural-language instructions anywhere in the clip, starting with the wake phrase “Hey Cleo”.
AlgoMommy extracts the audio and transcribes it locally.
It sends the service only short text snippets (typically up to ~30 seconds of speech) from segments that appear to address Cleo, plus a list of destination folder paths under the root you choose.
The service decides where the clip should be copied and which tags/metadata to apply.
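
The steps above can be sketched roughly as follows. This is only an illustration, not AlgoMommy's actual code: the `Segment` type, the function names, and the JSON field names (`snippets`, `destination_folders`) are all assumptions made for the example. It shows one plausible way to keep only the speech that follows the wake phrase, cap each snippet at about 30 seconds, and build a text-only request.

```python
import json
import re
from dataclasses import dataclass

WAKE_PHRASE = "hey cleo"      # wake phrase named in the post
MAX_SNIPPET_SECONDS = 30.0    # "typically up to ~30 seconds of speech"


@dataclass
class Segment:
    """One locally transcribed chunk of speech (hypothetical shape)."""
    start: float  # seconds from the start of the clip
    end: float
    text: str


def _normalize(text):
    # Lowercase and drop punctuation so "Hey, Cleo" still matches.
    return re.sub(r"[^a-z ]+", "", text.lower())


def snippets_addressing_cleo(segments, max_seconds=MAX_SNIPPET_SECONDS):
    """Return text snippets that start at a wake phrase, capped in duration."""
    snippets = []
    i = 0
    while i < len(segments):
        if WAKE_PHRASE not in _normalize(segments[i].text):
            i += 1
            continue
        window_start = segments[i].start
        parts = [segments[i].text]  # always keep the segment with the wake phrase
        i += 1
        # Keep following segments while they end inside the time window.
        while i < len(segments) and segments[i].end - window_start <= max_seconds:
            parts.append(segments[i].text)
            i += 1
        snippets.append(" ".join(parts))
    return snippets


def build_request(snippets, destination_folders):
    """Serialize the only data that leaves the machine: text plus folder paths."""
    return json.dumps({
        "snippets": snippets,
        "destination_folders": destination_folders,
    })
```

Under these assumptions, speech that is not near a wake phrase never appears in the request, and no audio or video is included in the payload.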

Why “Cleo”

WhisperKit is accurate but too slow to transcribe an entire long clip end-to-end.
Apple SpeechAnalyzer is much faster but less accurate.
AlgoMommy uses SpeechAnalyzer to quickly locate likely “Cleo-addressed” segments, then re-transcribes just those segments with WhisperKit for accuracy.
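
The two-pass idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the app's implementation: `fast_locate` and `accurate_transcribe` are hypothetical callables standing in for the fast and accurate engines (no real SpeechAnalyzer or WhisperKit API is used here), and the padding and merge-gap values are made up for the example.

```python
def merge_spans(spans, gap=1.0):
    """Merge time spans that overlap or sit within `gap` seconds of each other."""
    merged = []
    for start, end in sorted(spans):
        if merged and start - merged[-1][1] <= gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def two_pass_transcribe(audio, fast_locate, accurate_transcribe,
                        wake_word="cleo", padding=2.0):
    """Locate candidate spans with a fast pass, then re-transcribe only those.

    fast_locate(audio) -> list of (start, end, rough_text)      # hypothetical
    accurate_transcribe(audio, start, end) -> str               # hypothetical
    """
    # Pass 1: cheap, rough transcript; keep spans that mention the wake word,
    # padded a little so the accurate pass sees some surrounding context.
    candidates = [
        (max(0.0, start - padding), end + padding)
        for start, end, rough_text in fast_locate(audio)
        if wake_word in rough_text.lower()
    ]
    # Pass 2: expensive, accurate transcription on the merged spans only.
    return [
        (start, end, accurate_transcribe(audio, start, end))
        for start, end in merge_spans(candidates)
    ]
```

The design point is that the slow engine's cost scales with the amount of Cleo-addressed speech rather than with the length of the clip. A real fast pass may mishear the wake word, so a production version would likely need a fuzzier match than the plain substring test used here.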