Show HN: FreeFlow – Open-Source Wispr Flow
Hi HN!
Voice is fast becoming my primary interface to computers and AI. I built FreeFlow because I wanted a Wispr Flow-like experience for our entire team, but customizable and private.
Press a hotkey, dictate naturally, polished text appears in any app. Ramble, use filler words, correct yourself mid-sentence. FreeFlow turns messy speech into clean writing and injects it wherever your cursor is: your messaging app, your editor, your coding agent, the terminal, email, anything.
Demo (sound on): https://github.com/build-trust/freeflow#demo-sound-on-
It's really fast. The injection feels instantaneous. In my benchmarks, two-thirds of dictations finish in under 0.6 seconds. To get that speed, the app streams audio to your private server over a persistent WebSocket while you speak, and a realtime speech-to-text model transcribes incrementally, so by the time you release the key the transcript is mostly done. Two independent WebSocket connections race each other, and if both fail, an HTTP batch fallback catches the dictation. The transcript then goes through a post-processing step that removes filler words and fixes grammar. About 40% of dictations are clean enough to skip this step entirely. When post-processing is needed, a fast model handles it in about 0.4 seconds.
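The race-plus-fallback strategy described above can be sketched in a few lines. This is a hypothetical illustration, not FreeFlow's actual code: the function and leg names are made up, and the "connections" are simulated coroutines rather than real WebSockets.

```python
import asyncio

async def transcribe_with_failover(stream_a, stream_b, http_fallback):
    """Race two streaming transcription attempts; if both fail,
    fall back to a slower HTTP batch request. (Illustrative sketch.)"""
    tasks = [asyncio.create_task(stream_a()), asyncio.create_task(stream_b())]
    try:
        # Take whichever leg finishes first; the race hides the latency
        # of the slower or flakier connection.
        for task in asyncio.as_completed(tasks):
            try:
                return await task
            except ConnectionError:
                continue  # this leg failed; wait for the other one
        # Both streaming legs failed: use the batch path.
        return await http_fallback()
    finally:
        for t in tasks:
            t.cancel()  # no-op for tasks that already finished

# Simulated legs standing in for real WebSocket streams:
async def fast_leg():
    await asyncio.sleep(0.05)
    return "hello world"

async def slow_leg():
    await asyncio.sleep(0.5)
    return "hello world"

async def batch():
    return "hello world (batch)"

result = asyncio.run(transcribe_with_failover(fast_leg, slow_leg, batch))
print(result)  # the fast leg wins the race
```

The key property is that the fallback only adds latency when both streams fail, so the common case stays fast.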
It's designed to be taken apart and reassembled. You can swap the speech model, rewrite the prompts, add new languages, or fork the entire experience to fit how your team works. I'm hoping people will morph it into other products.
The FreeFlow service is open source. You can self-host it, but running a low-latency streaming dictation service for a team is real infrastructure work: persistent WebSocket connections, streaming routes to speech models, failover, rate limits. At a company with fifty or five hundred people, keeping that reliable is a job in itself. FreeFlow uses Autonomy to make this easy. On first launch, the macOS app deploys the service to a private server. Two minutes, no infrastructure knowledge needed. You can then invite your team. One server handles everyone, no per-seat fees. It sustains thousands of simultaneous streaming connections. In a stress test, 50 people dictating at the same time got sub-second latency with zero failures.
brew install build-trust/freeflow/freeflow
It's macOS only for now, but I plan to build for other operating systems. The two most useful contributions right now are mic compatibility data (every mic behaves differently) and prompts that improve polish quality for a specific language.

Try it, tell me how it works with your mic and your apps. What's fast, what's slow, what's broken.
GitHub: https://github.com/build-trust/freeflow

How come you don't show the realtime transcription... in realtime? I think it would make it feel even faster.

> the UX difference between streaming and offline STT is night and day. Words appearing while you're still talking completely changes the feedback loop. You catch errors in real time, you can adjust what you're saying mid-sentence, and the whole thing feels more natural. Going back to "record then wait" feels broken after that.

I think showing realtime transcription would make the UX worse, not better. In FreeFlow the output of the transcription is fed to an LLM that polishes it in the context of where the text is being injected, which lets us go beyond naive transcription. FreeFlow already feels extremely fast, and text being typed as I dictate is distracting, especially if the polishing phase then edits it. I would delay polishing until right before delivery.

Eventually, I will add a polishing step to my own https://rift-transcription.vercel.app. Right now, you can experience what true realtime streaming transcription feels like. I plan to add two "levels" of polishing:

- Simple deterministic text replacements will be applied to both interim and final text.
- LLM polishing will only be applied right before delivery.
- It will be possible to undo one or both polishing steps (actually, even more fine-grained undo: at the individual replacement-rule level).

That said, FreeFlow is open source for exactly this reason: everyone will have their own preference. If you would like to turn this behavior into a configurable preference, we'd happily accept a pull request.
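The two-level polishing idea in the thread above (deterministic replacements on interim text, with per-rule undo) could be sketched like this. All names are hypothetical and the rules are toy examples; a real implementation would re-apply later rules after an undo rather than just restoring a snapshot.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    """One deterministic replacement rule (level-1 polishing)."""
    name: str
    pattern: str      # regex to find
    replacement: str  # text to substitute

# Toy rules for illustration only:
RULES = [
    Rule("strip-fillers", r"\b(um|uh)\s+", ""),
    Rule("capitalize-i", r"\bi\b", "I"),
]

def apply_rules(text, rules):
    """Apply each rule in order. Return the polished text plus a log of
    (rule name, text-before-rule) snapshots, enabling per-rule undo."""
    log = []
    for rule in rules:
        new_text = re.sub(rule.pattern, rule.replacement, text)
        if new_text != text:
            log.append((rule.name, text))  # snapshot before this rule ran
            text = new_text
    return text, log

def undo_rule(log, rule_name):
    """Roll back to the text as it was just before rule_name ran.
    Returns None if the rule never fired."""
    for name, before in log:
        if name == rule_name:
            return before
    return None

polished, log = apply_rules("um i think this works", RULES)
print(polished)  # "I think this works"
```

Because the rules are deterministic and cheap, they can run on every interim update, while the expensive LLM pass waits until delivery.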