HushScribe

Local-only meeting transcription for macOS.
No cloud services or API keys. Your data never leaves your computer.

Overview

HushScribe is a macOS menu bar app that captures meetings and voice memos, transcribes them on-device, and writes structured .md files to a folder of your choice (e.g. your Obsidian vault).

Every step runs locally. Transcription uses on-device models (Parakeet-TDT v3, WhisperKit, or Apple Speech). AI summaries are generated on-device via Qwen3, Gemma 3, Gemma 4, or Apple's NaturalLanguage framework.

No audio, no transcripts, and no data of any kind is ever sent to the internet.

Installing

Via Homebrew (recommended):

brew tap drcursor/hushscribe https://github.com/drcursor/HushScribe
brew install --cask hushscribe

Manual: Download the DMG from the latest release and drag HushScribe to /Applications.

Why HushScribe?

Entirely local. Transcription and AI summary both run on-device — Parakeet, WhisperKit, Apple Speech, Qwen3, Gemma 3, Gemma 4, and Apple's NaturalLanguage framework. Nothing ever leaves your computer.
Your data, your files. Output is plain .md with YAML frontmatter, timestamps, and speaker labels. No proprietary export, no lock-in, no copy-paste.
No accounts, no subscriptions, no API keys, no additional background services. Download and run.

speak → capture → md transcription → optional summary → knowledge base

Features

Multilingual transcription via Parakeet-TDT v3 (FluidAudio) — 25 European languages, auto-detected, runs on Apple Silicon ANE.
Multiple transcription models. Choose between Parakeet-TDT v3 (default, fastest), WhisperKit Base, WhisperKit Large v3, or Apple Speech (built-in, no download required). All run entirely on-device.
Auto-record meetings. Enable from the menu bar — recording starts automatically when a meeting app (Zoom, Teams, Slack, FaceTime, Webex, Discord, Google Meet, Loom) is running and the microphone is actively in use. Stops automatically when the call ends; configurable stop delay in Settings. A white dot appears on the menu bar icon when the feature is active. Note: browser-based meetings (e.g. Google Meet in a browser) are not detected.
Call Capture grabs mic + system audio. Detects which conferencing app you're in (Teams, Zoom, Slack, etc.) and filters audio to just that app.
Voice Memo is mic-only. Saves to a separate folder so it doesn't clutter your meeting transcripts.
On-device AI summary. Open any transcript in the Transcript Viewer and click "Generate Summary" to get Highlights and To-Dos. Choose from Qwen3 0.6B, Gemma 3 1B, Gemma 4 E4B (all downloadable), or the built-in Apple NaturalLanguage model. All run entirely on-device, with no API key and no network access. Supports custom system prompts (shown as a drop-down menu on the Generate button when configured). Summaries auto-load when reopening a transcript.
Transcript Viewer. Collapsible Browse sidebar lists all saved transcripts grouped by date, with a resizable divider. Transcripts with a saved summary show a sparkle badge. Export to Markdown, SRT, or JSON.
Transcribe File. Load any audio or video file (M4A, MP4, MOV, MP3, WAV, …) for offline transcription. The file runs through the same VAD → ASR → diarization pipeline as a live session; a speaker-naming prompt appears at the end.
Speaker diarization runs after the call ends. Splits remote audio into labelled speakers; post-session prompt lets you assign real names.
Split VU meters. Separate level meters for microphone and system audio, each with an independent mute toggle.
Obsidian vault compatible. Writes .md with frontmatter: type, created, attendees, tags, source_app.
Silence auto-stop. Configurable timeout (default 2 min); countdown shown during recording.
Privacy mode. Hidden from screen sharing by default. No audio saved to disk — transcripts only.
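
The auto-record rule described above (start when a meeting app is running and the microphone is in use, stop once the call has ended and the stop delay has elapsed) can be pictured as a small state machine. The following is only an illustrative sketch in Python, not HushScribe's implementation: the class name `AutoRecorder`, the `update` method, and the 5-second default delay are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Set

# App names copied from the feature list above. How the real app identifies
# them (bundle IDs, process names) is not documented here; plain names are
# an assumption. Browser-based meetings would not appear in this set.
MEETING_APPS = {"Zoom", "Teams", "Slack", "FaceTime", "Webex",
                "Discord", "Google Meet", "Loom"}


@dataclass
class AutoRecorder:
    """Hypothetical model of the auto-record rule (not HushScribe's code)."""
    stop_delay: float = 5.0              # seconds; assumed default value
    recording: bool = False
    idle_since: Optional[float] = None   # when the call was first seen as ended

    def update(self, now: float, running_apps: Set[str],
               mic_active: bool) -> Optional[str]:
        """Return "start", "stop", or None for one polling tick."""
        in_call = mic_active and bool(MEETING_APPS & running_apps)
        if in_call:
            self.idle_since = None       # call is live, cancel any pending stop
            if not self.recording:
                self.recording = True
                return "start"
            return None
        if self.recording:
            if self.idle_since is None:
                self.idle_since = now    # call just ended, start the delay
            if now - self.idle_since >= self.stop_delay:
                self.recording = False
                self.idle_since = None
                return "stop"
        return None
```

Calling `update` once per polling tick with the current time, the set of running apps, and the microphone state yields "start" on the first tick where both conditions hold, and "stop" only after the conditions have been false for the full delay. A brief drop in microphone activity that recovers within the delay does not end the recording.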

Privacy

All transcription models run entirely on-device. No audio is ever sent anywhere.
AI summaries are generated on-device (Qwen3, Gemma 3, Gemma 4, or Apple NL). No external API, no network.
No network calls. No analytics. No telemetry.
No audio is saved to disk. Only text transcripts.
The app window is hidden from screen sharing by default.
Transcripts are saved as plain .md files to a folder you choose.

Models

Transcription

Model	Engine	Size
Parakeet-TDT v3 (default)	FluidAudio	~600 MB
Whisper Base	WhisperKit	~150 MB
Whisper Large v3	WhisperKit	~1.5 GB
Apple Speech	macOS built-in	Built-in

AI Summary

Model	Provider	Size
Apple NL (default)	macOS built-in	Built-in
Qwen3 0.6B	Alibaba Cloud	~500 MB
Gemma 3 1B	Google	~600 MB
Gemma 4 E4B	Google	~800 MB

All models run entirely on-device via Apple Silicon. No API key, no network.
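
To estimate how much you will download for a given combination, add the two sizes from the tables above. The short Python sketch below does that arithmetic; the sizes are the approximate figures listed above (1.5 GB taken as 1500 MB), built-in models are counted as 0, and the function name `download_mb` is made up for the example.

```python
# Approximate download sizes in MB, copied from the two tables above.
# Built-in models need no download, so they are counted as 0.
TRANSCRIPTION_MB = {
    "Parakeet-TDT v3": 600,
    "Whisper Base": 150,
    "Whisper Large v3": 1500,
    "Apple Speech": 0,
}
SUMMARY_MB = {
    "Apple NL": 0,
    "Qwen3 0.6B": 500,
    "Gemma 3 1B": 600,
    "Gemma 4 E4B": 800,
}


def download_mb(transcription: str, summary: str) -> int:
    """Rough total download (MB) for one transcription and one summary model."""
    return TRANSCRIPTION_MB[transcription] + SUMMARY_MB[summary]


default_total = download_mb("Parakeet-TDT v3", "Apple NL")
largest_total = download_mb("Whisper Large v3", "Gemma 4 E4B")
```

With the defaults (Parakeet-TDT v3 plus Apple NL) only the transcription model is downloaded; the largest combination listed is Whisper Large v3 plus Gemma 4 E4B.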

Comparison

Feature	Granola	OpenOats	HushScribe
Primary Goal	Automated, structured notes	Real-time assistance / Retrieval	Private, local transcription
Data Processing	Cloud-based (AI processing)	Hybrid	100% Local / Offline
Integrations	Slack, HubSpot, Notion, etc.	Local folders / Markdown	Local Markdown files
Platform	Windows, macOS, iOS	macOS	macOS

Known Limitations and Issues

Apple Silicon only. Parakeet and FluidAudio need Metal / ANE. No Intel.
macOS 26+ only.
Screen Recording re-prompts monthly. OS limitation.
Diarization is imperfect. Works well with headset mics. Laptop speakers with crosstalk will give worse speaker separation.
No live speaker labels. Diarization runs after the session ends.
Microphone input may stop working. If no audio is captured, switching to a specific input device in Settings → Recording (instead of "System Default") usually resolves it. Additionally, you can triple-click the app's logo to reset the recording process.
Local sound input sometimes fails. System audio capture may silently stop. Changing the input device and restarting the recording session fixes it.
Auto-record detects desktop applications only. Browser-based meetings (e.g. Google Meet in a browser) are not detected.

Output

---
type: meeting
created: "2026-03-23"
time: "10:00"
duration: "18:42"
source_app: "Zoom"
attendees: ["You", "Speaker 2"]
tags:
  - log/meeting
  - status/inbox
  - source/hushscribe
---


# Call Recording — 2026-03-23 10:00

## Transcript

**You** (10:00:03)
Morning. Quick sync on the product launch. Where are we at?

**John** (10:00:07)
We're in good shape. QA signed off yesterday, marketing assets
are locked, landing page is live in staging.

Voice memos use type: fleeting with a single speaker. Same structure, same frontmatter.
Open any transcript in the Transcript Viewer to generate an AI summary (Highlights + To-Dos) on-device.
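
Because the output is plain text, it is easy to post-process with your own scripts. The Python sketch below reads the frontmatter and speaker turns from a transcript shaped like the sample above. It is a minimal example under stated assumptions, not a tool shipped with HushScribe: the function names are made up, and the hand-rolled frontmatter reader handles only the constructs visible in the sample (scalars, quoted strings, inline lists, and "- " block lists), so a real YAML library would be the safer choice for anything more complex.

```python
import json
import re
from typing import Dict, List, Tuple

# Matches a turn header such as: **You** (10:00:03)
TURN_RE = re.compile(r"^\*\*(?P<speaker>.+?)\*\* \((?P<time>\d{2}:\d{2}:\d{2})\)$")


def split_frontmatter(text: str) -> Tuple[Dict[str, object], str]:
    """Split a transcript into (frontmatter dict, body text)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    end = lines.index("---", 1)          # closing delimiter
    meta: Dict[str, object] = {}
    key = None
    for line in lines[1:end]:
        if not line.strip():
            continue
        if line.startswith("  - ") and isinstance(meta.get(key), list):
            meta[key].append(line[4:].strip())   # item of a block list
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not value:
            meta[key] = []                       # block list follows
        elif value[0] in '["':
            meta[key] = json.loads(value)        # quoted string or inline list
        else:
            meta[key] = value
    return meta, "\n".join(lines[end + 1:])


def parse_turns(body: str) -> List[Dict[str, str]]:
    """Collect speaker turns; wrapped lines are joined with a space."""
    turns: List[Dict[str, str]] = []
    for line in body.splitlines():
        stripped = line.strip()
        m = TURN_RE.match(stripped)
        if m:
            turns.append({"speaker": m["speaker"], "time": m["time"], "text": ""})
        elif turns and stripped and not stripped.startswith("#"):
            sep = " " if turns[-1]["text"] else ""
            turns[-1]["text"] += sep + stripped
    return turns


SAMPLE = """\
---
type: meeting
created: "2026-03-23"
source_app: "Zoom"
attendees: ["You", "Speaker 2"]
tags:
  - log/meeting
  - status/inbox
---

## Transcript

**You** (10:00:03)
Morning. Quick sync on the product launch. Where are we at?

**John** (10:00:07)
We're in good shape. QA signed off yesterday, marketing assets
are locked, landing page is live in staging.
"""

meta, body = split_frontmatter(SAMPLE)
turns = parse_turns(body)
```

Here `meta["type"]` distinguishes meetings from voice memos (`fleeting`), and `turns` is a list of dictionaries with `speaker`, `time`, and `text` keys that can be filtered or re-exported as needed.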

Build

See ARCHITECTURE.md for the full build instructions and project structure.

Permissions

Permission	When	Why
Microphone	All modes	Captures your voice
Screen Recording	Call Capture only	ScreenCaptureKit needs this for system audio from conferencing apps
Speech Recognition	Apple Speech model only	Required by SFSpeechRecognizer; one-time prompt

macOS re-prompts for Screen Recording permission roughly monthly. That's an OS thing, not HushScribe.

Architecture

See ARCHITECTURE.md for the full architecture overview and source tree.

Credits

HushScribe is a fork of Tome by Gremble-io (which itself started from OpenGranola), substantially extended with additional features. Code is generated with the help of Claude Code.

Models and libraries:

FluidAudio by FluidInference — Parakeet-TDT v3 ASR and Silero VAD, used for the default transcription model and voice activity detection across all backends.
WhisperKit by Argmax — on-device Whisper inference on Apple Silicon, used for the Whisper Base and Whisper Large v3 model options. Whisper was originally developed by OpenAI.
mlx-swift-lm and mlx-swift by Apple — MLX inference stack for Swift, used to run LLM summary models on Apple Silicon.
Qwen3 by Alibaba Cloud — default on-device LLM used for AI summaries.
Gemma 3 / Gemma 4 by Google — alternative on-device LLMs for AI summaries.
pyannote.audio — speaker diarization model used for post-session speaker separation.

Changelog

See CHANGELOG.md for the full release history.

License

MIT