
TalkType — Speech-to-Text for Linux

Free, offline voice dictation for Linux — works on Wayland and X11 with any desktop environment. Powered by OpenAI's Whisper AI for accurate, private speech recognition. No cloud, no subscription, no data leaves your machine.

TL;DR: Download the AppImage, run it, press F8 to talk. Text appears where your cursor is. Works in any app — browsers, editors, terminals, chat apps, everywhere.

Why TalkType?

Most voice dictation tools on Linux are either cloud-based (privacy concerns), command-line only (not user-friendly), or broken on Wayland. TalkType is different:

100% offline — All processing happens locally using Whisper AI. Nothing is sent to the cloud.
Works on Wayland — Built from the ground up for modern Linux desktops (also supports X11).
Zero configuration — Download the AppImage, run it, start talking. First-run wizard handles the rest.
Any desktop environment — GNOME (with native shell extension), KDE, XFCE, Sway, Hyprland, and more.
GPU accelerated — Optional NVIDIA CUDA support for 3-5x faster transcription.

Screenshots

System tray menu • Recording indicator with timer

General settings with model selection • Advanced settings with GPU acceleration

Audio settings with microphone test • Custom voice commands

Built-in help with getting started guide • Complete voice commands reference

Features

Dual Hotkeys Always Active - F8 (hold-to-talk) and F9 (tap-to-toggle) work at the same time; both keys are customizable
AI-Powered Transcription - Uses OpenAI's Whisper models (tiny to large-v3)
GPU Acceleration - Optional NVIDIA CUDA support for 3-5x faster transcription
Smart Text Processing - Auto-punctuation, smart quotes, auto-spacing
Voice Commands - Say "comma", "period", "new paragraph", "undo last word", and more
Custom Commands - Define your own phrase shortcuts (e.g., "my email" → your@email.com)
Visual Feedback - On-screen recording indicator with timer
GNOME Integration - Native shell extension for GNOME desktop
Auto-Updates - Built-in update checker with one-click downloads
Wayland Native - Works seamlessly on modern Linux desktops
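
To illustrate the idea behind custom commands, here is a minimal Python sketch of phrase-shortcut expansion. It is not taken from TalkType's source: the function name `expand_shortcuts` and the dictionary layout are hypothetical, and the sketch only shows case-insensitive, whole-phrase replacement.

```python
import re


def expand_shortcuts(text: str, shortcuts: dict[str, str]) -> str:
    """Replace each spoken phrase with its expansion (hypothetical helper)."""
    # Longer phrases first, so "my work email" wins over "my email".
    for phrase in sorted(shortcuts, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        replacement = shortcuts[phrase]
        # A function replacement avoids backslash handling in the expansion.
        text = re.sub(pattern, lambda _m, r=replacement: r, text,
                      flags=re.IGNORECASE)
    return text


if __name__ == "__main__":
    print(expand_shortcuts("Send it to My Email today",
                           {"my email": "your@email.com"}))
```

Because the phrases are applied one after another, an expansion that itself contains another phrase would be expanded again; a real implementation may want to guard against that.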

Installation

Arch Linux (AUR)

yay -S talktype-appimage
# or
paru -S talktype-appimage

AppImage (All Distros)

Download the latest AppImage from Releases:

chmod +x TalkType-v*.AppImage
./TalkType-v*.AppImage

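The AppImage bundles the application itself. The system dependencies listed under System Requirements (ydotool and wl-clipboard) still have to be installed separately.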

Note: AppImages require FUSE 2 (libfuse.so.2). Install if needed:

Fedora/RHEL: sudo dnf install fuse

Ubuntu/Debian: sudo apt install libfuse2

Arch/Manjaro: sudo pacman -S fuse2

openSUSE: sudo zypper install libfuse2
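
If you are unsure whether FUSE 2 is already present, a quick check along these lines may help. This is a sketch and not part of TalkType: `has_fuse2` is a hypothetical helper that simply asks the dynamic loader for `libfuse.so.2` through Python's `ctypes`.

```python
import ctypes


def has_fuse2(soname: str = "libfuse.so.2") -> bool:
    """Return True if the dynamic loader can find and load the given library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the shared object cannot be located or loaded.
        return False
    return True


if __name__ == "__main__":
    if has_fuse2():
        print("FUSE 2 found")
    else:
        print("FUSE 2 missing: install it with your distro's package manager")
```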

System Requirements

Requirement	Details
OS	Linux with Wayland (X11 is also supported)
Dependencies	ydotool, wl-clipboard (must be installed on your system; ydotoold daemon is started automatically on first run)
Audio	Working microphone
GPU (optional)	NVIDIA GPU for CUDA acceleration
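
The dependency row above can be verified before the first launch. The following Python sketch is an assumption about how one might check, not TalkType's own startup logic: `missing_tools` is a hypothetical helper, and the executable names are the ones the ydotool and wl-clipboard packages normally install.

```python
import shutil

# ydotool provides ydotool and ydotoold; wl-clipboard provides wl-copy and wl-paste.
REQUIRED_TOOLS = ("ydotool", "ydotoold", "wl-copy", "wl-paste")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of executables that are not found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required tools found")
```

The `which` parameter exists only so the lookup can be swapped out in tests.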

Quick Start

Launch TalkType - Run the AppImage or use your app launcher
First-run setup - TalkType will guide you through initial configuration
Start dictating - Press F8 (hold to record) or F9 (tap to toggle) — both always active
Speak naturally - Text appears where your cursor is
Use voice commands - Say "comma", "period", "new line", etc.

Hotkeys (Both Always Active)

Hotkey	How it works
F8	Hold to record, release to transcribe (hold-to-talk)
F9	Press once to start, press again to stop (tap-to-toggle)
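
The difference between the two hotkeys can be pictured as a small state machine. The sketch below is illustrative only: the `HotkeyState` class and its method names are hypothetical, and TalkType's real key handling may differ (for example in how it treats one hotkey being pressed while the other is recording).

```python
class HotkeyState:
    """Tracks whether recording is active for a hold key and a toggle key."""

    def __init__(self, hold_key: str = "F8", toggle_key: str = "F9"):
        self.hold_key = hold_key
        self.toggle_key = toggle_key
        self.recording = False
        self._started_by = None  # key that started the current recording

    def press(self, key: str) -> bool:
        if key == self.hold_key:
            # Key-repeat events while holding are ignored once recording.
            if not self.recording:
                self.recording, self._started_by = True, key
        elif key == self.toggle_key:
            if not self.recording:
                self.recording, self._started_by = True, key
            elif self._started_by == key:
                self.recording, self._started_by = False, None
        return self.recording

    def release(self, key: str) -> bool:
        # Only releasing the hold key stops a recording, and only one it started.
        if key == self.hold_key and self._started_by == key:
            self.recording, self._started_by = False, None
        return self.recording


if __name__ == "__main__":
    state = HotkeyState()
    print(state.press("F8"), state.release("F8"))  # hold-to-talk
    print(state.press("F9"), state.release("F9"), state.press("F9"))  # toggle
```

In this sketch a recording can only be stopped by the key that started it, which is one simple way to keep the two modes from interfering with each other.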

Voice Commands

Punctuation

Say This	Result
"comma"	,
"period" / "full stop"	.
"question mark"	?
"exclamation point"	!
"colon"	:
"semicolon"	;
"open quote" / "close quote"	" " (smart quotes)
"dot dot dot" / "ellipsis"	...

Formatting

Say This	Result
"new line"	Line break
"new paragraph"	Double line break
"tab"	Tab character

Editing

Say This	Result
"undo last word"	Deletes last word
"undo last sentence"	Deletes to previous sentence
"undo everything"	Clears all dictated text

Literal Words

Say "literal" before a command word to type the word itself instead of triggering the command:

"literal comma" → types "comma" (not ,)
"literal period" → types "period" (not .)

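The interaction between voice commands and the "literal" prefix can be sketched in a few lines of Python. This is a simplified illustration, not TalkType's implementation: `apply_commands` and the `COMMANDS` table are hypothetical, only a subset of the commands is included, and the input is assumed to be plain words without punctuation.

```python
# Spoken phrase (as a tuple of words) -> text to type. Subset of the tables above.
COMMANDS = {
    ("comma",): ",",
    ("period",): ".",
    ("full", "stop"): ".",
    ("question", "mark"): "?",
    ("exclamation", "point"): "!",
    ("colon",): ":",
    ("semicolon",): ";",
    ("new", "line"): "\n",
}
# Try longer phrases first so multi-word commands are matched whole.
PHRASES = sorted(COMMANDS, key=len, reverse=True)


def apply_commands(text: str) -> str:
    words = text.split()
    pieces = []  # (text, is_symbol) pairs
    i = 0
    while i < len(words):
        literal = words[i].lower() == "literal"
        start = i + 1 if literal else i
        match = None
        for phrase in PHRASES:
            candidate = tuple(w.lower() for w in words[start:start + len(phrase)])
            if candidate == phrase:
                match = phrase
                break
        if match and literal:
            # "literal comma" -> type the word "comma" itself.
            pieces.append((" ".join(words[start:start + len(match)]), False))
            i = start + len(match)
        elif match:
            pieces.append((COMMANDS[match], True))
            i = start + len(match)
        else:
            pieces.append((words[i], False))
            i += 1
    result = ""
    for piece, is_symbol in pieces:
        # Symbols attach to the previous word; no space after a line break.
        if result and not is_symbol and not result.endswith("\n"):
            result += " "
        result += piece
    return result


if __name__ == "__main__":
    print(apply_commands("hello comma world period"))  # hello, world.
    print(apply_commands("type literal comma here"))   # type comma here
```
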
AI Models

Choose the right model for your needs in Preferences → General:

Model	Size	Speed	Accuracy	Best For
tiny	39 MB	Fastest	Basic	Quick notes
base	74 MB	Fast	Good	Casual use
small	244 MB	Balanced	Very Good	Recommended
medium	769 MB	Slower	Excellent	Professional
large-v3	~3 GB	Slowest	Best	Technical work

Tip: Start with "small" for everyday use. Enable GPU acceleration for larger models.

GPU Acceleration

TalkType supports NVIDIA CUDA for 3-5x faster transcription:

Automatic detection - TalkType detects your NVIDIA GPU on first run
One-click download - Download CUDA libraries (~800 MB) when prompted
Automatic activation - GPU mode enables after download

You can also enable GPU later: Preferences → Advanced → Download CUDA Libraries
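
To check by hand whether an NVIDIA GPU is visible to the driver, something like the following Python sketch can be used. How TalkType itself detects the GPU is not documented here, so treat this as an assumption: `nvidia_gpu_present` is a hypothetical helper that relies on the driver's `nvidia-smi -L` command, which lists detected GPUs.

```python
import shutil
import subprocess


def nvidia_gpu_present(which=shutil.which, run=subprocess.run) -> bool:
    """Return True if nvidia-smi is installed and lists at least one GPU."""
    if which("nvidia-smi") is None:
        return False
    try:
        result = run(["nvidia-smi", "-L"], capture_output=True,
                     text=True, timeout=10)
    except (OSError, subprocess.SubprocessError):
        # Covers a broken driver install as well as a timeout.
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("NVIDIA GPU detected" if nvidia_gpu_present() else "No NVIDIA GPU detected")
```

The `which` and `run` parameters are there only so the check can be exercised without real hardware.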

Configuration

Settings are stored in ~/.config/talktype/config.toml:

model = "small"           # AI model: tiny, base, small, medium, large-v3
device = "cpu"            # "cpu" or "cuda" (GPU)
hold_hotkey = "F8"        # Hold-to-talk key (hold to record, release to stop)
toggle_hotkey = "F9"      # Tap-to-toggle key (press once to start, press again to stop)
# Both hotkeys are always active simultaneously
language_mode = "auto"    # "auto" or specific language code
beeps = true              # Audio feedback sounds
smart_quotes = true       # Use curly quotes " "
auto_space = true         # Auto-space between utterances
auto_period = true        # Add period at end of sentences

Development

From Source

# Prerequisites (Fedora/Nobara)
sudo dnf install -y portaudio-devel ffmpeg ydotool wl-clipboard \
                    python3-gobject libappindicator-gtk3 libnotify

# Clone and install
git clone https://github.com/ronb1964/TalkType.git
cd TalkType
poetry install

# Run
poetry run dictate-tray

ydotool Setup

TalkType uses ydotool for text injection, which requires the ydotoold daemon to be running. One way to run it is as a systemd user service:

# Create systemd service
mkdir -p ~/.config/systemd/user
cat > ~/.config/systemd/user/ydotoold.service <<'EOF'
[Unit]
Description=ydotool daemon
After=graphical-session.target

[Service]
Environment=XDG_RUNTIME_DIR=%t
ExecStart=/usr/bin/ydotoold --socket-path=%t/.ydotool_socket
Restart=on-failure

[Install]
WantedBy=default.target
EOF

# Enable and start
systemctl --user daemon-reload
systemctl --user enable --now ydotoold.service

Troubleshooting

Text not appearing?

Check ydotoold is running: systemctl --user status ydotoold
Verify socket exists: ls $XDG_RUNTIME_DIR/.ydotool_socket
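
The socket check can also be scripted. This Python sketch assumes the socket location used in the ydotool setup section (`$XDG_RUNTIME_DIR/.ydotool_socket`); the helper names `ydotool_socket_path` and `socket_ready` are hypothetical.

```python
import os
from pathlib import Path


def ydotool_socket_path(env=None):
    """Return the expected socket path, or None if XDG_RUNTIME_DIR is unset."""
    env = os.environ if env is None else env
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    return Path(runtime_dir) / ".ydotool_socket" if runtime_dir else None


def socket_ready(path) -> bool:
    """Return True only if the path exists and is a Unix socket."""
    return path is not None and path.is_socket()


if __name__ == "__main__":
    path = ydotool_socket_path()
    if socket_ready(path):
        print(f"ydotoold socket found at {path}")
    else:
        print("ydotoold socket not found: check that the daemon is running")
```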

Hotkey not working?

Another app may be using F8/F9 - try different keys in Preferences
Ensure the TalkType service is running (check for the tray icon)

Transcription slow?

Enable GPU acceleration if you have an NVIDIA GPU
Try a smaller model (tiny or base)
Use the Performance presets in the tray menu

Tray icon not visible (GNOME)?

TalkType offers to install its GNOME extension on first run
Or manually: Preferences → Advanced → Install Extension

License

MIT License - see LICENSE file for details.

TalkType - Voice dictation that just works.