GitHub - itsfabioroma/transcribee: Open source macOS video transcriber that builds a self-organizing knowledge base 🐝

Open source macOS transcriber for YouTube, Instagram Reels, TikTok, and local media that builds a self-organizing knowledge base over time.

transcribee "https://youtube.com/watch?v=..."
transcribee "https://instagram.com/reel/..."
transcribee "https://vt.tiktok.com/..."
transcribee ~/Downloads/podcast.mp3

Over time, your ~/Documents/transcripts/ folder naturally evolves into a personal library:

transcripts/
├── AI-Research/
│   ├── ilya-sutskever-agi-2024/
│   └── anthropic-constitutional-ai/
├── Startups/
│   ├── ycombinator-how-to-get-users/
│   └── pmarca-founder-mode/
└── Health/
    └── huberman-sleep-optimization/

Each transcript is speaker-labeled and ready to paste into ChatGPT, Claude, or any LLM.

Why 🍯

I consume a lot of video content — YouTube, Instagram, TikTok, podcasts, interviews. I wanted to:

Ask LLMs questions about videos
Have all that knowledge searchable and organized
Not do any manual work to maintain it

transcribee does exactly that. Transcribe once, knowledge stays forever.

Features 🪻

Transcribes YouTube, Instagram Reels, TikTok, and local audio/video files
Speaker diarization — identifies different speakers
Auto-categorizes transcripts using Claude based on content
Builds a knowledge library that organizes itself over time

Use with Clawdbot 🤖

transcribee is available as a Clawdbot skill. Just ask your agent to transcribe any YouTube video:

"Transcribe this video: https://youtube.com/watch?v=..."

Install the skill

# Install from ClawdHub (recommended)
clawdhub install transcribee

# Or clone manually
git clone https://github.com/itsfabioroma/transcribee.git ~/.clawdbot/skills/transcribee

Make sure you have the dependencies installed (brew install yt-dlp ffmpeg) and API keys configured.

Quick Start 🪺

# Install dependencies (macOS)
brew install yt-dlp ffmpeg
pnpm install

# Configure API keys
cp .env.example .env
# Add your ElevenLabs + Anthropic API keys to .env

# Transcribe anything
transcribee "https://youtube.com/watch?v=..."
transcribee "https://instagram.com/reel/..."
transcribee "https://vt.tiktok.com/..."
transcribee ~/Downloads/podcast.mp3
transcribee ~/Videos/interview.mp4

Shell alias (recommended)

Add to ~/.zshrc. The noglob prefix stops zsh from treating characters such as ? in a URL as glob patterns:

alias transcribee="noglob /path/to/transcribee/transcribe.sh"

Output 🍯

Each transcript is saved to ~/Documents/transcripts/{category}/{title}/:

| File | What it's for |
| --- | --- |
| `transcript.txt` | Speaker-labeled transcript; paste this into your LLM |
| `metadata.json` | Video info, language, auto-detected theme |
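
The layout above can be expressed as a small path helper. This is a minimal sketch, not the project's own code: `transcriptPaths` and its `home` parameter are hypothetical names, and only the folder pattern and the two file names come from the table.

```typescript
// Hypothetical helper mirroring the documented layout:
// ~/Documents/transcripts/{category}/{title}/
interface TranscriptPaths {
  dir: string;
  transcript: string; // speaker-labeled text
  metadata: string;   // video info, language, theme
}

function transcriptPaths(home: string, category: string, title: string): TranscriptPaths {
  // macOS paths use "/", so plain string joining is enough for a sketch.
  const root = home.replace(/\/+$/, ""); // drop any trailing slash
  const dir = [root, "Documents", "transcripts", category, title].join("/");
  return {
    dir,
    transcript: `${dir}/transcript.txt`,
    metadata: `${dir}/metadata.json`,
  };
}

const paths = transcriptPaths("/Users/me", "Startups", "pmarca-founder-mode");
console.log(paths.transcript);
// prints "/Users/me/Documents/transcripts/Startups/pmarca-founder-mode/transcript.txt"
```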

Raw JSON (optional)

For power users who need word-level timestamps and confidence scores:

transcribee --raw "https://youtube.com/watch?v=..."

This adds transcript-raw.json with the full ElevenLabs response.
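
Word-level data makes it possible to rebuild speaker turns yourself. The following is a minimal sketch, assuming the raw file contains a `words` array whose entries carry `text`, `start`, `end`, and `speaker_id` fields. Those field names are an assumption about the ElevenLabs response and should be checked against an actual transcript-raw.json; `groupBySpeaker` is a hypothetical helper, not part of transcribee.

```typescript
// Assumed shape of one word entry; verify against your own transcript-raw.json.
interface RawWord {
  text: string;
  start: number; // seconds
  end: number;   // seconds
  speaker_id: string;
}

interface Turn {
  speaker: string;
  start: number; // start time of the first word in the turn
  text: string;
}

// Merge consecutive words from the same speaker into one turn.
function groupBySpeaker(words: RawWord[]): Turn[] {
  const turns: Turn[] = [];
  for (const w of words) {
    const last = turns[turns.length - 1];
    if (last && last.speaker === w.speaker_id) {
      last.text += " " + w.text;
    } else {
      turns.push({ speaker: w.speaker_id, start: w.start, text: w.text });
    }
  }
  // Collapse repeated whitespace in case the input also contains spacing tokens.
  return turns.map((t) => ({ ...t, text: t.text.replace(/\s+/g, " ").trim() }));
}

const sample: RawWord[] = [
  { text: "Hello", start: 0.0, end: 0.4, speaker_id: "speaker_0" },
  { text: "there.", start: 0.5, end: 0.9, speaker_id: "speaker_0" },
  { text: "Hi!", start: 1.2, end: 1.5, speaker_id: "speaker_1" },
];
const turns = groupBySpeaker(sample);
for (const t of turns) console.log(`${t.speaker}: ${t.text}`);
// prints "speaker_0: Hello there." then "speaker_1: Hi!"
```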

How it works 🐝

Downloads audio from the URL (yt-dlp) or extracts it from a local video (ffmpeg)
Transcribes with ElevenLabs (scribe_v1_experimental with speaker diarization)
Claude analyzes content and existing library structure
Auto-categorizes into the right folder
Saves transcript files with metadata
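
The categorization steps above are what make the library self-organizing: the model sees the folders that already exist before it picks one. The actual prompt transcribee sends is not shown in this README, so the following is only an illustrative sketch. `buildCategoryPrompt`, its wording, and the 2,000-character excerpt limit are all assumptions.

```typescript
// Hypothetical prompt builder; the real prompt used by transcribee may differ.
function buildCategoryPrompt(
  existingFolders: string[],
  transcript: string,
  maxChars = 2000, // assumed excerpt limit, chosen only for illustration
): string {
  const folders =
    existingFolders.length > 0
      ? existingFolders.map((f) => `- ${f}`).join("\n")
      : "(none yet)";
  const excerpt = transcript.slice(0, maxChars);
  return [
    "Choose a category folder for this transcript.",
    "Reuse an existing folder when it fits; otherwise propose a new one.",
    "Existing folders:",
    folders,
    "Transcript excerpt:",
    excerpt,
  ].join("\n");
}

const prompt = buildCategoryPrompt(
  ["AI-Research", "Startups", "Health"],
  "Today we talk about sleep...",
);
console.log(prompt);
```

Listing the current folders in the prompt is one way to nudge the model toward reusing categories instead of creating near-duplicates.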

Requirements

macOS (tested on Sonoma)
Node.js 18+
yt-dlp — brew install yt-dlp
ffmpeg — brew install ffmpeg
ElevenLabs API key — for transcription
Anthropic API key — for auto-categorization

Supported formats

| Type | Formats |
| --- | --- |
| Audio | mp3, m4a, wav, ogg, flac |
| Video | mp4, mkv, webm, mov, avi |
| URLs | youtube.com, youtu.be, instagram.com/reel, tiktok.com |
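
The table above can be turned into a simple input check. This is a sketch under stated assumptions, not transcribee's own logic: `classifyInput` is a hypothetical name, URLs are matched on host only (the /reel path for Instagram is not checked), and local files are classified purely by extension.

```typescript
type InputKind = "url" | "audio" | "video" | "unsupported";

// Values taken from the supported formats table.
const AUDIO_EXT = ["mp3", "m4a", "wav", "ogg", "flac"];
const VIDEO_EXT = ["mp4", "mkv", "webm", "mov", "avi"];
const URL_HOSTS = ["youtube.com", "youtu.be", "instagram.com", "tiktok.com"];

function classifyInput(input: string): InputKind {
  if (/^https?:\/\//i.test(input)) {
    // Strip the scheme, keep everything up to the first "/", "?" or "#".
    const rest = input.replace(/^https?:\/\//i, "");
    const host = (rest.split(/[/?#]/)[0] ?? "").toLowerCase();
    // Accept the host itself or any subdomain (e.g. www., vt.).
    const known = URL_HOSTS.some((h) => host === h || host.endsWith(`.${h}`));
    return known ? "url" : "unsupported";
  }
  const ext = (input.split(".").pop() ?? "").toLowerCase();
  if (AUDIO_EXT.includes(ext)) return "audio";
  if (VIDEO_EXT.includes(ext)) return "video";
  return "unsupported";
}

console.log(classifyInput("https://vt.tiktok.com/abc"));   // prints "url"
console.log(classifyInput("~/Downloads/podcast.mp3"));     // prints "audio"
console.log(classifyInput("~/Videos/interview.mp4"));      // prints "video"
console.log(classifyInput("notes.txt"));                   // prints "unsupported"
```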

bzz bzz 🐝