# Awesome YouTube Video Summary/Podcast/Video

A Python script to generate summaries (Claude), podcasts (Whisper), and videos (RunwayML or Luma AI) from annoyingly long YouTube content.
## Example
- Original video: https://www.youtube.com/watch?v=_K-L9uhsBLM
- Summary: https://dl.dropbox.com/scl/fi/mdkbglfbs4m9ydeo9a2k7/video-_K-L9uhsBLM.mp4?rlkey=3wrowryg9gio1walaxhdbp2is&dl=0
## Features
- Generate concise summaries of YouTube videos
- Create engaging podcast scripts with multiple voices
- Generate AI-powered videos with synchronized podcast audio
- Support for multiple languages
- Multiple transcription options
- Multiple video generation providers
## Installation

1. Clone the repository:

```bash
git clone https://github.com/sliday/ytsum.git
cd ytsum
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Install FFmpeg (required for audio/video processing):

- macOS: `brew install ffmpeg`
- Ubuntu/Debian: `sudo apt-get install ffmpeg`
- Windows: download from the FFmpeg website
## Environment Setup

Create a `.env` file with your API keys:

```env
ANTHROPIC_API_KEY=your_claude_api_key
OPENAI_API_KEY=your_openai_api_key
LUMAAI_API_KEY=your_lumaai_api_key
RUNWAYML_API_SECRET=your_runwayml_api_key
REPLICATE_API_TOKEN=your_replicate_api_key
```
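A script like this fails late and cryptically when a key is missing, so it can help to validate the environment up front. A minimal sketch of such a check (the `missing_keys` helper is illustrative, not part of ytsum.py):

```python
import os

# The five keys this README asks for; checked at startup so failures are early and clear.
REQUIRED_KEYS = [
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "LUMAAI_API_KEY",
    "RUNWAYML_API_SECRET",
    "REPLICATE_API_TOKEN",
]

def missing_keys(env=os.environ):
    """Return any required API keys that are unset or empty."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]

if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing keys:", ", ".join(missing))
```

Note that you only need the keys for the providers you actually use (e.g. `LUMAAI_API_KEY` is unnecessary if you stick to RunwayML).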
## Usage

### Basic Summary

```bash
python ytsum.py "https://www.youtube.com/watch?v=VIDEO_ID"
```

### Generate Podcast

```bash
python ytsum.py --podcast "https://www.youtube.com/watch?v=VIDEO_ID"
```

### Generate Video with Podcast

```bash
# Using Luma AI (faster, recommended)
python ytsum.py --podcast --lumaai "https://www.youtube.com/watch?v=VIDEO_ID"

# Using RunwayML
python ytsum.py --podcast --runwayml "https://www.youtube.com/watch?v=VIDEO_ID"
```
## Additional Options

- `--language`: Specify output language (default: english)
- `--ignore-subs`: Force transcription even when subtitles exist
- `--fast-whisper`: Use Fast Whisper for transcription (faster)
- `--whisper`: Use OpenAI Whisper for transcription (more accurate)
- `--replicate`: Use Replicate's Incredibly Fast Whisper
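For orientation, the interface above could be modeled with `argparse` roughly as follows. This is a sketch inferred from the documented flags, not the actual ytsum.py source:

```python
import argparse

def build_parser():
    """Build a CLI parser mirroring the flags documented above (illustrative)."""
    p = argparse.ArgumentParser(prog="ytsum.py")
    p.add_argument("url", help="YouTube video URL")
    p.add_argument("--podcast", action="store_true", help="generate a podcast")
    p.add_argument("--lumaai", action="store_true", help="use Luma AI for video")
    p.add_argument("--runwayml", action="store_true", help="use RunwayML for video")
    p.add_argument("--language", default="english", help="output language")
    p.add_argument("--ignore-subs", action="store_true",
                   help="transcribe even when subtitles exist")
    p.add_argument("--fast-whisper", action="store_true", help="Fast Whisper backend")
    p.add_argument("--whisper", action="store_true", help="OpenAI Whisper backend")
    p.add_argument("--replicate", action="store_true", help="Replicate Whisper backend")
    return p
```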
## Output Files

All output files are saved in the `out` directory:

- `summary-{video_id}.txt`: Text summary
- `podcast-{video_id}.txt`: Podcast script
- `podcast-{video_id}.mp3`: Podcast audio
- `video-{video_id}.mp4`: Final video with podcast audio
## Video Generation

The tool supports two AI video generation providers:

### Luma AI (Recommended)
- Faster generation times
- High-quality cinematic videos
- Supports camera movements and scene transitions
- Maintains visual consistency
- Optional image input for style reference
### RunwayML
- High-quality video generation
- Requires input image
- Longer processing times
- Professional-grade output
Both providers:
- Generate base images using Flux AI
- Create video segments based on podcast content
- Combine segments with audio
- Support custom duration and aspect ratio
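The "combine segments with audio" step can be sketched using FFmpeg's concat demuxer. The helper name and file names below are illustrative, not what the script actually uses:

```python
from pathlib import Path

def build_combine_cmd(segment_paths, audio_path, out_path, list_path="segments.txt"):
    """Build an ffmpeg command that concatenates video segments
    (concat demuxer) and muxes in the podcast audio track."""
    # The concat demuxer reads a text file listing one input per line.
    Path(list_path).write_text("".join(f"file '{p}'\n" for p in segment_paths))
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0", "-i", str(list_path),
        "-i", str(audio_path),
        "-c:v", "copy",   # keep the generated video stream as-is
        "-c:a", "aac",    # encode the podcast audio
        "-shortest",      # stop at the shorter of video/audio
        str(out_path),
    ]
```

Run the resulting command with `subprocess.run(cmd, check=True)`; ytsum itself does this kind of processing through `ffmpeg-python`.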
## Transcription Options

### Fast Whisper (Default)

- Quick transcription
- Good accuracy
- No API key required

### OpenAI Whisper

- High accuracy
- Slower processing
- Requires OpenAI API key

### Replicate Whisper

- Fastest option
- Good accuracy
- Requires Replicate API key
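The backend-selection logic implied by these options can be sketched as a small dispatch: Fast Whisper is the fallback that needs no key, and the other two fail fast if their key is absent. Function and return names are hypothetical:

```python
import os

def pick_transcriber(use_whisper=False, use_replicate=False, env=os.environ):
    """Select a transcription backend from the CLI flags (illustrative sketch)."""
    if use_replicate:
        if not env.get("REPLICATE_API_TOKEN"):
            raise RuntimeError("--replicate requires REPLICATE_API_TOKEN")
        return "replicate"
    if use_whisper:
        if not env.get("OPENAI_API_KEY"):
            raise RuntimeError("--whisper requires OPENAI_API_KEY")
        return "openai-whisper"
    # Fast Whisper runs locally, so no key check is needed.
    return "fast-whisper"
```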
## Testing

Run the full test suite:

```bash
pytest
```

Run specific test groups:

```bash
# Run Luma AI tests only
pytest -v -m luma

# Run RunwayML tests only
pytest -v -m runway
```
## Dependencies

- `anthropic`: Claude API for text generation
- `openai`: Whisper API for transcription and TTS
- `lumaai`: Luma AI for video generation (recommended)
- `runwayml`: RunwayML for video generation
- `replicate`: Flux AI for image generation
- `ffmpeg-python`: Audio/video processing
- `colorama`: Terminal output formatting
- `pytest`: Testing framework
## Contributing
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
## License
This project is licensed under the MIT License - see the LICENSE file for details.

