Show HN: A CLI tool to transcribe and clean YouTube videos with Whisper and LLMs

github.com

4 points by itsmevictor 6 months ago · 2 comments · 1 min read

Hi HN,

I built a simple command-line tool that quickly transcribes YouTube videos into clean, readable text. It uses OpenAI's Whisper for transcription and leverages the LLM of your choice to intelligently clean up the transcripts, removing filler words, correcting grammar, and improving readability.
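The cleaning step could be structured roughly like this. This is a minimal sketch, not the tool's actual code: it assumes transcript segments are dicts with `start`, `end`, and `text` keys (the shape Whisper's Python API returns), and `strip_fillers` is a hypothetical regex stand-in for the LLM call so the example runs offline.

```python
import re
from typing import Callable

# Hypothetical stand-in for an LLM cleaner: matches a few common filler words,
# plus an optional trailing comma/period and any whitespace after it.
FILLERS = re.compile(r"\b(?:um+|uh+|er+m?)\b[,.]?\s*", re.IGNORECASE)


def strip_fillers(text: str) -> str:
    """Drop filler words and collapse runs of whitespace."""
    without_fillers = FILLERS.sub("", text)
    return re.sub(r"\s+", " ", without_fillers).strip()


def clean_segments(
    segments: list[dict],
    clean: Callable[[str], str] = strip_fillers,
) -> list[dict]:
    """Clean each segment's text while leaving its timestamps untouched.

    `clean` is pluggable: swap in a function that sends the text to the
    LLM of your choice and returns the cleaned string.
    """
    return [{**seg, "text": clean(seg["text"])} for seg in segments]
```

Keeping the cleaner as a plain `str -> str` function means the timestamps never pass through the model, so they cannot be altered by it.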

Some highlights include:

- Automatically downloads audio directly from YouTube.

- Supports multiple output formats (TXT, SRT, VTT).

- LLM-driven transcript cleaning tailored for presentations, conversations, or lectures.

- Easy setup and straightforward CLI usage.
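For the SRT output mentioned in the list above, the conversion from timed segments to subtitle text might look like the following. It is a sketch under assumptions: the segment dicts with `start`, `end` (seconds), and `text` keys mirror what Whisper's Python API returns, and the function names are made up for illustration rather than taken from the repo.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

VTT differs mainly in its `WEBVTT` header line and in using a period instead of a comma before the milliseconds.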

My main motivation for building this is that I read faster than I listen, and I'm often interested in only a short segment of a long video, so it's easier to just cmd-F and jump to that section in the transcript.

Feedback welcome!

Leftium 6 months ago

> My main motivation to build this is that I read faster than I listen

Yes! However, occasionally I find it useful to refer to the original video (especially when I want to share a video at a certain timestamp). Searchable transcripts are a great way to navigate a video if they have links that jump to the relevant timestamp in the video.

So I designed a special file format and web app based on oTranscribe + Markdown:

- https://raw.githubusercontent.com/Leftium/oTranscribe/refs/h...

- https://otranscribe.netlify.app/?vsl=definedefine

I made a tool to convert YouTube SBV/TTML files; it should be possible to add support for one of your output formats: https://github.com/Leftium/otrgen
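As a rough illustration of the timestamp-link idea (not the oTranscribe-based format linked above), each transcript segment can be turned into a Markdown line whose label links to that moment in the video. The sketch assumes segments carry a `start` time in seconds and a `text` field; the function names and the `VIDEO_ID` placeholder are hypothetical. It relies on YouTube's `t` URL parameter for the start offset.

```python
from urllib.parse import urlencode


def youtube_link_at(video_id: str, seconds: float) -> str:
    """Build a watch URL that starts playback at the given offset."""
    query = urlencode({"v": video_id, "t": f"{int(seconds)}s"})
    return f"https://www.youtube.com/watch?{query}"


def segment_to_markdown(video_id: str, segment: dict) -> str:
    """Render one segment as '[MM:SS](link) text'."""
    minutes, secs = divmod(int(segment["start"]), 60)
    label = f"{minutes:02d}:{secs:02d}"
    link = youtube_link_at(video_id, segment["start"])
    return f"[{label}]({link}) {segment['text'].strip()}"
```

For videos longer than an hour the label keeps counting minutes past 59, which is fine for navigation but could be swapped for an HH:MM:SS label.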

---

There was a similar Show HN[1] that opened my eyes to OpenAI Whisper; however, your Python script provides a better starting point than a bash script. I'll probably reference both projects when I make my own projects (including a beat-aware YouTube player that needs the audio data for beat-detection analysis).

[1]: https://hw.leftium.com/#/item/41473379

  • itsmevictorOP 6 months ago

    Yes, you're right, that's a good idea! I just checked the oTranscribe Netlify app and I think it's pretty cool.

    However, I agree that it could be improved by having cleaner (transcribed) text. You should be able to integrate my approach pretty easily since the SRT and VTT output formats preserve the timestamps.

    Let me know if there's something I can do to make your life easier. Otherwise, naturally, feel free to fork my repo etc. :-)
