Show HN: CTRL-F for YouTube Videos

github.com

137 points by ewild 2 years ago · 44 comments · 1 min read

This is a small project I made years ago and updated to Whisper last year. I still use it from time to time and thought it might be useful to others, or at least put the idea out there for someone better than me to make a better implementation!

modeless 2 years ago

Ctrl-F can already search the transcript on YouTube. I use it all the time. I guess this could be useful for videos YouTube doesn't have captions for.

  • stainablesteel 2 years ago

    I'm not able to do this, can you explain?

    • modeless 2 years ago

      1. In a desktop web browser, visit a YouTube video with captions, which is almost all of them

      2. Click the video description to expand it

      3. Scroll down and click the tiny "Show Transcript" button near the bottom (whoever decided to bury it down here was very misguided)

      4. Ctrl-F and search any word. Occurrences in the transcript will be highlighted and you can press enter to scroll the transcript to the next one. Click the transcript to seek the video.

      I see that this extension shows occurrences on the seek bar which is cool. There is also a slight problem with regular ctrl-F: if you search for a multiple word phrase you might not find it if the phrase happened to be split between two chunks of the transcript. So that could be better in this extension. And of course not every YouTube video has captions, but most do these days.
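A minimal sketch of one way around the split-phrase problem: join the transcript chunks into a single string and map each match back to the chunk it starts in. The chunk shape (`{ start, text }`) and the name `findPhrase` are assumptions for illustration, not the format YouTube or the extension actually uses.

```javascript
// Hypothetical transcript shape: [{ start: <seconds>, text: "..." }, ...]
function findPhrase(chunks, phrase) {
  // Lowercase and collapse whitespace so matching ignores case and spacing.
  const norm = (s) => s.toLowerCase().replace(/\s+/g, " ").trim();
  const needle = norm(phrase);
  if (!needle) return [];

  // Join all chunks into one string, remembering where each chunk begins.
  let joined = "";
  const offsets = []; // offsets[i] = index in `joined` where chunk i starts
  for (const chunk of chunks) {
    const text = norm(chunk.text);
    if (joined && text) joined += " ";
    offsets.push(joined.length);
    joined += text;
  }

  // Collect the start time of the chunk in which each match begins.
  const hits = [];
  let pos = joined.indexOf(needle);
  while (pos !== -1) {
    let i = offsets.length - 1;
    while (i > 0 && offsets[i] > pos) i--; // last chunk starting at or before pos
    hits.push(chunks[i].start);
    pos = joined.indexOf(needle, pos + 1);
  }
  return hits;
}
```

With chunks like "show about machine" followed by "learning and ...", a search for "machine learning" returns the start time of the first chunk, even though the phrase straddles the boundary.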

      • philsnow 2 years ago

        > visit a YouTube video with captions, which is almost all of them

        Depending on what you're watching, you might never come across a video with good subtitles but rather YouTube's auto-generated subtitles.

        Whisper can do a better job in a lot of cases, but not all... I wonder if they've had multiple generations of auto-captioning and not gone back and redone the ones that were done earlier.

        This extension is really interesting to me because in the past I've tried (and failed) to make a similar one that adds a new .vtt to the list of available subtitles for the video. I sometimes struggle with auditory processing, especially in a noisy environment, and following along with subtitles helps me out immensely, so it's frustrating when the auto-generated subtitles are poor quality. I've bookmarked the extension to see if I can fork it for that purpose in the future.

        • thaumasiotes 2 years ago

          > Depending on what you're watching, you might never come across a video with good subtitles but rather Youtube's auto-generated subtitles.

          Even with very little correspondence to the actual dialogue, if you already know what you're looking for, you can probably find it pretty easily in the auto-generated subtitles.

          ctrl+F won't work in that case, but reading will.

        • ewildOP 2 years ago

          if you have any questions feel free to ask!

      • ewildOP 2 years ago

        You are correct. YouTube didn't have this when I originally made it in 2019 with DeepSpeech. Now they do, but I've always preferred the idea of it being on the timebar, so you can just click and go right to it. TBH I should just make a simple add-on to take the YouTube timestamps and slap them onto the timebar. Also, the split chunks would be no problem here, as the transcript is actually stored in a JSON file, so any consecutive words will always be matchable as phrases. Of course the downside is that you need to run the model lol
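A rough sketch of the "timestamps onto the timebar" idea, assuming you already have a list of match times in seconds and the video duration. `markerPositions` and `renderMarkers` are hypothetical names, and the `bar` element is whatever progress-bar container the caller finds on the page (it needs to be a positioned element for the absolute markers to line up); none of this is taken from the extension's source.

```javascript
// Convert match times (seconds) into positions along the bar, as percentages.
function markerPositions(timestamps, duration) {
  if (!(duration > 0)) return [];
  return timestamps
    .filter((t) => t >= 0 && t <= duration) // drop times outside the video
    .map((t) => ({ time: t, percent: (t / duration) * 100 }));
}

// Browser-only: add one clickable marker per match to `bar`.
// `onSeek` is a caller-supplied callback that receives the time in seconds.
function renderMarkers(bar, timestamps, duration, onSeek) {
  for (const { time, percent } of markerPositions(timestamps, duration)) {
    const marker = document.createElement("div");
    marker.style.cssText =
      `position:absolute;left:${percent}%;top:0;width:2px;height:100%;background:yellow;cursor:pointer`;
    marker.addEventListener("click", () => onSeek(time));
    bar.appendChild(marker);
  }
}
```

In a page, `onSeek` could be something like `(t) => { document.querySelector("video").currentTime = t }`.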

        • modeless 2 years ago

          I'd use an extension that made the transcript show by default on every video and added a transcript search bar in the page. That would be great.

          • ewildOP 2 years ago

            I guess I might as well do it, so I don't need to run a model every time myself either lol. I'll have it done in a day or two.

          • stevenicr 2 years ago

            If that also saved a copy of the transcript, with metadata added (title, channel, URL, similar vids), as a text file on my local machine, I would actually use this as well.

            Wait, is this using a cloud service in some way, or is it all local / totally private? That would be a deal breaker or maker.

            Oh might as well copy a screen shot of the thumbnail and save it.
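A small sketch of what such a saved text file could look like, assuming the transcript is available as `{ start, text }` chunks and the metadata as a plain object. The field names and the functions `formatTimestamp` and `formatTranscript` are made up for illustration; formatting happens entirely locally.

```javascript
// Format seconds as m:ss, or h:mm:ss for videos over an hour.
function formatTimestamp(seconds) {
  const s = Math.floor(seconds);
  const h = Math.floor(s / 3600);
  const m = Math.floor((s % 3600) / 60);
  const sec = String(s % 60).padStart(2, "0");
  return h > 0 ? `${h}:${String(m).padStart(2, "0")}:${sec}` : `${m}:${sec}`;
}

// Build the file contents: a metadata header, a blank line, then one line per chunk.
// `meta` is a hypothetical { title, channel, url } object.
function formatTranscript(meta, chunks) {
  const header = [
    `Title: ${meta.title}`,
    `Channel: ${meta.channel}`,
    `URL: ${meta.url}`,
  ];
  const body = chunks.map((c) => `[${formatTimestamp(c.start)}] ${c.text}`);
  return [...header, "", ...body].join("\n") + "\n";
}
```

In a browser, the resulting string could be wrapped in a `Blob` and offered through a link with a `download` attribute, so nothing has to leave the machine.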

          • katella 2 years ago

            I built an extension that injected a search bar into the transcript card. Worked by filtering the YouTube transcripts themselves, and manipulating their display attribute.

            Didn't release it to the store because YouTube released a search feature and it looked exactly like mine.
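The filtering approach described above might look roughly like this: hide every transcript row whose text does not contain the query by toggling its `display` style. This is a sketch under assumptions, not katella's code; `filterRows` is a hypothetical name and the rows are whatever elements the caller selects from the transcript panel.

```javascript
// Show only the rows whose text contains `query` (case-insensitive).
// `rows` is any iterable of elements (or element-like objects) exposing
// `textContent` and `style`. Returns the number of rows left visible.
function filterRows(rows, query) {
  const needle = query.trim().toLowerCase();
  let visible = 0;
  for (const row of rows) {
    // An empty query matches everything, which restores all rows.
    const match = needle === "" || row.textContent.toLowerCase().includes(needle);
    row.style.display = match ? "" : "none";
    if (match) visible++;
  }
  return visible;
}
```

Wired to an injected search box, this would be called from the input's `input` event with the current value and the transcript's row elements.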

          • ewildOP 2 years ago

            Would you prefer the timestamp to be hidden outright, since it takes up a bigass portion of the screen, or to have hiding it be an option in the extension settings?

            • modeless 2 years ago

              I think the timestamp is OK, my biggest complaint is the huge amount of whitespace between the rows and the small size of the box. If I designed YouTube I would put the transcript on the left side above the video description, with a button that expands it to full height so there's no separate scrollbar for the transcript anymore, it's just all directly in the page.

              BTW when I went to look at a video just now, YouTube actually served me a "Search in Video" box at the top of the transcript. So I guess the feature exists, they just haven't rolled it out to everyone yet.

              • ewildOP 2 years ago

                Damn, I see this after I'm 90% done and just have to make a fancy button lol

      • madacol 2 years ago

        bookmarklet to "Show transcript"

            javascript:document.querySelector('button[aria-label="Show transcript"]').click()

        <https://getbookmarklets.com/scripts/data%3Atext%2Fjavascript...>

        • madacol 2 years ago

          As a userscript https://github.com/madacol/web-automation/blob/master/usersc...

              // ==UserScript==
              // @name        Always show transcript
              // @match       https://www.youtube.com/watch*
              // @grant       none
              // @version     1.0
              // @author      madacol
              // @description show transcript on all youtube videos
              // @run-at      document-idle
              // ==/UserScript==
              (async ()=>{
                  // Wait for the button to be rendered, then click it to open the transcript
                  (await getElementNotYetRendered(()=>document.querySelector('button[aria-label="Show transcript"]'))).click()

                  // Poll elementGetter every `delay` ms until it returns an element
                  // (or a non-empty NodeList), giving up after roughly `timeout` ms
                  function getElementNotYetRendered(elementGetter, delay = 200, timeout = 10000) {
                      let retries = Math.ceil(timeout / delay);
                      return new Promise((resolve, reject) => {
                          (function resolveIfElementFound() {
                              setTimeout(() => {
                                  const element = elementGetter()
                                  if (element?.toString().includes("Element")) return resolve(element)
                                  if (element?.toString().includes("NodeList") && element.length > 0) return resolve(element)

                                  // Reject instead of only logging, so the awaiting caller does not hang forever
                                  if (retries-- <= 0) return reject(new Error(`Max retries reached: element was not found
                                  element: "${element}"
                                  elementGetter: "${elementGetter}"
                                  `));
                                  resolveIfElementFound()
                              }, delay);
                          })()
                      })
                  }
              })().catch(console.error);

          • madacol 2 years ago

            Vastly simplified userscript to only show transcript when pressing Ctrl+f

            https://github.com/madacol/web-automation/blob/master/usersc...

                // ==UserScript==
                // @name        Show transcript on Ctrl+f
                // @match       https://www.youtube.com/watch*
                // @grant       none
                // @version     1.0
                // @author      madacol
                // ==/UserScript==
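                // Open the transcript when Ctrl+f is pressed; the browser's own find bar still opens as usual
                document.addEventListener('keydown', event => {
                    if (event.ctrlKey && event.key === 'f')
                        // ?. avoids a TypeError when the button is not on the page
                        document.querySelector('button[aria-label="Show transcript"]')?.click()
                })
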
    • atahanacar 2 years ago

      You can find a button for the transcript in the description (or the three dot menu near the dislike button if it's still serving you the older interface). You have to open the transcript first, then Ctrl+f

  • meatjuice 2 years ago

    Yeah that’s exactly what I thought just after finding out it uses whisper for transcribing. Why not use it when it’s already transcribed?

a_wild_dandan 2 years ago

If we had an extension to skip all the filler garbage in YT videos, I would be ecstatic. Maybe that's doable now? YT captions -> identify fluff timestamps via a browser LLM -> insert segments onto the video timeline, which automatically skip, a la SponsorBlock.

We could slash through Youtubers repeating themselves, making hack jokes, narrating their video title & outline, vapid explanations of common knowledge, etc. Any of which can be customized to your taste via a system prompt!

This kinda semantic filter would actually be an immensely powerful UI tool for all webpages and media, now that I think about it...
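The skipping half of this idea is simple to sketch, assuming the filler segments have already been identified somehow (the LLM step is out of scope here). `skipTarget` is a hypothetical helper and the `{ start, end }` segment shape is an assumption; this is not SponsorBlock's implementation.

```javascript
// Given skip segments [{ start, end }] in seconds and the current playback
// time, return the time playback should jump to (or the same time if the
// position is not inside any segment). Overlapping segments are chained.
function skipTarget(segments, time) {
  let target = time;
  // Sort by start so that back-to-back or overlapping segments are followed in order.
  const sorted = [...segments].sort((a, b) => a.start - b.start);
  for (const seg of sorted) {
    if (target >= seg.start && target < seg.end) target = seg.end;
  }
  return target;
}
```

In a page this could run from the video element's `timeupdate` event: compute `skipTarget(segments, video.currentTime)` and assign it back to `video.currentTime` when it differs.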

  • sergiotapia 2 years ago

    just use sponsorblock today, works fine on all my devices. https://github.com/ajayyy/SponsorBlock

    from mobile phone to tv to pc.

    • jsheard 2 years ago

      Do check the settings too, SponsorBlock is best known for skipping sponsored segments but it also has markers for things like intros, previews, self-promotion, and filler jokes/skits which aren't skipped by default but can be if you want them to.

      • extraduder_ire 2 years ago

        The category for "non-music section" for music videos is great. Would probably make the extension worth it, if that was all it did.

  • progman32 2 years ago

    Also look into DeArrow. Replaces clickbait titles and thumbnails.

BetterWhisper 2 years ago

Developed https://www.videototextai.com/ for exactly this reason, as it was nearly impossible to search videos otherwise. You can also copy the transcript into an LLM and ask questions about the video content that way.

qntmfred 2 years ago

I've been using https://www.appblit.com/scribe to get transcripts into a more readable/ctrl+f-able format

  • ewildOP 2 years ago

    Yeah, I remember the whole YouTube transcript feature coming out a year-ish after I made the first version of this in 2019, but I still prefer the timebar highlighting. That's just a preference thing, though.

rustybamba 2 years ago

Could you explain what's the purpose of the model.pth? I'm trying to get it to work on my Apple Silicon Mac.

  • ewildOP 2 years ago

    The model.pth is a custom LSTM for detecting phonetic similarity. As long as you're running it from the pythons folder (I didn't manage file locations very well), it should work.

lopkeny12ko 2 years ago

https://filmot.com/

mutant 2 years ago

If you're offloading transcription to openai, why have a local gfx card?

sph 2 years ago

When all you have is a GPU, every problem can be solved with a custom AI model.

popf1 2 years ago

That's cool, but there are also Firefox extensions that do something similar. There's one for searching comments, and one for searching captions.

https://addons.mozilla.org/en-US/firefox/addon/youtube-capti...

https://addons.mozilla.org/en-US/firefox/addon/ycs/

  • ewildOP 2 years ago

    Ahh, I never really looked, because I built my original one in 2019 off of DeepSpeech haha. I just updated it for fun, mostly. I know YouTube captions themselves are good, but one limitation of those extensions is that not all videos have captions. Since mine actually downloads the audio and runs the model on it, it still produces results for those older videos that never got captions.

  • sammyatman 2 years ago

    Ctrl-F across all of youtube: https://www.askyoutube.ai

    • modeless 2 years ago

      https://www.youglish.com is also a sort of search engine for YouTube captions, though mostly aimed at short phrases.

    • popf1 2 years ago

      I searched for my YouTube username and then for the exact title of one video I posted and it didn't find either one.... instead it said the title of my video was not true because it didn't interpret it correctly (but it didn't link to the video).
