Show HN: Jamie, pull up – Podcast search engine rigged for The Joe Rogan podcast
jamiepullup.com
Little back story for more context:
A few weeks ago I was listening to Joe's episode with Tony Hinchcliffe. At some point, the discussion leads to Tom Cruise doing his own stunts. The conversation then devolves into this hypothetical situation where Cruise dies while performing a stunt and how odd it would be to watch the movie knowing Cruise dies in it. Or whether people would actually go see the movie. As morbid as the whole conversation was, I thought it was hilarious.
Days go by and I start to think back to that episode. I try to remember who the guest was so I can look it up and listen to it again, specifically the discussion around Tom Cruise. But I just couldn't remember anything. Surprisingly, YouTube search wasn't that helpful either. All I could remember from the episode was the phrase "Tom Cruise snuff film".
That's when I thought it would be great to be able to search Joe's podcasts (or any podcast for that matter). I spent some time researching what it would take to build a search engine specific to a podcast. And after a few weeks of working on it, I got a prototype up and running.
And now if you look up "Tom Cruise snuff film", it brings up the episode and the specific spot where the discussion starts:
Cool. Searched for 'elk meat', was not disappointed. Are you using elastic for the search?
I'm using AWS's CloudSearch. I'm not sure what's actually behind it but I wouldn't be surprised if it's Lucene-based.
Cool! Where are you getting the transcripts from?
I'm using AWS Transcribe. I would love to eventually train my own model specifically for each podcast. For this specific instance, since Joe is moving to Spotify soon, I'm not sure the investment is worth it.
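For anyone curious how a remembered phrase can be mapped to a spot in an episode, here is a minimal sketch. It assumes the word-level `items` list that a Transcribe job writes into its JSON output, where each spoken word has a `start_time` in seconds. The helper names (`find_phrase`, `to_timestamp`) and the sample timings are made up for illustration, and this is not necessarily how the site does it; a real setup would more likely index transcript chunks in a search service than scan them linearly.

```python
import re


def normalize(word):
    """Lowercase a word and drop anything that is not a letter, digit, or apostrophe."""
    return re.sub(r"[^\w']", "", word).lower()


def find_phrase(items, phrase):
    """Return the start time in seconds of the first exact match of phrase, or None."""
    # Keep only spoken words; punctuation items carry no timestamps.
    words = [
        (normalize(item["alternatives"][0]["content"]), float(item["start_time"]))
        for item in items
        if item["type"] == "pronunciation"
    ]
    target = [normalize(w) for w in phrase.split()]
    if not target:
        return None
    tokens = [w for w, _ in words]
    # Slide a window over the transcript and compare word by word.
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            return words[i][1]
    return None


def to_timestamp(seconds):
    """Format a number of seconds as HH:MM:SS for a player link."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Hypothetical sample shaped like the "items" list in a Transcribe result.
# The timings are invented for this example, not taken from any real episode.
sample_items = [
    {"type": "pronunciation", "start_time": "3125.04", "end_time": "3125.30",
     "alternatives": [{"confidence": "0.99", "content": "a"}]},
    {"type": "pronunciation", "start_time": "3125.40", "end_time": "3125.70",
     "alternatives": [{"confidence": "0.99", "content": "Tom"}]},
    {"type": "pronunciation", "start_time": "3125.71", "end_time": "3126.15",
     "alternatives": [{"confidence": "0.98", "content": "Cruise"}]},
    {"type": "pronunciation", "start_time": "3126.20", "end_time": "3126.55",
     "alternatives": [{"confidence": "0.95", "content": "snuff"}]},
    {"type": "pronunciation", "start_time": "3126.58", "end_time": "3126.90",
     "alternatives": [{"confidence": "0.97", "content": "film"}]},
    {"type": "punctuation",
     "alternatives": [{"confidence": "0.0", "content": "."}]},
]

start = find_phrase(sample_items, "Tom Cruise snuff film")
if start is not None:
    print(to_timestamp(start))
```

Exact matching like this breaks as soon as the transcription mishears a word, which is one reason to lean on a real search index with fuzzy or stemmed matching instead.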