Show HN: A fully automated podcast – actually 12 podcasts

anchor.fm

49 points by holdenc137 4 years ago · 34 comments · 1 min read

"That Horoscope Podcast - Aquarius" and its eleven siblings are daily podcasts that are programmatically generated end to end, i.e. scripted, voiced, post-produced and uploaded.

Would love to get some first-impression feedback and hear how others would achieve the same thing!

planetsprite 4 years ago

Very good idea and execution. I think you could do a lot more interesting stuff than making it about horoscopes.

Also, as a Turing test, you should make one and never reveal that it's entirely automated until it develops a big following. Given the speed of automation, you could mass-produce podcasts of different types until one sticks, then put 10x more resources into the ones gaining traction, etc.

  • holdenc137OP 4 years ago

    I like your thinking :) I thought horoscopes (because they are kind of repetitive) were a good first fit, because the scripts could be generated from stock fragments ('A chance encounter' etc.).

    I think when the little glitches are ironed out of 'real' TTS - we'll be awash with generated content.

    • planetsprite 4 years ago

      I can imagine. Imagine a podcast generation system. You'd simply have to describe the personalities of the host(s), the topic, the general vibe of the theme music, length, hardcode any sponsors, and a GPT-4 powered MLops service could produce something liked by a list of demographics 8 times out of 10 in 5 minutes.

      There really is no stopping this train. In 2030 90% of internet content adopting the guise of being from the "real world" will be entirely generated by machine learning models. 90% of conversations you have with strangers online will likewise be with bots catered to influence you in subtle ways to maximize the return on revenue of your attention.

      • holdenc137OP 4 years ago

        In a parallel to email spam-bots, we get personal agents which (by your choice) filter out the generated content and make sure you only get the real deal.

        The fight is real.

        • planetsprite 4 years ago

          This battle between bots and detectors will consume 99% of computer resources by then. Total GANnihilation.

arrmn 4 years ago

Can you elaborate more on the TTS? Did you prerecord fragments (how many did you actually do?) and just stitch them together? So there is a may.mp3 and a 22.mp3, and your script just puts them together?

  • holdenc137OP 4 years ago

    Sure.

    For dates etc - you got it. I think from memory it would be 'Wednesday' + 'the 18th' + 'of' + 'may...' + '20' + '22'

    For the narrative speech it would be more words in a file. There are plenty of files (EDIT: just checked, 350ish files cover all the variations of script that can be generated at the moment).

    In general the TTS part of the project is the 'art of the almost possible' (if TTS engines sounded really good, I'd have just used one off the shelf).
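A minimal Python sketch of the date-splitting step described above, assuming one clip per fragment. The helper names and the `.mp3` file-naming scheme are hypothetical illustrations, not the author's actual code. Deriving the weekday from the date itself, rather than typing it in, is one way to keep the day name and the date consistent.

```python
from datetime import date

# English names are hard-coded so the output does not depend on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def ordinal(day: int) -> str:
    """Return the day with its English ordinal suffix, e.g. 18 -> '18th'."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def date_fragments(d: date) -> list[str]:
    """Split a date into spoken fragments, one per prerecorded clip."""
    year = f"{d.year:04d}"
    return [
        WEEKDAYS[d.weekday()],    # date.weekday(): Monday is 0
        f"the {ordinal(d.day)}",
        "of",
        MONTHS[d.month - 1],
        year[:2],                 # the year is spoken as two halves
        year[2:],
    ]


def clip_names(fragments: list[str]) -> list[str]:
    """Map fragments to clip file names (hypothetical naming scheme)."""
    return [f.lower().replace(" ", "_") + ".mp3" for f in fragments]


fragments = date_fragments(date(2022, 5, 18))
print(fragments)
print(clip_names(fragments))
```
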

    • arrmn 4 years ago

      How did you come up with the initial list of phrases? Did you do some kind of analysis of other horoscopes?

      • holdenc137OP 4 years ago

        Listened to a few podcasts, read a few - then tried to come up with some combinations that (I hoped) were funny :)

        Here's all of the current 'starts' for the main prediction:

        (Note they're all pretty non-committal - so anything could come next)

        A bite from a wild animal

        A financial matter coming to a head

        Completion of a long delayed task

        A seemingly generous gesture

        A sudden realisation

        An agreement with a headstrong peer

        An unavoidable slowdown

        Being pulled between two emotional options

        A sudden eruption of feelings

        Investigating a proverbial - light in the woods

        Involvement with a purely private project

        Making peace with the past

        The chance of a big win

        Today's socialising

suprjami 4 years ago

This makes me dread that soon many other podcasts will be automated like this, and it'll be orders of magnitude more difficult to find good content than it already is.

  • jclos 4 years ago

    One person's dread will be another's business opportunity - is there any good search engine/recommender system for podcasts?

  • holdenc137OP 4 years ago

    Time to start lobbying for a 'generated'=true/false flag on RSS feeds?

    Also, I promise to only churn out inane content for LOLs.

Li7h 4 years ago

You have a text error in the description: "A daily horoscope podcast for Aquariums". Also the episode for May 22nd narrates the date as April 22nd. But I love this concept. Is this an AI speech engine or prerecorded snippets? Where did you get the text snippets from? Have you thought of incorporating GPT-3 into your horoscopes à la Co-Star?

  • holdenc137OP 4 years ago

    The date was my mistake - the 'Aquariums' was for LOLs. (see also 'Librarians' and similar)

    It's prerecorded snippets that came out of my mouth ;)

tobr 4 years ago

I can’t say I understand the point of this, so my only feedback is that the date in the episode from Saturday 21st of May is announced as “Thursday 21st of April”.

  • holdenc137OP 4 years ago

    Yeah my bad - trust the human in the loop to put the date in wrong.

    As to the point, it's programming practice, perhaps a stepping stone to more elaborate content-generation systems, and jolly good fun too.

edent 4 years ago

Disturbingly accurate in my case. I've seen many arcane things today - including matches.

(Which TTS are you using? Or have I misunderstood?)

  • papathunk 4 years ago

    Haha.

    Sadly I don't think any (commercial / phoneme-based) TTS would be very listenable for a podcast. Those are hand-rolled fragments of speech. (Think old-school satnavs: "In " + "30 yards " + "turn left".)

planetsprite 4 years ago

What vocal synthesis program did you use? Sounds 100% real at parts.

  • holdenc137OP 4 years ago

    Basically it is real. Because the possible scripts that can be generated are known - fragments of speech (eg 3,4,5 word phrases) were recorded (so the intonation is free).

    Would be great to do it with an off-the-shelf TTS engine but I don't think they're quite there yet. I know my recording skills and microphone technique are rubbish - but if I knew what I was doing on that front, I think you'd be really hard pushed to tell it was stitched-together phrases.

    • planetsprite 4 years ago

      The potential is 100x more with vocal synthesis imo. No need to make programmatic mad-libs style formats. Complete freedom, even though the quality isn't optimal.

      • holdenc137OP 4 years ago

        Totally agree. I think we're probably only a year or so off TTS that can put some proper intonation into a sentence - hopefully then it'll be indistinguishable from live speech.

        I've tried to listen to books with today's TTS and it soon becomes really grating (to my ears at least). It only needs the tiniest slip every few sentences and you can't listen any more.

vanous 4 years ago

Interesting! Is the code for the automation available?

  • papathunk 4 years ago

    Will clean it up if there's enough interest. It's in a few parts ... 'transcript' generation, then the speech assembly... then dropping in the backing track + intro / outro, then uploading.

    Which bit's of interest?
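A rough Python sketch of the speech-assembly step only, under the assumption that the fragments are stored as uncompressed WAV files sharing one audio format (the thread mentions mp3, which the standard library cannot read). The function name `concatenate_wavs` is a hypothetical illustration, not the author's code; mixing in the backing track, intro/outro and the upload are not covered.

```python
import wave


def concatenate_wavs(clip_paths, out_path):
    """Append WAV clips in order into a single output file.

    All clips must share channel count, sample width and sample rate.
    """
    fmt = None      # (channels, sample width in bytes, frame rate) of the first clip
    chunks = []     # raw audio frames of each clip, in playback order
    for path in clip_paths:
        with wave.open(str(path), "rb") as clip:
            clip_fmt = (clip.getnchannels(), clip.getsampwidth(),
                        clip.getframerate())
            if fmt is None:
                fmt = clip_fmt
            elif clip_fmt != fmt:
                raise ValueError(f"{path} has format {clip_fmt}, expected {fmt}")
            chunks.append(clip.readframes(clip.getnframes()))
    if fmt is None:
        raise ValueError("no clips given")

    with wave.open(str(out_path), "wb") as out:
        out.setnchannels(fmt[0])
        out.setsampwidth(fmt[1])
        out.setframerate(fmt[2])
        for chunk in chunks:
            out.writeframes(chunk)
```

Called as `concatenate_wavs(["wednesday.wav", "the_18th.wav"], "episode.wav")` (hypothetical file names), this joins the clips back to back with no gap or crossfade, so any pauses would need to be part of the recordings themselves.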

    • jasondigitized 4 years ago

      I’d love to see it too. The L-system you mentioned and how you are stitching together the audio clips.

mro_name 4 years ago

whatever there may be, it's drowned in ad- and spyware.

  • papathunk 4 years ago

    hmmm - it's on Anchor (spotify owned) - I wonder where the spyware is coming from.

number6 4 years ago

So how did you do it?

  • holdenc137OP 4 years ago

    The generation of the 'script' uses a kind of L-System - like production rules. There's a big file along the lines of:

    [the podcast] = [intro] [main body] [outro]

    [main body] = [main prediction] [lucky colours] [alibi] etc

    // these rules finally break down to text, eg

    [main prediction start] = "A bite from a wild animal" or "A chance encounter"

    So the script has lots of combinations and is semi random - but it should always make sense.
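The production rules above could be expanded with a few lines of Python. This is only a sketch of a plain recursive grammar expansion and may differ from the author's implementation. The rule names follow the comment, while the `main prediction` rule and every bracketed placeholder string are assumptions standing in for text the thread does not show.

```python
import random

# Each symbol maps to a list of alternatives; each alternative is a sequence
# of further symbols or literal text. Anything without a rule is literal text.
RULES = {
    "podcast": [["intro", "main body", "outro"]],
    "main body": [["main prediction", "lucky colours", "alibi"]],
    # Assumed rule: a prediction is a start fragment plus a continuation.
    "main prediction": [["main prediction start", "(prediction continuation placeholder)"]],
    "main prediction start": [
        ["A bite from a wild animal"],
        ["A chance encounter"],
    ],
    "intro": [["(intro placeholder)"]],
    "lucky colours": [["(lucky colours placeholder)"]],
    "alibi": [["(alibi placeholder)"]],
    "outro": [["(outro placeholder)"]],
}


def expand(symbol, rng):
    """Recursively expand a symbol into a flat list of text fragments."""
    if symbol not in RULES:
        return [symbol]                        # literal text, emit as-is
    fragments = []
    for child in rng.choice(RULES[symbol]):    # pick one alternative at random
        fragments.extend(expand(child, rng))
    return fragments


def generate_script(seed=None):
    """Generate one script; the same seed always yields the same script."""
    return expand("podcast", random.Random(seed))


print(" / ".join(generate_script()))
```

Because every alternative for a symbol fills the same slot, any random choice still yields a structurally valid script, and each emitted fragment can correspond to one prerecorded clip.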

visox 4 years ago

How many listeners did you get already from this?

  • holdenc137OP 4 years ago

    I put an episode up for each star-sign this weekend.

    Most of them have had 40-50 listens but the one linked here has 400+ listens!
