Show HN: Offline audiobook from any format with one CLI command

github.com

105 points by C-Loftus a year ago · 45 comments · 1 min read

QuickPiperAudiobook locally generates an mp3 audiobook on Linux with one easy command. It can convert PDF, EPUB, MOBI, and many more formats by using ebook-convert. It uses any piper TTS model, and thus supports a wide variety of languages.

I've had great success using it to read more while reducing eye strain and computer usage. I think I've probably read 30 or so books this way now over the past year. Being able to listen to any content you want in audio form free and offline while going for a walk is extremely handy.

I hope it helps you as well!

Cheers

archargelod a year ago

Cool app! I've had some issues with getting it to work, though:

- ebook-convert is not a small dependency; it seems that it only comes bundled with the calibre software. And calibre has a huge number of Python dependencies (>400 packages on openSUSE). I don't know about you, but I'm not polluting my install with that for a small tool. So I grabbed the AppImage version of calibre, extracted it, and added a symlink to the bundled ebook-convert. It is still around 500 MB of wasted space, but at least it's local to a single folder.

Could you replace it with another tool/library, or include only necessary stuff with binary?

- Then I've encountered another problem. I have no piper installed on my system, but readme says:

> You don't need to have piper installed. This program manages piper and the associated models.

It didn't download the piper release and proceeded without errors. Then it did download some models. After that it errored out while trying to change directory to the non-existent "~/.config/QuickPiperAudiobook/piper". So naturally, I looked in the source code, found the link to the piper tarball, and extracted it myself.

A-ha! Now it works. Until..

- Done. Saved audiobook as /home/archargelod/Audiobooks/text.wav

You could try to guess what the problem was, but I'm going to tell you right away: it didn't create the "Audiobooks" folder, and again there were no errors.

Thankfully, that was the last issue and after I created ~/Audiobooks manually, my generated wav was there.

  • C-LoftusOP a year ago

    Thank you for the feedback and I'm sorry you had those issues. I cannot replicate at the moment on Ubuntu 24.04 but will check back on this. I presume it is something simple going wrong with how I am getting the home directory in golang and checking if the path exists.

    Your feedback on ebook-convert is very valid. I can take a look at breaking it up. (Granted I am not sure how much of a lift that would be)
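For anyone curious what the missing-folder fix might look like, here is a minimal Go sketch, not the project's actual code. It assumes the output folder is "Audiobooks" under the home directory (as in the report above); the helper names `outputDir` and `ensureDir` are hypothetical. The idea is to create the folder up front with `os.MkdirAll` and surface any error instead of continuing silently.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// outputDir returns the folder audiobooks are written to for a given home
// directory. The "Audiobooks" name comes from the report above; the helper
// itself is hypothetical.
func outputDir(home string) string {
	return filepath.Join(home, "Audiobooks")
}

// ensureDir creates dir and any missing parents. It does nothing if dir
// already exists, and returns a wrapped error otherwise.
func ensureDir(dir string) error {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return fmt.Errorf("creating %s: %w", dir, err)
	}
	return nil
}

func main() {
	// A temporary directory stands in for the home directory so this example
	// leaves nothing behind; a real tool would call os.UserHomeDir() instead
	// and check its error the same way.
	home, err := os.MkdirTemp("", "fake-home-")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot create temp dir:", err)
		os.Exit(1)
	}
	defer os.RemoveAll(home)

	dir := outputDir(home)
	if err := ensureDir(dir); err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("output folder ready:", dir)
}
```

Calling `ensureDir` before writing any output means a permissions problem shows up as an error message rather than a "Done" line pointing at a file that was never written.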

    • andai a year ago

      The intention seems to have been to skip running ebook-convert if the input file is already a text file, but it runs it anyway. So I recompiled it to not do that.

      https://gist.github.com/avelican/8602b417e810f8dd4e31e8e3fbb...

      ...at which point I did some more digging and realized that (for my purposes anyway -- operating on txt files), QPA can simply be replaced with piper itself!

          cat book.txt | piper --model [model] --output_file book.wav
      
      (which I found kind of funny)

      Re: the ebook-convert dependency, I wonder if there are any feasible alternatives? My first thought was pandoc, which is ~140MB, but I guess that's smaller than Calibre's ~1400MB (!!!).
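The "skip conversion for text files" check described above could be sketched in Go roughly like this. This is an illustration under assumptions, not the code from the gist or the repo: `needsConversion` and `piperCommand` are hypothetical names, and `model.onnx` is a placeholder model path. The `--model` and `--output_file` flags are the ones from the piper command quoted above.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
	"strings"
)

// needsConversion reports whether the input has to go through a converter
// such as ebook-convert first. Plain-text files can be fed to the TTS step
// directly. The extension check is case-insensitive.
func needsConversion(path string) bool {
	return strings.ToLower(filepath.Ext(path)) != ".txt"
}

// piperCommand builds, but does not run, the piper invocation quoted in the
// comment above. The text to speak is expected on the command's stdin.
func piperCommand(model, outputFile string) *exec.Cmd {
	return exec.Command("piper", "--model", model, "--output_file", outputFile)
}

func main() {
	for _, in := range []string{"book.txt", "book.epub", "NOTES.TXT"} {
		fmt.Printf("%s: needs conversion = %v\n", in, needsConversion(in))
	}

	// "model.onnx" is a placeholder; substitute a real piper model path.
	cmd := piperCommand("model.onnx", "book.wav")
	fmt.Println(strings.Join(cmd.Args, " "))
}
```

To actually run it, the caller would open the text file, assign it to `cmd.Stdin`, and call `cmd.Run()`, which mirrors the shell pipe above.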

    • C-LoftusOP a year ago

      Issues should be fixed now in the latest release.

dewey a year ago

That's interesting, thanks for sharing. Does anyone know of a good solution for seamlessly switching between audiobooks and ebooks for books that are not bought from Amazon on Kindle?

In this case you already have the input file, and the audio output file but I guess there would be an app that takes these two files to provide a good reading experience. As they are based on the same source it should be possible to keep the reading progress matched between them.

  • babs42 a year ago

    Try out Storyteller, they're working on this exact problem: https://smoores.gitlab.io/storyteller/

    • dewey a year ago

      Very cool, thanks for sharing! I'll follow the project and hope there's some way to get this running on Kobo or other eInk readers in the future.

      • NoahKAndrews a year ago

        It's using an oft-ignored part of the ePub standard, so I think all that should be needed for Kobo support is implementation of that part of the standard in KOReader.

  • noch a year ago

    > Does anyone know of a good solution for seamlessly switching between audiobooks and ebooks for books that are not bought from Amazon on Kindle?

    Use Calibre's e-book viewer[^0] which uses Piper for text-to-speech.

    [^0]: https://manual.calibre-ebook.com/viewer.html#read-aloud

    • dewey a year ago

      Thanks, but clarification: I meant on iOS / mobile devices as I'm not reading on my computer. On second thought, it would be an amazing feature for https://prologue.audio, which is a beautiful app and works very well for audiobooks already.

      • SamBorick a year ago

        ReadEra Pro is an ereader app with a decent text-to-speech, I often flip between reading and listening.

senkora a year ago

I’ve really enjoyed moving most of my reading to TTS-generated audiobooks. I haven’t tried the newer AI voices but that certainly sounds like a step up!

falcolas a year ago

As a former audiobook narrator, may your cereal always be soggy and your socks too.

On a more serious note, this is a cool application of the technological advancement in AI voice models, and inevitable in today's society. It just really sucks to watch this race to the bottom actively put people out of work.

But hey, at least we can save a few bucks on an audiobook, right?

  • fsckboy a year ago

    >It just really sucks to watch this race to the bottom actively put people out of work

    the entire progress of civilization has depended on putting people out of work by increasing productivity and efficiency. Subsistence hunter-gatherers and subsistence farmers were put out of work by cheaper agriculture systems, and some of those unemployed realized they could support themselves by reading books to other people, a task they enjoyed much more.

    • falcolas a year ago

      > Subsistence hunter-gatherers and subsistence farmers were put out of work by cheaper agriculture systems, and some of those unemployed realized they could support themselves by reading books to other people,

      The replacement of hunter-gatherers by farming is a change that took centuries to take hold. Nobody lost their ability to feed their family because their ability to hunt and gather was automated away. Ironically, the move away from hunter/gatherer subsistence took free time away (for things like storytelling) instead of adding to it, in exchange for greater reliability in their sustenance.

      The loss of entire swaths of employment is a fairly new development. As is the lack of safety nets (US-centric for obvious reasons) for those who become injured or otherwise unable to sustain themselves.

    • mistrial9 a year ago

      this broad-brush take seems so persuasive.. for about one minute of thinking.. systems of humans are built for humans first.. which work of which humans are being replaced and why? Is anyone actually driving? If the modern answer is "money answers all questions" then, who makes money simply by moving money? Anyone who is not moving money right now is fair game because money is the only decider ?

      this superficial thinking is full of holes from the first examination, and actively harms others.. and is an excuse to ignore the statements of an audiobook narrator here.

    • zerotolerance a year ago

      The premise of this argument is false. Pre-agriculture people were food supply constrained. Nobody is audiobook or other entertainment supply constrained today. And worse, modern farm produce is effectively worth zero. In many cases farmers are paid NOT to produce specific goods. And those who do MUST produce at purely artificial levels as to require the use of unsustainable, patented, and specialized chemicals or GMOs to break even. This entire line of research leads to spam and waste.

      • zerotolerance a year ago

        I'll go further and say that audiobook production is not cost constrained unless the marketable value of the work is extremely low. What we get is cheap audiobooks for which there is no / low demand, and what it costs us is the decimation of the limited audiobook economy. That's happening at the same time as a billion new / fully generated works hit the market and overwhelm our ability to curate the supply and provide meaningful discovery. Again, more spam. Then AI spam to promote these valueless works. Awesome.

      • fsckboy a year ago

        >The premise of this argument is false. Pre-agriculture people were food supply constrained. Nobody is audiobook or other entertainment supply constrained today.

        your premise is not false, but your conclusions are. see "indifference curves" in econ 101, and Pareto optimality.

        We take consumer preferences as a given, because I don't know why you choose not to spend all of your money on the best and most pure essentials for life, but instead take some amount of your money and buy alcohol or skateboards or any of a number of other downright dangerous inessential things that you enjoy. You even pay money to GP to listen to his audio books when you could read them yourself and make money selling your own recordings. We don't know why you behave the way you do, but that's your choice. Given that you pay money for GP's audiobooks, if computer generated audiobooks drove the price down to zero, that would give you more money for alcohol; and it would give the rest of the economy a worker, GP, who could now participate in creating other products you'd probably want to buy (maybe dangerous jet-suits?) with the extra money you've saved on audiobooks.

        We don't need to figure out how everybody wants to spend their time or their money; people figure that out for themselves and markets emerge to accommodate them.

        We do need to figure out where the negative externalities lie, which you are attempting to do, but knowing what qualities mark them as externalities will help you effectuate change by working with the market instead of failing against the market.

  • prennert a year ago

    It really depends on what you use this for. For recreational use on novels, high-quality human-narrated audiobooks are surely still worth the money. Good podcasts and radio shows are overwhelmingly research, curation and writing combined with an engaging narration.

    This will only do narration, and the engagement is probably still not 100% there yet (sorry, can't try it right now).

    This kind of thing is very useful to consume high-level information on the side, while driving, cooking, gardening or doing exercise. So it can be useful to make previously curated and written content more accessible. Including content people have curated themselves, or got a bot to curate for them.

    For example, I listened to the entire FT weekend edition while cycling on the weekend, using their text-to-audio function. This allowed me to take in even parts of the paper I normally do not have time to read. Before the advent of the text-to-speech function, I would have had to choose between health and information. Now I can have both.

  • hyperG a year ago

    I consume many audiobooks, and I usually either love an audiobook narrator to the point that they are a value add in themselves, or I hate the way they speak and I literally can't read the book. The former is very rare though, and the latter much more common.

    The ability to change voices to one that suits a person's taste is hardly a race to the bottom. It is a HUGE value add.

    I am sure lamplighters were not happy about the light bulb either.

    c'est la vie.

    Breaking the Audible monopoly sounds like a nice side effect too.

  • C-LoftusOP a year ago

    For what it is worth, I still listen to and love human-read audiobooks! However, it is particularly useful to have an AI option for books that are too niche for there to be an incentive for an individual to narrate them. Lots of academic and personal texts fall into those categories.

  • evereverever a year ago

    Yeah, I'm painstakingly reading out a novel I wrote, and recording is hard. We had to scrap last night's recording because there was a hum at the place where we are recording, and even after turning off all the breakers we couldn't figure out where it was coming from. Turns out it was a bathroom fan that was hooked to a different breaker in the house.

    I'll be writing some music for the intros of chapters and some special sections for suspense.

    • falcolas a year ago

      Hums are the worst. And the quieter your space, the more hums you seem to hear. Good luck!

      There are a few good folks on YouTube who discuss some of the more nitty-gritty details, if you're interested.

OptionOfT a year ago

Tangentially related: I like to leave my phone at home when I go exercise, and just listen to books via my watch (21st century problems...)

But to this date I cannot use Apple's Books app on the watch to listen to audiobooks I have on mp3/mp4a/... It only works with audiobooks you have purchased in their walled garden.

biomcgary a year ago

Is Piper currently the best open source TTS model? I occasionally review open models to see if they match ElevenLabs and have been disappointed. However, Piper sounds better than it did the last time I listened.

tekkk a year ago

Very interesting. I have listened to an AI audiobook once, and although the inflection was somewhat jarring at first, you kinda got used to it. I suppose it's good enough for your own use. And, audiobook prices being what they are, a rather affordable option as well.

  • C-LoftusOP a year ago

    Yeah that is totally fair. In my experience, I feel that after a while your brain starts to tune out some of the inflection differences. Piper models are honestly pretty solid as well. I think in general, AI audiobook solutions like mine are better for non-fiction compared to fiction. Or at least that is what I read the most of.

alexkubica a year ago

A very cool project. You should build a website interface; you could easily charge for it, or take donations or advertise on it if you want to keep it free.

What would it take to add a specific language to piper? And do you know a good speech to text model?

cbluth a year ago

Very nice, I will give it a try! I looked at piper a bit; does this support multi-speaker models?

  • C-LoftusOP a year ago

    Thanks! At the moment, I don't think there are any public multispeaker models I am aware of. I could be wrong though!

richerram a year ago

This is awesome, it was pretty easy to set up and start using it.

I have just one question/note: I tried a book in Mexican Spanish and noticed that it fails to catch the accents on the words (the emphasis on words with tildes, where the marked syllable carries a strong accent). I think it is because of the .pdf parsing, since the Piper voice sample on their webpage handles it properly (with both available voices).

Do you have an idea of what could exactly be happening and how I can try to solve it?

Thank you very much for the tool again!!!

Update: Ohh ok, I just checked the repo Issues and found the one about Polish accents. I tried "--speak-diacritics" but got the same "Error: failed to read file passed as input to piper: read /tmp/ebook-convert-xxxxxxx.txt file already closed". If I skip the diacritics option it converts fine.

  • richerram a year ago

    Update 2: I went to look at the code and although I have never done anything with Go I was pleased with how easy it is to read plus your code was pretty well structured.

    I realized the removal of diacritics was happening at the function RemoveDiacritics inside lib/textProcessing.go on line 26 and modified the definition(?) to not modify special characters, compiled again and voila! It worked great.

    After that I used Calibre to convert a couple of .pdfs to .txt, and with a pretty simple Python script I got rid of page footnotes/headers/page_numbers and ended up with pretty decent audiobooks.

    Thanks again for the great tool!
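To illustrate the kind of change described above, here is a small Go sketch of diacritic handling with an opt-out. It is not the repo's `RemoveDiacritics` function: the name `normalizeText`, the `keepDiacritics` parameter, and the small `accentFold` table are all assumptions made for this example, and the table only covers a handful of precomposed Spanish letters.

```go
package main

import (
	"fmt"
	"strings"
)

// accentFold maps a few precomposed accented Spanish letters to plain ASCII.
// It is illustrative and deliberately incomplete.
var accentFold = map[rune]rune{
	'á': 'a', 'é': 'e', 'í': 'i', 'ó': 'o', 'ú': 'u', 'ü': 'u', 'ñ': 'n',
	'Á': 'A', 'É': 'E', 'Í': 'I', 'Ó': 'O', 'Ú': 'U', 'Ü': 'U', 'Ñ': 'N',
}

// normalizeText returns s unchanged when keepDiacritics is true, so the TTS
// model sees the accents. Otherwise it replaces every letter found in
// accentFold with its plain counterpart and leaves all other runes alone.
func normalizeText(s string, keepDiacritics bool) string {
	if keepDiacritics {
		return s
	}
	return strings.Map(func(r rune) rune {
		if plain, ok := accentFold[r]; ok {
			return plain
		}
		return r
	}, s)
}

func main() {
	line := "La canción terminó"
	fmt.Println(normalizeText(line, false)) // accents folded away
	fmt.Println(normalizeText(line, true))  // accents preserved for the voice model
}
```

Keeping the accents matters for Spanish because the written accent marks which syllable is stressed, so folding them away removes information the voice model needs.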

leshokunin a year ago

Would this de-DRM my Audible stuff?
