Ask HN: Cheaper conversational voice API (~10x cheaper than ElevenLabs)?
I was building a real-time AI voice chat app and couldn’t make ElevenLabs work economically: their per-minute prices are way too high. I needed a solution with:
* expressive, human-sounding voices
* low latency
* low cost
I couldn’t find anything that combined all three, so I reluctantly built and self-hosted my own conversational voice pipeline (STT → LLM → TTS). It sounds close to ElevenLabs but costs ~10× less per minute at scale. However, it’s difficult to maintain and requires serious GPU capacity, so it feels like total overkill for just my app.
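For context, one conversational turn in an STT → LLM → TTS pipeline is roughly three stages chained together. Here is a minimal Python sketch of that shape. It is not the actual implementation: `VoicePipeline` and the `fake_*` stages are hypothetical stand-ins for real speech-to-text, language, and text-to-speech models.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains three stages for a single conversational turn (illustrative only)."""
    stt: Callable[[bytes], str]   # audio in  -> transcript
    llm: Callable[[str], str]     # transcript -> reply text
    tts: Callable[[str], bytes]   # reply text -> audio out

    def respond(self, audio_in: bytes) -> bytes:
        transcript = self.stt(audio_in)
        reply = self.llm(transcript)
        return self.tts(reply)


# Hypothetical stub stages so the sketch runs without any models.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")      # pretend the audio bytes are the words


def fake_llm(text: str) -> str:
    return f"You said: {text}"        # pretend this is a model reply


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")       # pretend these bytes are synthesized audio


pipeline = VoicePipeline(stt=fake_stt, llm=fake_llm, tts=fake_tts)
audio_out = pipeline.respond(b"hello")
```

A low-latency version would typically stream partial results between stages rather than wait for each stage to finish, which is a large part of what makes this hard to run and maintain.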
I’m considering exposing this as a turn-key conversational voice API or embeddable widget.
Is this something others would want to use? I haven’t built a demo yet, but you can try my setup in my app:
iOS: https://apps.apple.com/us/app/echo-tavern/id6754861981
Desktop: https://echotavern.ai