Some Latency Metrics for Voice UIs

writingisthinkng.substack.com

2 points by fatruchir a month ago · 1 comment

fatruchir (OP) a month ago

Continuing my journey to get my hands dirty with voice UIs, I wrote down some user-perceived latency metrics I saw while building VUIs.

Key points:
- I used the 'pipeline' approach of STT + LLM + TTS (as opposed to the S2S approach, e.g. gpt-realtime)
- This approach (with my specific setup) yielded latency far greater than the 500ms target at which conversations feel "natural" and there aren't any awkward silences
- With gpt-5-mini as the LLM I saw latency at ~1.4s, and with Llama 3.1-8b on Cerebras I saw 1.1s
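To put those numbers next to the 500ms target, here is a rough Python sketch. The two totals are the ones quoted above; `measure_turn` and the stage callables are hypothetical stand-ins for illustration, not the setup used for the measurements, and they assume the stages run strictly one after another.

```python
import time

TARGET_MS = 500  # "natural" conversation target mentioned above

# End-to-end latencies reported above (approximate, in milliseconds).
REPORTED_MS = {
    "gpt-5-mini": 1400,
    "Llama 3.1-8b on Cerebras": 1100,
}


def over_target(latency_ms, target_ms=TARGET_MS):
    """Return (milliseconds over the target, ratio of latency to target)."""
    return latency_ms - target_ms, latency_ms / target_ms


def measure_turn(stages, payload):
    """Run hypothetical pipeline stages in order and time each one.

    `stages` is a list of (name, callable) pairs, e.g. STT, LLM, TTS.
    Returns the final payload and a dict of per-stage timings in ms,
    plus a "total" entry that is the sum of the stage timings.
    """
    timings = {}
    for name, fn in stages:
        start = time.perf_counter()
        payload = fn(payload)
        timings[name] = (time.perf_counter() - start) * 1000
    timings["total"] = sum(timings.values())
    return payload, timings


if __name__ == "__main__":
    for name, ms in REPORTED_MS.items():
        excess, ratio = over_target(ms)
        print(f"{name}: {ms} ms, {excess} ms over target ({ratio:.1f}x)")
```

With the reported totals, gpt-5-mini comes out 900 ms over the target (2.8x) and Llama 3.1-8b on Cerebras 600 ms over (2.2x). Passing real STT, LLM, and TTS calls to `measure_turn` would show which stage dominates the total.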
