Faster inference won't save you

5 points by ramstar3000 25 days ago · 4 comments

The latency table says it all. Cloud-to-cloud is 40 ms for 20 turns. Hotel Wi-Fi is 16 seconds for the same 20 turns. You can halve inference time and still have a broken product on bad connections.
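A rough back-of-the-envelope sketch of that arithmetic, in Python. It assumes the two figures quoted above are network-only totals for 20 turns, and it uses a made-up inference time of 0.5 s per turn (`INFERENCE_S` is hypothetical, not a number from the table).

```python
# Sketch only: network totals are the figures quoted in the comment,
# the per-turn inference time is an assumed placeholder.
TURNS = 20
CLOUD_NETWORK_S = 0.040   # 40 ms total for 20 turns (from the comment)
HOTEL_NETWORK_S = 16.0    # 16 s total for 20 turns (from the comment)
INFERENCE_S = 0.5         # hypothetical inference time per turn


def total_seconds(network_total_s, inference_per_turn_s, turns=TURNS):
    """Total wall-clock time: network time plus inference time over all turns."""
    return network_total_s + turns * inference_per_turn_s


if __name__ == "__main__":
    for label, net in [("cloud-to-cloud", CLOUD_NETWORK_S),
                       ("hotel Wi-Fi", HOTEL_NETWORK_S)]:
        before = total_seconds(net, INFERENCE_S)
        after = total_seconds(net, INFERENCE_S / 2)  # inference twice as fast
        print(f"{label}: {before:.2f} s -> {after:.2f} s "
              f"(network share after: {net / after:.0%})")
```

Under these assumed numbers, halving inference takes the hotel Wi-Fi case from 26 s to 21 s, and 16 s of what remains is still network time that faster inference cannot touch.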

Var1377 25 days ago

is this an LLM?

Var1377 25 days ago

does this mean you can disconnect from the internet entirely with the agent loop still running?

ramstar3000 (OP) 25 days ago

yes this is central to our thesis :)
