Faster inference won't save you
graphcoder.aiThe latency table says it all. Cloud-to-cloud is 40ms for 20 turns. Hotel Wi-Fi is 16 seconds. You can halve inference time and still have a broken product on bad connections.
is this an LLM?
does this mean you can disconnect from the internet entirely with the agent loop still running?
yes this is central to our thesis :)