Stop using JSON for LLM structured output
nehmeailabs.com
For simple extraction tasks, a delimiter-separated string uses 11 tokens vs 35 for JSON. Output tokens are the latency bottleneck.
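
As a rough illustration of the idea, the sketch below parses a delimiter-separated model response into the same record that an equivalent JSON response would produce. The field names, the `|` delimiter, and the sample values are assumptions made for this example and do not come from the linked article. The schema lives in the parsing code rather than in the model output, which is where the output savings would come from.

```python
import json

# Hypothetical schema: the field order is fixed in the prompt, so the model
# only has to emit the values, not the keys, quotes, or braces.
FIELDS = ("name", "city", "age")
DELIM = "|"


def parse_delimited(text: str) -> dict[str, str]:
    """Split a delimiter-separated response into a dict keyed by FIELDS."""
    parts = [part.strip() for part in text.strip().split(DELIM)]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


# Two encodings of the same (made-up) extraction result.
delimited_output = "Ada Lovelace|London|36"
json_output = '{"name": "Ada Lovelace", "city": "London", "age": "36"}'

record = parse_delimited(delimited_output)

# Both encodings decode to the same record.
assert record == json.loads(json_output)

# Character count is only a crude proxy for token count; the real numbers
# depend on the tokenizer in use.
assert len(delimited_output) < len(json_output)
```

This sketch assumes no field value contains the delimiter. If values can contain `|`, a different delimiter or an escaping rule would be needed, and at that point the simplicity advantage over JSON shrinks.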