Ask HN: Is there any tool that can stop LLM calls at runtime (not just monitor)?
I’ve been running into cases where LLM/agent systems make unexpected or repeated calls and costs spike quickly.
Most tools I’ve found focus on observability (logs, traces, dashboards), but not actual enforcement.
Is there anything that can:
- stop or cut off a call mid-execution (based on budget, tokens, or conditions)?
- enforce limits at runtime instead of just alerting after the fact?
Curious whether people here solve this with a dedicated tool, or just handle it in application code.
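To make the ask concrete, here is a minimal sketch of the kind of runtime enforcement I mean, as opposed to alerting. Everything in it is hypothetical and for illustration only: `TokenBudget`, `BudgetExceeded`, `enforce`, and `fake_llm_stream` are made-up names, the token count is a crude whitespace split rather than a real tokenizer, and the stream is a stand-in for a provider's streaming response.

```python
from typing import Iterable, Iterator


class BudgetExceeded(Exception):
    """Raised when the next chunk would push usage past the budget."""


class TokenBudget:
    """Hypothetical shared budget; one instance could cover many calls."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, n: int) -> None:
        # Refuse the charge before recording it, so usage never exceeds the cap.
        if self.used + n > self.max_tokens:
            raise BudgetExceeded(
                f"budget of {self.max_tokens} tokens exceeded "
                f"(used {self.used}, requested {n})"
            )
        self.used += n


def enforce(stream: Iterable[str], budget: TokenBudget) -> Iterator[str]:
    """Pass chunks through until the budget runs out, then abort mid-stream."""
    for chunk in stream:
        # Assumption: one whitespace-separated word counts as one token.
        budget.charge(len(chunk.split()))
        yield chunk


def fake_llm_stream() -> Iterator[str]:
    """Stand-in for a streaming LLM response that would otherwise run long."""
    for i in range(1000):
        yield f"word{i} "


if __name__ == "__main__":
    budget = TokenBudget(max_tokens=5)
    received = []
    try:
        for chunk in enforce(fake_llm_stream(), budget):
            received.append(chunk)
    except BudgetExceeded as exc:
        print(f"cut off after {len(received)} chunks: {exc}")
```

This only stops consuming the stream on the client side. Whether the provider also stops generating (and billing) presumably depends on the underlying connection actually being closed, which is part of why I am asking whether a tool handles this properly.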