Rate limiter for LLMs outperforms exponential backoff
rateLLMiter (github.com) is a Python rate limiter that smooths out requests to LLM APIs to get faster, more consistent performance. It uses a ticket bucket algorithm rather than the usual exponential backoff.
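
For readers unfamiliar with the idea, here is a minimal sketch of a generic token-bucket-style limiter, the textbook algorithm that the "ticket bucket" description resembles. It is not rateLLMiter's code or API: the `TokenBucket` class, its `try_acquire`/`acquire` methods, and the `rate`, `capacity`, and `clock` parameters are all hypothetical names chosen for illustration.

```python
import threading
import time


class TokenBucket:
    """Illustrative bucket: holds up to `capacity` tokens, refilled at `rate` per second.

    Hypothetical sketch, not the rateLLMiter implementation.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self._clock = clock              # injectable so the bucket can be tested without sleeping
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()
        self._lock = threading.Lock()

    def _refill(self):
        # Credit tokens for the time elapsed since the last refill, capped at capacity.
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def try_acquire(self, tokens=1.0):
        """Take `tokens` if available and return 0.0; otherwise return the seconds to wait."""
        with self._lock:
            self._refill()
            if self._tokens >= tokens:
                self._tokens -= tokens
                return 0.0
            return (tokens - self._tokens) / self.rate

    def acquire(self, tokens=1.0):
        """Block until `tokens` can be taken from the bucket."""
        if tokens > self.capacity:
            raise ValueError("requested more tokens than the bucket can ever hold")
        while True:
            wait = self.try_acquire(tokens)
            if wait == 0.0:
                return
            time.sleep(wait)
```

A caller would invoke `bucket.acquire()` before each API request. The contrast with exponential backoff is that the bucket paces requests before they are sent, whereas backoff only reacts after a request has already been rejected.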