Butter is a cache that identifies patterns in LLM responses and saves you money by serving responses directly.
It's also deterministic, allowing your AI systems to consistently repeat past behaviors.
It's live — try it out here.
Chat Completions Compatible
Butter is a Chat Completions API endpoint, making it easy to drop right into favorite tools like LangChain, Mastra, Crew AI, Pydantic AI, AI Suite, Helicone, LiteLLM, Martian, Browser Use, DSPy, and more.
from openai import OpenAI
# Repoint your client
client = OpenAI(
base_url="https://proxy.butter.dev/v1",
)
# Requests now route through Butter
response = client.chat.completions.create()