A Cache For Your LLM - NFHN Reader

Butter is a cache that identifies patterns in LLM responses and saves you money by serving responses directly.

It's also deterministic, allowing your AI systems to consistently repeat past behaviors.

Chat Completions Compatible

Butter is a Chat Completions API endpoint, making it easy to drop right into favorite tools like LangChain, Mastra, Crew AI, Pydantic AI, AI Suite, Helicone, LiteLLM, Martian, Browser Use, DSPy, and more.


from openai import OpenAI

# Repoint your client
client = OpenAI(
    base_url="https://proxy.butter.dev/v1", 
)

# Requests now route through Butter
response = client.chat.completions.create()