EZThrottle — API Rate Limit Router

Early Access — 3 regions live

Like Cloudflare, but for the full API path — outbound calls and inbound traffic. Rate limits, retries, failover, and fair queuing. Handled.

OpenAI returns 429? → We queue and retry.
us-east-1 goes down? → We reroute to us-west.
Customer A floods your API? → Customer B doesn't notice.
50 agents share one API key? → One global queue. No contention.

No credit card required · Why we built this

The difference

Cloudflare guards the front door.
EZThrottle guards both.

Most infrastructure only thinks about inbound. But your outbound calls have the same problems — rate limits, failures, fairness. We handle both sides.

Inbound

Protect your API

You're the API provider. One customer hammers your endpoints. Others wait. EZThrottle gives every customer their own queue — noisy neighbors can't starve the rest.

✓ Per-API-key queuing
✓ Noisy neighbor isolation
✓ Tier-based prioritization
✓ DDoS protection via queue limits

Outbound

Protect your calls

You're calling Stripe, OpenAI, GitHub. 50 distributed workers share one API key. They all fight over the same rate limit. EZThrottle gives each user's API key its own dedicated queue — no fighting, no contention.

✓ 429 handling with backoff
✓ Multi-region failover
✓ One queue per user, per API key
✓ Retry with region racing

New — A Queue Per User

Every API key gets its own lane.

This is what fair queuing looks like for HTTP. The same principle that lets your Netflix stream while someone else is downloading — applied to API calls.

Before — Shared Queue

customer_a (1000 req) ──┐
customer_b (2 req)   ──┤──▶ [  SHARED QUEUE  ] ──▶ api.stripe.com
customer_c (5 req)   ──┘         ↑
                         customer_b waits hours
                         customer_c never runs

The noisy neighbor problem. One heavy customer fills the queue. Everyone else starves.

After — Queue Per User

customer_a ──▶ [Queue A] ──┐
customer_b ──▶ [Queue B] ──┤──▶ api.stripe.com
customer_c ──▶ [Queue C] ──┘

customer_b executes at 2 req/sec  ✓
customer_c executes at 5 req/sec  ✓
customer_a executes at their pace ✓

Fair queuing. Every customer runs at their own pace. One flood never affects another.
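
To make the before/after picture concrete, here is a minimal single-process sketch of round-robin fair queuing in Python. It illustrates the general technique only, not EZThrottle's implementation; the FairScheduler class and its method names are hypothetical.

```python
from collections import OrderedDict, deque


class FairScheduler:
    """Round-robin over per-customer FIFO queues (illustrative only)."""

    def __init__(self):
        # customer id -> deque of pending jobs, kept in rotation order
        self.queues = OrderedDict()

    def submit(self, customer, job):
        self.queues.setdefault(customer, deque()).append(job)

    def next_job(self):
        """Pop one job from the customer at the head of the rotation."""
        if not self.queues:
            return None
        customer, queue = next(iter(self.queues.items()))
        job = queue.popleft()
        del self.queues[customer]
        if queue:
            # Still has work: move this customer to the back of the rotation.
            self.queues[customer] = queue
        return customer, job


scheduler = FairScheduler()
for i in range(1000):
    scheduler.submit("customer_a", f"a-{i}")
for i in range(2):
    scheduler.submit("customer_b", f"b-{i}")
for i in range(5):
    scheduler.submit("customer_c", f"c-{i}")

# Dispatch order for the first nine jobs, by customer.
first_nine = [scheduler.next_job()[0] for _ in range(9)]
```

Even with 1000 jobs from customer_a submitted first, both of customer_b's jobs are dispatched within the first five turns.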

Why distributed agents will feel this pain

Today, your single app hitting a rate limit is annoying. Tomorrow, you have 50 distributed workers all sharing sk_live_abc123 to call Stripe. Each one thinks it has a 10 req/sec budget. It doesn't: all 50 share it. Race conditions, duplicate requests, and thundering herds follow.

EZThrottle creates one dedicated queue per user, per API key — globally across your cluster. Every worker for that user routes through it. The rate limit is respected. No contention. This is resource contention solved at the infrastructure layer.
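
As a rough illustration of why one shared queue helps, the sketch below routes 50 workers through a single pacer that books send slots 1/rps apart. It is a single-process Python approximation under stated assumptions: SharedPacer is a hypothetical name, and the real service coordinates across a cluster rather than inside one object.

```python
class SharedPacer:
    """One pacer per (user, API key): books send times spaced 1/rps apart."""

    def __init__(self, rps):
        self.interval = 1.0 / rps
        self.next_slot = 0.0

    def reserve(self, now):
        """Return the earliest time the caller may send, and book that slot."""
        slot = max(now, self.next_slot)
        self.next_slot = slot + self.interval
        return slot


# 50 workers all ask to send at t=0 against a shared 10 req/sec budget.
pacer = SharedPacer(rps=10)
slots = [pacer.reserve(now=0.0) for _worker in range(50)]

# Gaps between consecutive sends never drop below the interval,
# so the burst is spread over roughly five seconds instead of one instant.
gaps = [later - earlier for earlier, later in zip(slots, slots[1:])]
```

If each worker kept its own pacer instead, all 50 would send at t=0 and the upstream limit would be exceeded immediately.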

The bigger picture

EZThrottle is the network routing layer for HTTP.

A network router doesn't just forward packets. It manages flows, prioritizes traffic, routes around failures, and ensures no single connection starves the rest.

The internet started solving this for TCP/IP in the late 1980s: fair queuing, QoS, BGP failover. HTTP API calls have never had an equivalent shared layer. Every app builds its own retry logic, its own rate limiting, its own failover.

EZThrottle is that layer. Between your application and the APIs it depends on. The coordination infrastructure the next generation of the internet needs.

// Network routing layer for HTTP

Your App / Agent
      │
      ▼
┌─────────────────────────────┐
│       EZThrottle            │
│                             │
│  ┌──────────────────────┐   │
│  │  Per-key queues      │   │
│  │  Fair scheduling     │   │
│  │  Flow control        │   │
│  └──────────────────────┘   │
│                             │
│  ┌──────────────────────┐   │
│  │  BGP-style failover  │   │
│  │  Multi-region racing │   │
│  │  Health awareness    │   │
│  └──────────────────────┘   │
└─────────────────────────────┘
      │
      ▼
  The Internet
  (Stripe, OpenAI, GitHub...)

Network Router              EZThrottle

Per-flow queuing (QoS)      Per-API-key queuing
BGP route failover          Multi-region failover
Bandwidth fairness          Rate limit fairness
Traffic prioritization      Tier-based priority

Built for teams who can't afford API failures

Every API can fail. Your workflows shouldn't.

⚡

AI Agent Builders

Your agents call dozens of APIs per workflow. One 429 at step 8/10 crashes the entire sequence. Restart from scratch. Lose all context.

✓ 429 → queued, retried automatically

✓ API down → routed to another region

✓ 50 agents → 1 global queue, no fights

📊

Data Engineers

Your ETL processes millions of records. Manual rate limiting is slow and brittle. A 6-hour job fails at hour 4. You start over.

✓ Distributed rate limit coordination

✓ Resume from checkpoint on failure

✓ Regional failover for 500 errors

🛡️

API Builders

You're running an API. One enterprise customer hammers it. Your free-tier customers experience timeouts. You didn't build a DDoS target — but you have one.

✓ Per-customer queue isolation

✓ Reserved / paid / free tier priority

✓ Fair scheduling, no noisy neighbors
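
One way to picture reserved / paid / free tier priority is a priority queue keyed on tier rank. The Python sketch below is an assumption-laden illustration: the TierQueue class and the strict ordering are hypothetical, and a production scheduler would likely add aging or weights so free-tier jobs are not starved.

```python
import heapq
import itertools

# Lower rank is served first. The tier names come from the list above;
# the numeric ranks are an assumption made for this sketch.
TIER_RANK = {"reserved": 0, "paid": 1, "free": 2}


class TierQueue:
    """Strict tier priority, FIFO within a tier (illustrative only)."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps arrival order

    def push(self, tier, job):
        heapq.heappush(self._heap, (TIER_RANK[tier], next(self._seq), job))

    def pop(self):
        _rank, _seq, job = heapq.heappop(self._heap)
        return job

    def __len__(self):
        return len(self._heap)


tiers = TierQueue()
for tier, job in [("free", "free-1"), ("paid", "paid-1"),
                  ("reserved", "reserved-1"), ("free", "free-2"),
                  ("paid", "paid-2")]:
    tiers.push(tier, job)

served = [tiers.pop() for _ in range(len(tiers))]
```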

Architecture

How it works

BEAM/OTP for distributed coordination. Syn for global process registry. Fly.io for multi-region.

Job submitted

Your app POSTs to EZThrottle with target URL, tier, and optional webhook. We handle the rest.

Key detection → per-key routing

Has an API key? Routes to a dedicated AccountQueue — one per user, per API key, cluster-wide via Syn. Anonymous traffic uses consensus bidding.
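
A minimal Python sketch of the routing idea, assuming the key arrives as a bearer token in the Authorization header. That detection rule, the QueueRegistry class, and the single shared "anonymous" queue are all assumptions for illustration; the actual service registers queues cluster-wide via Syn and handles anonymous traffic differently.

```python
from collections import deque

ANONYMOUS = "anonymous"


def detect_key(headers):
    """Return the bearer token if present, else None (assumed detection rule)."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    token = token.strip()
    if scheme.lower() == "bearer" and token:
        return token
    return None


class QueueRegistry:
    """Maps (user, API key) to one lazily created queue."""

    def __init__(self):
        self._queues = {}

    def queue_for(self, user_id, headers):
        key = detect_key(headers)
        name = (user_id, key) if key else ANONYMOUS
        # setdefault creates the queue on first use and reuses it afterwards.
        return name, self._queues.setdefault(name, deque())


registry = QueueRegistry()
name_a, queue_a = registry.queue_for("user_1", {"Authorization": "Bearer sk_live_abc"})
name_b, queue_b = registry.queue_for("user_1", {"Authorization": "Bearer sk_live_abc"})
name_c, queue_c = registry.queue_for("user_1", {"Authorization": "Bearer sk_live_xyz"})
name_d, queue_d = registry.queue_for("user_2", {})
```

Two requests with the same user and key land in the same queue object; a different key or a missing key lands elsewhere.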

Rate limiting enforced

2 req/sec default, adaptive via X-EZTHROTTLE-RPS header. Rate limits respected globally — not per-machine.
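
The adaptive part can be sketched as a small function that starts from the 2 req/sec default and adopts whatever X-EZTHROTTLE-RPS value the upstream response carries. The function name and the handling of malformed values are assumptions for this Python sketch, not documented behavior.

```python
import math

DEFAULT_RPS = 2.0
RPS_HEADER = "x-ezthrottle-rps"


def updated_rps(response_headers, current_rps=DEFAULT_RPS):
    """Return the rate to use next, given the latest response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    raw = lowered.get(RPS_HEADER)
    if raw is None:
        return current_rps
    try:
        rps = float(raw)
    except ValueError:
        return current_rps  # ignore malformed values (assumed policy)
    if not math.isfinite(rps) or rps <= 0:
        return current_rps
    return rps


rate = updated_rps({})                              # no header: keep default
rate = updated_rps({"X-EZTHROTTLE-RPS": "10"}, rate)  # provider raises the limit
```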

Execute with failover

429 → backoff + retry. 500 → try another region. Timeout → race fallback URL. All automatic.
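
The decision table above can be sketched in a few lines of Python. The action names, the treatment of every 5xx status as a failover trigger, and the backoff parameters (exponential with full jitter, 0.5 s base, 30 s cap) are assumptions chosen for illustration.

```python
import random


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Exponential backoff with full jitter; attempt counts from 0."""
    return rng() * min(cap, base * (2 ** attempt))


def next_action(status, attempt, timed_out=False):
    """Map one outcome to (action, delay_seconds)."""
    if timed_out:
        return ("race_fallback", 0.0)      # timeout: race the fallback URL
    if status == 429:
        return ("retry_same_region", backoff_delay(attempt))
    if 500 <= status <= 599:
        return ("try_other_region", 0.0)   # server error: fail over
    return ("deliver", 0.0)                # anything else goes to the webhook


action_ok = next_action(200, attempt=0)
action_5xx = next_action(503, attempt=0)
action_timeout = next_action(None, attempt=0, timed_out=True)
action_429, delay_429 = next_action(429, attempt=3)
```

Jitter spreads retries out so that many callers backing off from the same 429 do not all return at the same moment.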

Webhook delivery

Result delivered to your webhook. Multi-region racing, quorum support, on_success workflow chaining.

Why BEAM?

// Each queue is a BEAM process
// Each job is a lightweight process
// Syn = distributed process registry
// 1M processes, microsecond scheduling

AccountQueue["sk_live_abc"] ← syn.whereis()
AccountQueue["sk_live_xyz"] ← syn.whereis()
AccountQueue["sk_live_..."] ← syn.whereis()

// One queue per user, per key, cluster-wide
// Machine crashes → syn detects
// Next request → queue respawns
// Zero coordination code needed

WhatsApp serves 2 billion users on BEAM. Discord has publicly described scaling Elixir on BEAM to millions of concurrent users. Ericsson began developing Erlang, the language BEAM runs, for telecom in 1986. Four decades of distributed systems expertise in our runtime.

For API Providers

Signal your limits.
We enforce them.

Add two headers to your API responses. EZThrottle reads them and adjusts in real time — no config changes, no dashboard, no coordination. Your API tells the network how fast to go.

Combined with per-user queues, X-EZTHROTTLE-MAX-CONCURRENT becomes a true global limit, enforced per user, per key, across your entire customer base.

Read the full provider guide →

# Your API response headers
# EZThrottle reads these automatically

X-EZTHROTTLE-RPS: 10
  → 10 req/sec per user, per key
  → adjusts in real time

X-EZTHROTTLE-MAX-CONCURRENT: 5
  → exactly 5 in-flight, globally
  → enforced per user, per key
  → not per machine — per user
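
For a sense of what enforcing X-EZTHROTTLE-MAX-CONCURRENT involves, here is a single-threaded Python sketch of an in-flight gate that adopts the limit from each response. InFlightGate is a hypothetical name, and the sketch ignores the cross-machine coordination the real service provides.

```python
MAX_CONCURRENT_HEADER = "X-EZTHROTTLE-MAX-CONCURRENT"


class InFlightGate:
    """Caps in-flight requests for one (user, key); not thread-safe."""

    def __init__(self, limit):
        self.limit = limit
        self.in_flight = 0

    def try_start(self):
        """Claim a slot if one is free; otherwise the job stays queued."""
        if self.in_flight >= self.limit:
            return False
        self.in_flight += 1
        return True

    def finish(self, response_headers):
        """Release a slot and adopt a new limit if the response signals one."""
        self.in_flight -= 1
        raw = response_headers.get(MAX_CONCURRENT_HEADER)
        if raw is not None and raw.isdigit() and int(raw) > 0:
            self.limit = int(raw)


gate = InFlightGate(limit=5)
started = [gate.try_start() for _ in range(7)]   # only the first 5 succeed

# One response comes back and lowers the limit to 2.
gate.finish({MAX_CONCURRENT_HEADER: "2"})
blocked = gate.try_start()                       # 4 in flight, limit 2
```

Lowering the limit does not cancel requests already in flight; new work simply waits until the count drops below the new ceiling.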

Official SDKs

Python, Node.js, and Go. Integrate in 30 minutes.

🐍 Python

$ pip install ezthrottle

from ezthrottle import EZThrottle

client = EZThrottle("your_key")
resp = client.queue_request(
    url="https://api.openai.com/...",
    webhook_url="https://app.com/hook"
)

View on PyPI →

📦 Node.js

$ npm install ezthrottle

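const { EZThrottle } = require('ezthrottle')

const client = new EZThrottle('key')

// await needs an async function when using require()
async function main() {
  const r = await client.queueRequest({
    url: 'https://api.stripe.com/...',
    webhookUrl: 'https://app.com/hook'
  })
}

main().catch(console.error)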

View on npm →

🔷 Go

$ go get github.com/rjpruitt16/ezthrottle-go

client := ez.NewClient("key")
resp, _ := client.QueueRequest(
  &ez.QueueRequest{
    URL:        "https://api.github.com",
    WebhookURL: "https://app.com/hook",
  },
)

View on GitHub →

Simple, transparent pricing

Start free. Scale as you grow.

Free

Get started

$0/mo

✓ 1M requests/mo
✓ All SDKs
✓ Webhook delivery
— Queue per user

Get Started

Starter

Small projects

$50/mo

✓ 2M requests/mo
✓ All SDKs
✓ Priority routing
✓ Queue per user

Get Started

Growth

Growing teams

$200/mo

✓ 5M requests/mo
✓ Priority routing
✓ Dashboard
✓ Queue per user

Get Started

Pro

Production scale

$499/mo

✓ 10M requests/mo
✓ Priority routing
✓ Advanced dashboard
✓ Queue per user

Get Started

Enterprise

Custom needs

Custom

✓ Unlimited requests
✓ Dedicated machines
✓ Custom SLAs
✓ Priority support

Contact Sales

Queue per user available on all paid tiers. Free tier uses shared queue. · Overage: $0.0005/request over quota.
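
To estimate a bill, the overage rate can be applied to whatever exceeds the plan quota. The Python sketch below uses the paid-plan figures from the table above and assumes overage is added linearly on top of the base price; the function name is hypothetical, and the Free tier is left out because the page does not say whether it is billed or capped.

```python
from decimal import Decimal

OVERAGE_PER_REQUEST = Decimal("0.0005")

# plan -> (base price in USD per month, included requests per month)
PLANS = {
    "Starter": (Decimal("50"), 2_000_000),
    "Growth": (Decimal("200"), 5_000_000),
    "Pro": (Decimal("499"), 10_000_000),
}


def monthly_cost(plan, requests):
    """Base price plus overage for requests beyond the quota."""
    base, quota = PLANS[plan]
    overage_requests = max(0, requests - quota)
    return base + overage_requests * OVERAGE_PER_REQUEST


# Starter with 2.5M requests: 500,000 over quota at $0.0005 each is $250,
# so the month comes to $50 + $250 = $300.
starter_bill = monthly_cost("Starter", 2_500_000)
growth_bill = monthly_cost("Growth", 2_500_000)
```

Under these assumptions, the same 2.5M requests cost $200 on Growth, so it is worth comparing plans once usage regularly exceeds a quota.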

The coordination layer
your APIs are missing.

Start free. Integrate in 30 minutes. By the time your distributed agents start fighting over rate limits, you'll already have the infrastructure to handle it.

No credit card required.

Questions? support@ezthrottle.network
