Early Access — 3 regions live
Like Cloudflare, but for the full API path — outbound calls and inbound traffic. Rate limits, retries, failover, and fair queuing. Handled.
OpenAI returns 429? → We queue and retry.
us-east-1 goes down? → We reroute to us-west.
Customer A floods your API? → Customer B doesn't notice.
50 agents share one API key? → One global queue. No contention.
No credit card required · Why we built this
The difference
Cloudflare guards the front door.
EZThrottle guards both.
Most infrastructure only thinks about inbound. But your outbound calls have the same problems — rate limits, failures, fairness. We handle both sides.
Inbound
Protect your API
You're the API provider. One customer hammers your endpoints. Others wait. EZThrottle gives every customer their own queue — noisy neighbors can't starve the rest.
- ✓ Per-API-key queuing
- ✓ Noisy neighbor isolation
- ✓ Tier-based prioritization
- ✓ DDoS protection via queue limits
Outbound
Protect your calls
You're calling Stripe, OpenAI, GitHub. 50 distributed workers share one API key. They all fight over the same rate limit. EZThrottle gives each user's API key its own dedicated queue — no fighting, no contention.
- ✓ 429 handling with backoff
- ✓ Multi-region failover
- ✓ One queue per user, per API key
- ✓ Retry with region racing
New — A Queue Per User
Every API key gets its own lane.
This is what fair queuing looks like for HTTP. The same principle that lets your Netflix stream while someone else is downloading — applied to API calls.
Before — Shared Queue
customer_a (1000 req) ──┐
customer_b (2 req) ──┤──▶ [ SHARED QUEUE ] ──▶ api.stripe.com
customer_c (5 req) ──┘ ↑
customer_b waits hours
customer_c never runs
The noisy neighbor problem. One heavy customer fills the queue. Everyone else starves.
After — Queue Per User
customer_a ──▶ [Queue A] ──┐
customer_b ──▶ [Queue B] ──┤──▶ api.stripe.com
customer_c ──▶ [Queue C] ──┘
customer_b executes at 2 req/sec ✓
customer_c executes at 5 req/sec ✓
customer_a executes at their pace ✓
Fair queuing. Every customer runs at their own pace. One flood never affects another.
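To make the before/after concrete, here is a minimal sketch that replays the diagram's workload (1000, 2, and 5 requests) through a shared FIFO and through per-customer queues. It assumes plain round-robin between queues as a stand-in scheduler; EZThrottle's actual scheduling policy is not described here, and every function name below is hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Iterator, Tuple


def shared_queue_order(jobs: Dict[str, int]) -> Iterator[Tuple[str, int]]:
    """One shared FIFO: each customer's jobs are served in arrival order."""
    for customer, count in jobs.items():
        for index in range(count):
            yield customer, index


def per_customer_order(jobs: Dict[str, int]) -> Iterator[Tuple[str, int]]:
    """One queue per customer, served one job per customer per round.

    Round-robin is an assumption used for illustration only.
    """
    queues: Dict[str, Deque[int]] = {c: deque(range(n)) for c, n in jobs.items()}
    while queues:
        for customer in list(queues):  # copy, so we can delete while looping
            yield customer, queues[customer].popleft()
            if not queues[customer]:
                del queues[customer]


def finish_position(order: Iterator[Tuple[str, int]], customer: str, count: int) -> int:
    """How many jobs have been served when `customer`'s last job completes."""
    for position, (name, index) in enumerate(order, start=1):
        if name == customer and index == count - 1:
            return position
    raise ValueError(f"{customer} never finished")


jobs = {"customer_a": 1000, "customer_b": 2, "customer_c": 5}
shared_b = finish_position(shared_queue_order(jobs), "customer_b", 2)
fair_b = finish_position(per_customer_order(jobs), "customer_b", 2)
print(shared_b, fair_b)  # 1002 5
```

In the shared queue, customer_b's second request is the 1002nd job served. With one queue per customer it is the 5th.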
Why distributed agents will feel this pain
Today, your single app hitting a rate limit is annoying. Tomorrow, you have 50 distributed workers all sharing sk_live_abc123 to call Stripe. They each think they have a 10 req/sec budget. They don't. They share it. Race conditions, duplicate requests, and thundering herds follow.
EZThrottle creates one dedicated queue per user, per API key — globally across your cluster. Every worker for that user routes through it. The rate limit is respected. No contention. This is resource contention solved at the infrastructure layer.
The bigger picture
EZThrottle is the network routing layer for HTTP.
A network router doesn't just forward packets. It manages flows, prioritizes traffic, routes around failures, and ensures no single connection starves the rest.
The internet figured this out for TCP/IP starting in the late 1980s: fair queuing, QoS, BGP failover. HTTP API calls have never had this. Every app builds its own retry logic, its own rate limiting, its own failover.
EZThrottle is that layer. Between your application and the APIs it depends on. The coordination infrastructure the next generation of the internet needs.
// Network routing layer for HTTP

       Your App / Agent
              │
              ▼
┌─────────────────────────────┐
│         EZThrottle          │
│                             │
│  ┌──────────────────────┐   │
│  │ Per-key queues       │   │
│  │ Fair scheduling      │   │
│  │ Flow control         │   │
│  └──────────────────────┘   │
│                             │
│  ┌──────────────────────┐   │
│  │ BGP-style failover   │   │
│  │ Multi-region racing  │   │
│  │ Health awareness     │   │
│  └──────────────────────┘   │
└─────────────────────────────┘
              │
              ▼
The Internet (Stripe, OpenAI, GitHub...)
Network Router
- Per-flow queuing (QoS)
- BGP route failover
- Bandwidth fairness
- Traffic prioritization
EZThrottle
- Per-API-key queuing
- Multi-region failover
- Rate limit fairness
- Tier-based priority
Built for teams who can't afford API failures
Every API can fail. Your workflows shouldn't.
⚡
AI Agent Builders
Your agents call dozens of APIs per workflow. One 429 at step 8/10 crashes the entire sequence. Restart from scratch. Lose all context.
✓ 429 → queued, retried automatically
✓ API down → routed to another region
✓ 50 agents → 1 global queue, no fights
📊
Data Engineers
Your ETL processes millions of records. Manual rate limiting is slow and brittle. A 6-hour job fails at hour 4. You start over.
✓ Distributed rate limit coordination
✓ Resume from checkpoint on failure
✓ Regional failover for 500 errors
🛡️
API Builders
You're running an API. One enterprise customer hammers it. Your free-tier customers experience timeouts. You didn't build a DDoS target — but you have one.
✓ Per-customer queue isolation
✓ Reserved / paid / free tier priority
✓ Fair scheduling, no noisy neighbors
Architecture
How it works
BEAM/OTP for distributed coordination. Syn for global process registry. Fly.io for multi-region.
1. Job submitted
Your app POSTs to EZThrottle with target URL, tier, and optional webhook. We handle the rest.
2. Key detection → per-key routing
Has an API key? Routes to a dedicated AccountQueue — one per user, per API key, cluster-wide via Syn. Anonymous traffic uses consensus bidding.
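The routing step can be pictured as a get-or-create lookup keyed by user and API key. The sketch below is a single-process Python stand-in for what the text describes as a cluster-wide Syn registry. `QueueRegistry`, `route`, and the plain shared queue for anonymous traffic are all assumptions made for illustration; consensus bidding is not modelled.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Optional, Tuple

QueueKey = Tuple[str, str]  # (user_id, api_key)


@dataclass
class AccountQueue:
    key: QueueKey
    jobs: Deque[str] = field(default_factory=deque)


class QueueRegistry:
    """In-memory stand-in for a global process registry (hypothetical)."""

    def __init__(self) -> None:
        self._queues: Dict[QueueKey, AccountQueue] = {}
        # Placeholder only: the real anonymous path is not shown here.
        self.anonymous: Deque[str] = deque()

    def route(self, user_id: str, api_key: Optional[str], job_url: str) -> Optional[AccountQueue]:
        """Enqueue a job; keyed traffic gets its own queue, created on first use."""
        if api_key is None:
            self.anonymous.append(job_url)
            return None
        key = (user_id, api_key)
        queue = self._queues.get(key)
        if queue is None:
            queue = AccountQueue(key)
            self._queues[key] = queue
        queue.jobs.append(job_url)
        return queue
```

Two jobs from the same user and key land in the same queue object. A different user with the same key gets a separate one.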
3. Rate limiting enforced
2 req/sec default, adaptive via X-EZTHROTTLE-RPS header. Rate limits respected globally — not per-machine.
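One simple way to enforce a per-queue rate is to space requests at least 1/rps seconds apart. The sketch below shows that pacing idea with the 2 req/sec default and a `set_rps` hook for adaptive updates. It is an illustration under those assumptions, not EZThrottle's limiter, and the class name is hypothetical.

```python
class PacedLimiter:
    """Spaces requests at least 1/rps seconds apart for a single queue."""

    def __init__(self, rps: float = 2.0) -> None:
        self._next_slot = 0.0
        self.set_rps(rps)

    def set_rps(self, rps: float) -> None:
        """Change the pace, e.g. after an upstream response advertises a new rate."""
        if rps <= 0:
            raise ValueError("rps must be positive")
        self._interval = 1.0 / rps

    def reserve(self, now: float) -> float:
        """Return the earliest time (in seconds) the next request may be sent."""
        start = max(now, self._next_slot)
        self._next_slot = start + self._interval
        return start


limiter = PacedLimiter()  # 2 req/sec default
slots = [limiter.reserve(now=0.0) for _ in range(3)]
print(slots)  # [0.0, 0.5, 1.0]
```

Because every worker for a queue reserves its slot from the same limiter, the budget is shared once rather than multiplied per machine.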
4. Execute with failover
429 → backoff + retry. 500 → try another region. Timeout → race fallback URL. All automatic.
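The three rules above fit in a small decision function. The sketch below is one possible reading: exponential backoff for 429 (the base and cap values are assumptions), all 5xx responses treated like the 500 case (also an assumption), and a timeout represented as `None`. `Action` and `next_action` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    kind: str  # "deliver", "retry", "reroute", or "race_fallback"
    delay_seconds: float = 0.0


def next_action(
    status: Optional[int],
    attempt: int,
    base_delay: float = 0.5,   # assumed value
    max_delay: float = 30.0,   # assumed value
) -> Action:
    """Decide what to do after an upstream call. `status=None` means timeout."""
    if status is None:
        return Action("race_fallback")          # timeout: race the fallback URL
    if status == 429:
        delay = min(max_delay, base_delay * 2 ** attempt)
        return Action("retry", delay)           # rate limited: back off, retry
    if 500 <= status <= 599:
        return Action("reroute")                # server error: try another region
    return Action("deliver")                    # anything else goes to the webhook


print(next_action(429, attempt=3))  # Action(kind='retry', delay_seconds=4.0)
```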
5. Webhook delivery
Result delivered to your webhook. Multi-region racing, quorum support, on_success workflow chaining.
Why BEAM?
// Each queue is a BEAM process
// Each job is a lightweight process
// Syn = distributed process registry
// 1M processes, microsecond scheduling

AccountQueue["sk_live_abc"] ← syn.whereis()
AccountQueue["sk_live_xyz"] ← syn.whereis()
AccountQueue["sk_live_..."] ← syn.whereis()

// One queue per user, per key, cluster-wide
// Machine crashes → syn detects
// Next request → queue respawns
// Zero coordination code needed
WhatsApp serves 2 billion users on BEAM. Discord runs its real-time infrastructure on it for more than 150M monthly active users. Ericsson created Erlang, the language behind BEAM, for telecom in 1986. That is roughly 40 years of distributed systems expertise in our runtime.
For API Providers
Signal your limits.
We enforce them.
Add two headers to your API responses. EZThrottle reads them and adjusts in real time — no config changes, no dashboard, no coordination. Your API tells the network how fast to go.
Combined with per-user queues, MAX-CONCURRENT becomes a true global limit — enforced per user, per key, across your entire customer base.
# Your API response headers
# EZThrottle reads these automatically

X-EZTHROTTLE-RPS: 10
→ 10 req/sec per user, per key
→ adjusts in real time

X-EZTHROTTLE-MAX-CONCURRENT: 5
→ exactly 5 in-flight, globally
→ enforced per user, per key
→ per user, not per machine
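For a sense of what reading these hints involves, here is a minimal parsing sketch. The two header names come from the text above. The `ThrottleHints` type, the case-insensitive lookup, and the choice to ignore missing, malformed, or non-positive values are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, TypeVar

N = TypeVar("N", int, float)


@dataclass(frozen=True)
class ThrottleHints:
    rps: Optional[float]
    max_concurrent: Optional[int]


def parse_hints(headers: Mapping[str, str]) -> ThrottleHints:
    """Extract throttle hints from response headers, ignoring bad values."""
    lowered = {name.lower(): value.strip() for name, value in headers.items()}

    def positive(name: str, cast: Callable[[str], N]) -> Optional[N]:
        raw = lowered.get(name)
        if raw is None:
            return None
        try:
            value = cast(raw)
        except ValueError:
            return None  # malformed hint: keep the current setting
        return value if value > 0 else None

    return ThrottleHints(
        rps=positive("x-ezthrottle-rps", float),
        max_concurrent=positive("x-ezthrottle-max-concurrent", int),
    )


hints = parse_hints({"X-EZTHROTTLE-RPS": "10", "X-EZTHROTTLE-MAX-CONCURRENT": "5"})
print(hints)  # ThrottleHints(rps=10.0, max_concurrent=5)
```

A `None` field means "no usable hint", so a caller would keep whatever limit it was already applying.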
Official SDKs
Python, Node.js, and Go. Integrate in 30 minutes.
🐍 Python
$ pip install ezthrottle
from ezthrottle import EZThrottle

client = EZThrottle("your_key")
resp = client.queue_request(
    url="https://api.openai.com/...",
    webhook_url="https://app.com/hook",
)
📦 Node.js
$ npm install ezthrottle
const { EZThrottle } = require('ezthrottle')
const client = new EZThrottle('key')
// Run inside an async function: CommonJS has no top-level await
const r = await client.queueRequest({
  url: 'https://api.stripe.com/...',
  webhookUrl: 'https://app.com/hook'
})
🔷 Go
$ go get github.com/rjpruitt16/ezthrottle-go
// import ez "github.com/rjpruitt16/ezthrottle-go"
client := ez.NewClient("key")
resp, _ := client.QueueRequest(
    &ez.QueueRequest{
        URL:        "https://api.github.com",
        WebhookURL: "https://app.com/hook",
    },
)
Simple, transparent pricing
Start free. Scale as you grow.
Starter
Small projects
$50/mo
- ✓ 2M requests/mo
- ✓ All SDKs
- ✓ Priority routing
- ✓ Queue per user
Growth
Growing teams
$200/mo
- ✓ 5M requests/mo
- ✓ Priority routing
- ✓ Dashboard
- ✓ Queue per user
Most Popular
Pro
Production scale
$499/mo
- ✓ 10M requests/mo
- ✓ Priority routing
- ✓ Advanced dashboard
- ✓ Queue per user
Enterprise
Custom needs
Custom
- ✓ Unlimited requests
- ✓ Dedicated machines
- ✓ Custom SLAs
- ✓ Priority support
Queue per user available on all paid tiers. Free tier uses shared queue. · Overage: $0.0005/request over quota.
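To see how overage interacts with the plan prices, the sketch below computes a monthly bill from the figures listed above. It assumes overage is billed linearly at $0.0005 for each request beyond the plan quota and that nothing else is charged. The function name is hypothetical.

```python
from decimal import Decimal

# (monthly price in USD, included requests), taken from the plans above
PLANS = {
    "Starter": (Decimal(50), 2_000_000),
    "Growth": (Decimal(200), 5_000_000),
    "Pro": (Decimal(499), 10_000_000),
}
OVERAGE_PER_REQUEST = Decimal("0.0005")


def monthly_cost(plan: str, requests: int) -> Decimal:
    """Plan price plus linear overage (billing model is an assumption)."""
    price, quota = PLANS[plan]
    overage_requests = max(0, requests - quota)
    return price + overage_requests * OVERAGE_PER_REQUEST


print(monthly_cost("Starter", 2_500_000))  # 300.0000
print(monthly_cost("Growth", 2_500_000))   # 200
```

Under these assumptions, 2.5M requests on Starter would come to $300, which is more than the $200 Growth plan that already includes 5M requests.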
The coordination layer
your APIs are missing.
Start free. Integrate in 30 minutes. By the time your distributed agents start fighting over rate limits, you'll already have the infrastructure to handle it.
No credit card required.
Questions? support@ezthrottle.network