byok-relay: a Bring Your Own Key (BYOK) relay that lets lightweight frontend-only apps use LLMs with the user's own API keys, without hitting CORS.


A minimal self-hosted relay server for Bring Your Own Key (BYOK) AI web applications.

The problem

Browser apps can't call AI APIs directly:

  • api.anthropic.com, api.openai.com, and most AI providers block browser requests via CORS
  • Putting API keys in frontend code exposes them to every user

The common workaround — a backend proxy — means the app developer holds the keys. That's a trust problem. Users have to trust you not to misuse or leak their keys.

byok-relay solves this differently: users bring their own keys, and the relay stores them encrypted on your server and proxies requests without ever returning the key. The user's key travels over the wire exactly once — when they register it.

How it works

Browser                  byok-relay              AI Provider
  │                           │                       │
  ├─ POST /users ────────────►│                       │
  │◄─ { token } ─────────────┤                       │
  │                           │                       │
  ├─ POST /keys/anthropic ───►│                       │
  │  { key: "sk-ant-..." }    │ (stored encrypted)    │
  │◄─ { ok: true } ──────────┤                       │
  │                           │                       │
  ├─ POST /relay/anthropic ──►│                       │
  │  x-relay-token: <token>   ├─ (real key injected) ►│
  │  { model, messages... }   │                       │
  │◄─ streamed response ──────┤◄─ streamed response ──┤

The token (not the API key) is stored in the browser. The API key stays server-side, encrypted at rest with AES-256-GCM.
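
From the browser, the whole flow in the diagram is three fetch calls. A minimal sketch in plain JavaScript, assuming the relay is deployed at a hypothetical https://relay.example.com and using the app_id and model from the API examples below:

// Hypothetical deployment URL — substitute your own relay.
const RELAY = 'https://relay.example.com';

// 1. Register once; persist the relay token (never the API key).
async function getToken() {
  let token = localStorage.getItem('relayToken');
  if (!token) {
    const res = await fetch(`${RELAY}/users`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ app_id: 'my-app' }),
    });
    ({ token } = await res.json());
    localStorage.setItem('relayToken', token);
  }
  return token;
}

// 2. Send the user's key to the relay once; it is stored encrypted
//    server-side and never returned.
async function storeKey(token, key) {
  await fetch(`${RELAY}/keys/anthropic`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'x-relay-token': token },
    body: JSON.stringify({ key }),
  });
}

// 3. From here on, only the token crosses the wire.
async function chat(token, messages) {
  return fetch(`${RELAY}/relay/anthropic/v1/messages`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-relay-token': token,
      'anthropic-version': '2023-06-01',
    },
    body: JSON.stringify({
      model: 'claude-3-5-haiku-20241022',
      max_tokens: 1024,
      stream: true,
      messages,
    }),
  });
}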

Supported providers

Provider                Name                Notes
Anthropic               anthropic           Claude models, SSE streaming
OpenAI                  openai              GPT models, SSE streaming
Google                  google              Gemini API (key in query param)
Groq                    groq                Fast inference, OpenAI-compatible
OpenRouter              openrouter          200+ models via one API
Mistral                 mistral             Mistral models
Any OpenAI-compatible   openai-compatible   Pass the x-relay-base-url header — covers LiteLLM, Ollama, Perplexity, Together AI, and any other OpenAI-compatible endpoint

Adding a new built-in provider is ~5 lines in src/providers.js.
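
The exact schema of that file isn't reproduced here, but a provider entry plausibly boils down to a base URL plus a rule for injecting the stored key. A hypothetical sketch (field names are illustrative, not the actual src/providers.js schema):

// Hypothetical sketch — not the actual src/providers.js schema.
module.exports = {
  anthropic: {
    baseUrl: 'https://api.anthropic.com',
    // Anthropic authenticates with an x-api-key header.
    auth: (key) => ({ 'x-api-key': key }),
  },
  groq: {
    baseUrl: 'https://api.groq.com/openai',
    // OpenAI-compatible providers use a bearer token instead.
    auth: (key) => ({ Authorization: `Bearer ${key}` }),
  },
};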

API

Register a user

POST /users
Content-Type: application/json

{ "app_id": "my-app" }

{ "token": "<relay-token>" } — store in browser localStorage

Store an API key

POST /keys/anthropic
x-relay-token: <token>
Content-Type: application/json

{ "key": "sk-ant-..." }

List stored providers (key values never returned)

GET /keys
x-relay-token: <token>

Delete a key

DELETE /keys/anthropic
x-relay-token: <token>
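
In a settings screen, listing and revoking keys is one fetch each. A minimal sketch, reusing the RELAY constant and token from the earlier example:

// Which providers have a stored key — the values themselves never come back.
async function listProviders(token) {
  const res = await fetch(`${RELAY}/keys`, {
    headers: { 'x-relay-token': token },
  });
  return res.json();
}

// Revoke the stored key for one provider, e.g. 'anthropic'.
async function deleteKey(token, provider) {
  await fetch(`${RELAY}/keys/${provider}`, {
    method: 'DELETE',
    headers: { 'x-relay-token': token },
  });
}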

Relay a request

POST /relay/anthropic/v1/messages
x-relay-token: <token>
Content-Type: application/json
anthropic-version: 2023-06-01

{ "model": "claude-3-5-haiku-20241022", "max_tokens": 1024, "messages": [...], "stream": true }

Full streaming (SSE) is supported — the response is piped directly from the provider to the browser.
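
In the browser, the relayed stream can be consumed incrementally with the Fetch API's ReadableStream. A minimal sketch, reusing the chat() helper from the earlier example (SSE parsing kept deliberately naive):

// Read the relayed SSE stream chunk by chunk as it arrives.
const res = await chat(token, [{ role: 'user', content: 'Hello!' }]);
const reader = res.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Each chunk holds one or more "data: {...}" SSE lines.
  console.log(decoder.decode(value, { stream: true }));
}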

Generic OpenAI-compatible relay

POST /relay/openai-compatible/v1/chat/completions
x-relay-token: <token>
x-relay-base-url: https://openrouter.ai
Content-Type: application/json

{ "model": "...", "messages": [...] }

Quickstart (60 seconds)

# 1. Clone and install
git clone https://github.com/avikalpg/byok-relay.git && cd byok-relay && npm install

# 2. Configure
echo "ENCRYPTION_SECRET=$(openssl rand -hex 32)" > .env
echo "ALLOWED_ORIGINS=http://localhost:3000" >> .env

# 3. Start
npm start &

# 4. Register a user and get a token
TOKEN=$(curl -s -X POST http://localhost:3000/users \
  -H "Content-Type: application/json" \
  -d '{"app_id":"test"}' | python3 -c "import sys,json; print(json.load(sys.stdin)['token'])")

# 5. Store your Anthropic key
curl -X POST http://localhost:3000/keys/anthropic \
  -H "Content-Type: application/json" \
  -H "x-relay-token: $TOKEN" \
  -d '{"key":"sk-ant-YOUR-KEY-HERE"}'

# 6. Relay a request (streaming)
curl -X POST http://localhost:3000/relay/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "x-relay-token: $TOKEN" \
  -d '{"model":"claude-3-5-haiku-20241022","max_tokens":256,"stream":true,"messages":[{"role":"user","content":"Hello!"}]}'

Setup

1. Install

git clone https://github.com/avikalpg/byok-relay.git
cd byok-relay
npm install

2. Configure

cp .env.example .env
# Set ENCRYPTION_SECRET (generate: openssl rand -hex 32)
# Set ALLOWED_ORIGINS to your app's domain(s)
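
A filled-in .env might look like this (values are illustrative):

# Example .env — values are illustrative
ENCRYPTION_SECRET=<64 hex chars from: openssl rand -hex 32>
ALLOWED_ORIGINS=https://app.yourdomain.com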

3. Run

Production (Ubuntu + systemd)

# Copy service file
sudo cp deploy/byok-relay.service /etc/systemd/system/
sudo systemctl enable --now byok-relay

# HTTPS with nginx + Let's Encrypt
sudo apt install nginx
sudo snap install --classic certbot
sudo certbot --nginx -d relay.yourdomain.com
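
certbot handles the TLS part, but the nginx site still needs a proxy block that forwards to the relay and leaves buffering off so SSE responses flush immediately. A minimal sketch, assuming the relay listens on port 3000 as in the quickstart:

# Minimal nginx proxy block (certbot adds the TLS directives)
server {
    server_name relay.yourdomain.com;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_buffering off;       # let SSE chunks through immediately
        proxy_read_timeout 300s;   # long-lived streaming responses
    }
}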

Security

  • AES-256-GCM encryption — keys are encrypted at rest; the ENCRYPTION_SECRET lives only in your server environment (see the sketch after this list)
  • Keys never returned — the API never exposes stored key values after the initial POST
  • Rate limiting — 100 req/min global, 20 AI req/min per token, 10 registrations/hour per IP
  • Startup validation — server refuses to start without a valid ENCRYPTION_SECRET
  • CORS — restrict ALLOWED_ORIGINS to your app's domain in production
  • HTTPS required in production — browsers block mixed content, so an HTTP relay endpoint can't be called from an HTTPS page
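
To illustrate the encryption scheme mentioned above — a sketch of the approach, not necessarily the repo's exact key-derivation or storage format — AES-256-GCM with Node's built-in crypto module:

// Illustrative AES-256-GCM round trip with Node's crypto module.
// The repo's actual key-derivation and storage format may differ.
const crypto = require('crypto');

const secret = Buffer.from(process.env.ENCRYPTION_SECRET, 'hex'); // 32 bytes

function encrypt(plaintext) {
  const iv = crypto.randomBytes(12); // 96-bit nonce, unique per encryption
  const cipher = crypto.createCipheriv('aes-256-gcm', secret, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag(); // tamper-detection tag
  return Buffer.concat([iv, tag, ciphertext]).toString('base64');
}

function decrypt(blob) {
  const buf = Buffer.from(blob, 'base64');
  const iv = buf.subarray(0, 12);
  const tag = buf.subarray(12, 28);
  const decipher = crypto.createDecipheriv('aes-256-gcm', secret, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(buf.subarray(28)), decipher.final()]).toString('utf8');
}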

Trade-offs

  • You hold the encrypted keys — users trust your server. If your server is compromised and the ENCRYPTION_SECRET leaks, all keys could be decrypted. For higher assurance, replace SQLite with a cloud KMS-backed store.
  • No user accounts — the relay token is the only credential. Anyone who steals a user's localStorage token can use their stored key. Mitigate by scoping tokens to IP or adding optional auth.
  • Self-hosted — you're responsible for uptime, security updates, and backups.

License

Apache 2.0