Take the reins.
Stop juggling AI tools.
Start orchestrating them.
One saddle. One brain. Patch in.
Scale on your terms. No artificial limits. Self-hosted: unlimited workers on your own hardware. SaaS: plans that grow with you. Every machine you own is a potential worker node.
Worker ↔ AI Providers
dashboard never sees your keys
Zero-knowledge. Your keys stay yours. Workers talk directly to providers using your API keys — not subscriptions that can be revoked. Prompts, code, and responses never pass through our servers. We orchestrate. We don't eavesdrop. And we don't lock you in.
No vendor lock-in. Ever. Vendors change terms overnight. Subscriptions get restricted. APIs get throttled. ModelReins workers use direct API keys — no subscriptions required, no vendor lock-in, no surprises.
Opus rate limited
↓ auto-failover
Sonnet routed
Rate limited? Work never stops. Hit a limit on Opus — the router instantly fails over to Sonnet. Sonnet full? Ollama picks it up locally for free. Zero downtime. Zero babysitting.
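Under the hood this is an ordered fallback chain. Here is a minimal sketch of what that chain could look like in a routing config; the schema and every field name below are illustrative, not the shipped format:

# Hypothetical failover chain, illustrative schema only.
chain:
  - worker: opus-main        # first choice
    failover_on: [rate_limit, timeout]
  - worker: sonnet-backup    # picks up when Opus is capped
    failover_on: [rate_limit]
  - worker: ollama-local     # local fallback: free, never throttled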
ModelReins 4.4.15 Companion · Windows
Install everything in one click.
The Companion packs the local routing brain, a fleet worker, and the Wall into one installer. It detects the AI tools you already have, runs offline by default, and only escalates to a frontier model when the question deserves it.
The Director
A small local model classifies your prompts and picks the right worker; a sketch of the idea follows this list. Resolves ~95% of dispatches without the cloud.
A fleet worker
Earns its keep when your machine is idle. Runs jobs assigned by the router. Your idle cycles, your capacity.
The Wall
Ambient mission control for your Patch. Open it on a spare monitor and watch the fleet think. Continuity as you walk between rooms arrives with the mesh.
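How might the Director decide? A minimal sketch of classification rules, assuming a tag- and complexity-based schema; every name here is hypothetical, not the real format:

# Hypothetical Director rules, illustrative names only.
rules:
  - match: { complexity: low }          # drafts, triage, boilerplate
    worker: ollama-local                # resolved on your machine, free
  - match: { complexity: high, type: code-review }
    worker: opus-main                   # escalate to a frontier model
fallback: ollama-local                  # when in doubt, stay local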
or via npm:
npm install -g modelreins-worker
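Prefer a file to a wizard? A worker's identity comes down to a few fields, the same ones echoed in the session banner further down this page. A hypothetical config sketch; the shipped format may differ:

# Hypothetical worker config, fields mirror the session banner below.
worker: haiku-devbox
provider: anthropic                  # model: haiku-4.5
server: app.modelreins.com
tags: [draft, triage, cheap, fast]   # capabilities the Director matches on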
The privacy difference
Meta can know where you and your family are on vacation and sell that.
We don't care, and we couldn't see it even if we did.
When you install ModelReins, you get Bob — a local brain that lives on your hardware. One Bob per fleet. Every Companion you install shares the same intelligence. Your memories, routing patterns, and learned behaviors never leave your machines. Our servers handle dispatch and coordination. Bob handles everything else.
You stay because you want to, not because you have to. If you leave, Bob goes with you. Nothing held hostage. Nothing phoned home.
Personal infrastructure. Built for you, not on you.
Why now
The world just agreed.
Today Anthropic announced Project Glasswing — a $100M coalition with AWS, Apple, Cisco, CrowdStrike, Google, JPMorganChase, Microsoft, NVIDIA, Palo Alto Networks, the Linux Foundation, Broadcom, and more — using Claude Mythos Preview to find zero-days in critical infrastructure. Mythos has already found vulnerabilities that survived 27 years of human review and 5 million automated tests.
Twelve of the biggest tech companies in the world just publicly agreed: AI has crossed a threshold. The same capabilities that make a model powerful enough to find a 27-year-old zero-day in OpenBSD make it powerful enough to need governance.
The window between a vulnerability being discovered and being exploited by an adversary has collapsed — what once took months now happens in minutes with AI. That is not a reason to slow down; it’s a reason to move together, faster. If you want to deploy AI, you need security. That is why CrowdStrike is part of this effort from day one.
— Elia Zaitsev, CTO, CrowdStrike (Project Glasswing announcement, April 8, 2026)
Project Glasswing is the model layer. ModelReins is the orchestration and governance layer. Same problem, different rung. Both essential.
The ceiling is moving
93.9%
SWE-bench Verified
Claude Mythos Preview
82.0%
Terminal-Bench 2.0
Claude Mythos Preview
83.1%
CyberGym
Claude Mythos Preview
When a single frontier model scores 94% on SWE-bench Verified, the bottleneck isn’t capability anymore. It’s which agent handles which problem, who reviews, what happens when the rate limit hits, and whether anything you shipped can be audited tomorrow. That’s the layer ModelReins lives in.
Same problem, different rung
Frontier models
Mythos · Opus · GPT · Gemini
Find the bugs. Write the code. Reason about the problem. Get better every quarter.
ModelReins
the orchestration + governance layer
Route the right model to the right job. Detect caps and fail over. Enforce review gates. Keep an audit trail. Work offline when you need to.
Humans
you and your team
Approve. Own the outcome. Sleep at night.
Live Signal Feed
What it looks like
One command. Your AI workforce is online.
worker — haiku-devbox
$ npx modelreins-worker

 __  __           _      _ ____      _
|  \/  | ___   __| | ___| |  _ \ ___(_)_ __  ___
| |\/| |/ _ \ / _` |/ _ \ | |_) / _ \ | '_ \/ __|
| |  | | (_) | (_| |  __/ |  _ <  __/ | | | \__ \
|_|  |_|\___/ \__,_|\___|_|_| \_\___|_|_| |_|___/
                                    by MEDiAGATO

Worker:   haiku-devbox
Provider: anthropic (haiku-4.5)
Server:   app.modelreins.com
Tags:     draft,triage,cheap,fast
Session:  spawn

[20:24:01] Ready — waiting for jobs...
[20:24:17] >>> Job #803 claimed
[20:24:17] Prompt: Write a product description for ModelReins...
[20:24:17] Spawning: anthropic-cli "Write a product description..."
[20:24:22] <<< Job #803 complete (exit 0, 4.8s)
[20:24:27] >>> Job #804 claimed
[20:24:27] Prompt: Triage this issue: auth middleware returns 403...
[20:24:29] <<< Job #804 complete (exit 0, 1.2s)
[20:24:34] Ready — waiting for jobs...
Google calls it:
"the shift from generative to agentic AI."
We just call it Tuesday.
ModelReins has been orchestrating multi-provider AI workforces while the industry was still writing trend reports about it.
Google Cloud AI Agent Trends 2026
Three steps
Patch on
Download the Companion. The wizard finds your Ollama, LM Studio, and any AI tools you already use — then pulls a small local model if you don’t have one yet.
Patch in
Install the Saddle in VSCode. It finds the local Companion automatically over loopback. One keystroke to dispatch.
Dispatch
Type a prompt. The Director picks the right worker — local first, frontier when you need it. Output streams back in real time.
Capabilities
Multi-Agent Dispatch
Route tasks to any registered worker — manually, automatically, or fanned out to multiple workers in parallel.
Smart Routing
Tag workers by capability. The Director matches job complexity and type to the right worker automatically.
Auto-Failover
Rate limited on Opus? The router fails over to Sonnet. Sonnet full? Ollama picks it up locally. Work never stops.
The Saddle
VSCode extension. One keystroke to dispatch from your editor. Output streams back inline. No context switch.
Cross-Machine Sync
Your brain follows you across every machine you own. Context, history, and routing state — everywhere.
Cost Tracking
Per-job spend across every provider. See exactly what each task cost, in real time and in history.
Job Scheduling
Queue jobs to run at a time or on a cron. Overnight builds, morning reports, timed deploys.
Job History
Every output stored and searchable. Full audit log with worker, provider, cost, and duration.
Fleet Awareness
Define your infrastructure in YAML; a sketch follows this grid. Workers know what exists, what's healthy, and what's rate-limited.
Killswitch
File, URL, or dead man's switch. Halt all workers instantly. Independent of the server.
Secrets Brokering
Pointers, not passwords. Env vars or Vault. Workers get short-lived tokens, not the keys themselves.
Zero-Knowledge Keys
Your API keys never touch the control plane. Workers fetch credentials locally and talk directly to providers.
Signed Audit Trail
Every action HMAC-signed and logged. Verify integrity. Ship to your SIEM.
Multi-Tenant RBAC
Complete data isolation. Admin, operator, viewer. Teams share one server safely.
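To make the fleet, killswitch, secrets, and scheduling pieces concrete, here is a hedged sketch of a fleet definition. Every name, path, and field below is illustrative, not the shipped schema:

# Illustrative fleet definition, hypothetical schema.
fleet:
  workers:
    - name: opus-main
      provider: anthropic
      tags: [review, architecture]    # what the Director matches on
    - name: ollama-local
      provider: ollama                # free local fallback
      tags: [draft, triage]
  killswitch:
    file: /etc/modelreins/halt        # touch this file: all workers stop
  secrets:
    anthropic_key: vault://kv/modelreins/anthropic   # a pointer, never the key
  schedule:
    - job: nightly-triage
      cron: "0 2 * * *"               # queued at 02:00 every night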
AI orchestration used to be a luxury reserved for teams with dedicated ML platform engineers. Solo devs and indie teams, whose code increasingly runs the world, have been left to hand-coordinate agents across a dozen browser tabs. The Companion's free tier is our answer. Your laptop, your local brain, your workers. No account required to start.
Pricing
Free
$0/mo
- 2 workers · single machine
- Unlimited jobs
- Killswitch
- Dashboard + streaming
- Job chaining
- Cost tracking
Pro
$29/mo
- Unlimited workers
- Unlimited jobs
- Killswitch
- Chain templates
- Cross-machine sync
- Approval gates
- Analytics dashboard
- Priority support
Team
$79/mo
- Unlimited workers
- Unlimited jobs
- Killswitch
- Chain templates
- API access
- Team members
- Multi-user RBAC
- Everything in Pro
Self-Hosted
Free
- Unlimited everything
- Your infrastructure
- Full source (BSL 1.1)
- Commercial license available
Early Adopter
Lifetime Team Access
Everything in Team — forever.
All future features included. No renewals. No price creep.
Limited time — may end without notice.
Your agents. Your infrastructure. Your rules.
Start free. Upgrade when you need more workers.
Enterprise
Self-Hosted. Your Keys. Your Rules.
Run ModelReins on your own infrastructure. Bring your own API keys, keep data in-house, and manage AI workloads across teams with the controls you actually need.
- Self-hosted deployment
- Multi-tenant with role-based access
- Fleet-aware worker routing
- Approval gates and budget controls
- On-prem secrets via HashiCorp Vault
- Priority support and SLA