GitHub - afreidah/s3-orchestrator: Multi-backend S3-compatible proxy with quota management, replication, and rebalancing


s3-orchestrator


Project Website · Documentation · Maximizing Free-Tier Storage

Put one S3-compatible endpoint in front of multiple S3 backends. The orchestrator tracks where every object lives in PostgreSQL (or embedded SQLite), enforces per-backend byte quotas, replicates objects across clouds with a configurable replication factor, and gives operators real primitives (drain, rebalance, integrity scrub, online failover) instead of pushing that work onto every client.

Add as many S3-compatible backends as you want (OCI Object Storage, Backblaze B2, AWS S3, MinIO, Wasabi, anything that speaks S3) and the orchestrator presents them as one or more virtual buckets. Cap each backend at a byte limit of your choosing to stack free-tier allocations into one larger logical bucket without surprise bills. Set a replication factor and every object lands on that many providers automatically.
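
As a rough illustration of how byte caps and a replication factor interact, here is a minimal Go sketch. It is not the orchestrator's code: the `Backend` type, `TotalQuota`, `PickReplicas`, the "most free space first" ordering, and the usage figures are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// GB is a decimal gigabyte, used only to keep the example readable.
const GB int64 = 1_000_000_000

// Backend is a hypothetical view of one backend's quota state.
type Backend struct {
	Name       string
	QuotaBytes int64 // operator-configured cap
	UsedBytes  int64 // bytes already accounted for
}

// Free returns the bytes still available under the cap.
func (b Backend) Free() int64 { return b.QuotaBytes - b.UsedBytes }

// ErrInsufficientQuota is returned when too few backends can hold the object.
var ErrInsufficientQuota = errors.New("not enough backends with free quota")

// TotalQuota sums the per-backend caps (the raw stacked capacity).
func TotalQuota(backends []Backend) int64 {
	var total int64
	for _, b := range backends {
		total += b.QuotaBytes
	}
	return total
}

// PickReplicas chooses `factor` distinct backends that can each hold `size`
// bytes, preferring the backends with the most free space (an assumed policy).
func PickReplicas(backends []Backend, size int64, factor int) ([]string, error) {
	if factor < 1 {
		return nil, errors.New("replication factor must be at least 1")
	}
	candidates := make([]Backend, 0, len(backends))
	for _, b := range backends {
		if b.Free() >= size {
			candidates = append(candidates, b)
		}
	}
	if len(candidates) < factor {
		return nil, ErrInsufficientQuota
	}
	sort.SliceStable(candidates, func(i, j int) bool {
		return candidates[i].Free() > candidates[j].Free()
	})
	names := make([]string, factor)
	for i := range names {
		names[i] = candidates[i].Name
	}
	return names, nil
}

func main() {
	backends := []Backend{
		{Name: "oci", QuotaBytes: 20 * GB, UsedBytes: 19 * GB},
		{Name: "b2", QuotaBytes: 10 * GB, UsedBytes: 2 * GB},
		{Name: "aws", QuotaBytes: 5 * GB, UsedBytes: 1 * GB},
	}
	fmt.Println(TotalQuota(backends) / GB) // 35
	replicas, err := PickReplicas(backends, 2*GB, 2)
	fmt.Println(replicas, err) // [b2 aws] <nil>
}
```

Note that with a replication factor above 1, each object consumes quota on several backends, so usable capacity is lower than the sum of the caps.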

Who this is for

| Audience | Use case |
| --- | --- |
| Homelabbers | Stack free-tier allocations from multiple providers into usable storage without paying for a single plan. |
| Self-hosters running MinIO | Add automatic cloud backups to a local MinIO instance with one config change; no sync scripts or extra tooling. |
| Small teams and startups | Multi-cloud redundancy and encryption without the cost or complexity of enterprise storage platforms. |
| Anyone wanting provider independence | Applications talk S3 to one endpoint; swap, add, or remove backends without touching a line of code. |

What's in the box

  • A metadata layer that knows. Every object's backend placement, replica set, quota delta, and orphan bytes live in a real database. Failover reads, degraded-mode broadcast on DB outage, drain, rebalance, and integrity scrub all key off it — none require backend-side coordination.
  • Per-backend quotas + multi-cloud replication, configured side-by-side. Stack a 20 GB OCI free tier, a 10 GB B2 free tier, and a 5 GB AWS cap into one 35 GB logical bucket. Replicate every object across two of them. Both are operator-configurable and hot-reloadable.
  • Operations-grade plumbing. Circuit breakers (per-backend + per-DB), bounded degraded-read broadcast with parallelism caps, online drain with progress reporting, online rebalance, PUT-before-COMMIT pending intents, durable cleanup queue with DLQ, envelope encryption (AES-256-GCM, Vault Transit), integrity scrub + content-hash backfill, Prometheus + OpenTelemetry, admin API, web UI.
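
The per-backend circuit breakers named in the list above follow a well-known pattern. The following is a minimal, single-goroutine Go sketch of that general pattern, not the orchestrator's implementation: `Breaker`, `NewBreaker`, `Call`, `ErrBackend`, and the threshold and cooldown values are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var (
	// ErrOpen is returned while the breaker is rejecting calls.
	ErrOpen = errors.New("circuit open")
	// ErrBackend simulates a failing backend in this example.
	ErrBackend = errors.New("backend timeout")
)

// Breaker trips after a run of consecutive failures and rejects calls
// until a cooldown has passed. It is not safe for concurrent use.
type Breaker struct {
	threshold int           // consecutive failures before opening
	cooldown  time.Duration // how long to reject calls once open
	failures  int
	openedAt  time.Time
}

// NewBreaker returns a closed breaker.
func NewBreaker(threshold int, cooldown time.Duration) *Breaker {
	return &Breaker{threshold: threshold, cooldown: cooldown}
}

// Call runs fn unless the breaker is open. Once the cooldown has elapsed,
// the next call is let through as a trial: success closes the breaker,
// failure re-opens it and restarts the cooldown.
func (b *Breaker) Call(fn func() error) error {
	open := b.failures >= b.threshold
	if open && time.Since(b.openedAt) < b.cooldown {
		return ErrOpen
	}
	if err := fn(); err != nil {
		b.failures++
		if b.failures >= b.threshold {
			b.openedAt = time.Now()
		}
		return err
	}
	b.failures = 0
	return nil
}

func main() {
	b := NewBreaker(2, 30*time.Second)
	fail := func() error { return ErrBackend }
	fmt.Println(b.Call(fail)) // backend timeout
	fmt.Println(b.Call(fail)) // backend timeout (breaker opens here)
	fmt.Println(b.Call(fail)) // circuit open
}
```

A production breaker would also need locking and metrics; the sketch only shows the state transitions.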

What else is out there

If you've gone looking for a tool that does something similar, there don't appear to be many options:

| Project | What it is | Why it's not the same |
| --- | --- | --- |
| rclone union remote | Client-side multi-remote stacking | Per-client config, no server endpoint, no central drain/rebalance/quota enforcement |
| MinIO Gateway | Was a multi-backend S3 proxy | Deprecated in 2022 |
| Flexify.IO | Commercial multi-cloud S3 SaaS | Closed source; $0.03/GiB SaaS or $0.09/hr self-hosted |
| gaul/s3proxy, oxyno-zeta/s3-proxy | S3 API translation / routing proxies | Single backend at a time, or multi-bucket routing without quotas, replication, or a metadata layer |

Quickstart

Prerequisites: Go 1.26+, Docker, Make.

```sh
git clone https://github.com/afreidah/s3-orchestrator.git
cd s3-orchestrator
make run
```

This starts three MinIO backends via Docker Compose, uses embedded SQLite as the metadata store, and runs the orchestrator on localhost:9000.

```sh
aws --endpoint-url http://localhost:9000 s3 cp /etc/hostname s3://photos/test.txt
aws --endpoint-url http://localhost:9000 s3 ls s3://photos/
```

Default credentials: access key photoskey, secret photossecret. Web dashboard at localhost:9000/ui/ (login admin / admin).

Full credentials and troubleshooting: docs/quickstart.md.

Install

| Channel | Source |
| --- | --- |
| Container | docker pull ghcr.io/afreidah/s3-orchestrator:<version> |
| Debian / Ubuntu | .deb from GitHub Releases |
| Static binary | Linux / macOS / Windows from GitHub Releases |
| From source | git clone && make build |

Database: SQLite is embedded — no external dependencies for single-instance use. PostgreSQL 14+ is also an option and is required for multi-instance deployments (database.driver: postgres); the schema migrates on boot.

Generate a config interactively: s3-orchestrator init.

Verify release artifacts

Container images and release checksums are signed with cosign (keyless / Sigstore):

```sh
# Container image
cosign verify ghcr.io/afreidah/s3-orchestrator:<version> \
  --certificate-identity-regexp='github\.com/afreidah/s3-orchestrator' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com'

# Release checksums
cosign verify-blob checksums.txt --bundle checksums.txt.bundle \
  --certificate-identity-regexp='github\.com/afreidah/s3-orchestrator' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com'
```

Architecture in 30 seconds

              S3 clients (aws cli, rclone, etc.)
                          |
                          v
                    +-----------+
                    | S3 Orch.  |  <-- SigV4 auth, rate limiting, quota routing
                    +-----------+
                     |         |
            +--------+         +------------------+------------------+
            v                  v                  v                  v
       PostgreSQL        OCI Object         Backblaze B2          AWS S3
       (metadata)       Storage (20 GB)       (10 GB)             (5 GB)
                              \                  |                  /
                               '------------ 35 GB total ---------'

Metadata (object locations, quota counters, multipart state, cleanup queue) lives in PostgreSQL or SQLite. Backends only ever see plain S3 calls: no orchestrator-specific protocol, no schema requirements. Any provider whose S3 API works with the AWS SDK is supported.
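
Because object locations live in the metadata store, a read can fall back to another replica without asking the backends to coordinate. The Go sketch below illustrates that idea only; it is not the orchestrator's code. The `Locations` map stands in for the metadata store, and `Fetcher`, `ReadWithFailover`, and the error values are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	// ErrNotFound means the metadata has no placement for the key.
	ErrNotFound = errors.New("object not in metadata")
	// ErrBackendDown simulates an unreachable backend in this example.
	ErrBackendDown = errors.New("backend unavailable")
)

// Locations maps an object key to the backends holding a replica.
// It stands in for the placement table in the metadata database.
type Locations map[string][]string

// Fetcher reads an object from a named backend via plain S3 calls.
type Fetcher func(backend, key string) ([]byte, error)

// ReadWithFailover tries each recorded replica in order and returns the
// first successful read along with the backend that served it. If every
// replica fails, the per-backend errors are joined into one error.
func ReadWithFailover(loc Locations, fetch Fetcher, key string) ([]byte, string, error) {
	replicas, ok := loc[key]
	if !ok || len(replicas) == 0 {
		return nil, "", ErrNotFound
	}
	var errs []error
	for _, backend := range replicas {
		data, err := fetch(backend, key)
		if err == nil {
			return data, backend, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", backend, err))
	}
	return nil, "", errors.Join(errs...)
}

func main() {
	loc := Locations{"photos/test.txt": {"oci", "b2"}}
	fetch := func(backend, key string) ([]byte, error) {
		if backend == "oci" {
			return nil, ErrBackendDown // simulated outage on the first replica
		}
		return []byte("hello"), nil
	}
	data, from, err := ReadWithFailover(loc, fetch, "photos/test.txt")
	fmt.Printf("%s served by %s (err=%v)\n", data, from, err)
	// prints: hello served by b2 (err=<nil>)
}
```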

Deeper details: docs/architecture.md.

Documentation

| Topic | Doc |
| --- | --- |
| First-run / demo | Quickstart |
| S3 client setup | User Guide |
| Architecture | docs/architecture.md |
| Configuration walkthrough + hot-reload | docs/configuration.md |
| Authentication (SigV4, tokens, multi-bucket) | docs/authentication.md |
| Backends, quotas, routing strategies | docs/backends.md |
| Database engines, schema, migrations | docs/database.md |
| Replication, over-replication, orphan reconciliation | docs/replication.md |
| Cleanup queue, lifecycle expiry, pending intents | docs/cleanup-and-lifecycle.md |
| Envelope encryption, Vault Transit | docs/encryption.md |
| Operations (drain, rebalance, scrub, cache, trace) | docs/operations.md |
| Monitoring (Prometheus, OTel, audit log) | docs/monitoring.md |
| Background services reference | docs/background-services.md |
| Webhook notifications | docs/notifications.md |
| CLI subcommands | docs/cli.md |
| UI + Admin API JSON endpoints | docs/api-reference.md |
| Deployment (Nomad, Kubernetes, Docker) | docs/deployment.md |
| Security hardening | docs/security-hardening.md |
| Performance tuning | docs/performance-tuning.md |
| Disaster recovery | docs/disaster-recovery.md |
| Version migration | docs/version-migration.md |
| Benchmark trends | Live charts · scheduled runs |
| Coding conventions | docs/style-guide.md |
| Build / test / contribute | CONTRIBUTING.md |

Contributing

Contributions welcome. Start with CONTRIBUTING.md for the build / test / submit workflow, and docs/style-guide.md for the codebase's conventions.

License

MIT