A local S3-compatible server for your files. Find duplicates, verify integrity, zero config.
## Install

Prerequisites: Docker must be installed for the recommended method. Check with `docker --version`.

```shell
# Docker (recommended)
docker pull ghcr.io/deepjoy/shoebox:latest

# Or via Cargo (no Docker needed)
cargo install shoebox
```
## Quick Start

```shell
# Point Shoebox at a directory
shoebox ~/Photos

# Or with Docker
docker run -it --rm -p 9000:9000 -v ~/Photos:/photos ghcr.io/deepjoy/shoebox /photos

# Output:
# Serving 1 bucket on http://localhost:9000
#   photos → /home/user/Photos
```
Files already on disk appear in S3 immediately — no uploading required. Credentials are generated on first run and printed in the output. To enable browser access (CORS), follow the on-screen instructions — or use the AWS CLI:
```shell
# Configure credentials (printed on first run)
aws configure --profile shoebox

# List objects
aws --profile shoebox --endpoint-url http://localhost:9000 s3 ls s3://photos/
```
## Features
- S3-compatible API — works with AWS CLI, rclone, and any S3 SDK out of the box
- Zero-config startup — just point at directories, no cloud account or configuration needed
- Duplicate detection — find and merge duplicate files and directories via content hashing
- Integrity verification — scheduled checks to detect bit rot and data corruption
- Filesystem sync — background scanning with move detection, real-time file watching
- Authentication — AWS Signature V4, per-bucket credentials, pre-signed URLs
- Multipart uploads — full support for large file uploads
- CORS — browser-based clients work out of the box
- Webhook notifications — get notified on object events (put, delete, copy)
- Single binary, ~18MB — no runtime dependencies
## Duplicate Detection
Shoebox hashes every file (SHA-256) in the background. Finding duplicates is a query:
```
$ shoebox duplicates ~/Photos --format table

Duplicate groups (2 groups, 5 files, 3 duplicates):

Hash (SHA-256)   Size   Files
─────────────────────────────────────────────
a13f…c8d1        32 B   3 copies
  originals/sunset.txt
  backup/sunset.txt        ← duplicate
  edited/sunset-copy.txt   ← duplicate

7b2e…f104        26 B   2 copies
  originals/mountain.txt
  backup/mountain.txt      ← duplicate
```

## Webapp
A companion browser UI is available at https://deepjoy.github.io/shoebox-webapp/.
Browse buckets, view objects, and see duplicate groups visually — no CLI needed. The webapp talks directly to your local Shoebox server via the S3 API.
CORS setup (required for browser access): Shoebox prints this command on startup; copy and run it:

```shell
export AWS_ACCESS_KEY_ID='<from startup output>'
export AWS_SECRET_ACCESS_KEY='<from startup output>'
export BUCKET='photos'

curl -X PUT "http://localhost:9000/${BUCKET}?cors" \
  --aws-sigv4 "aws:amz:us-east-1:s3" \
  --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d '[{"allowed_origins":["*"],"allowed_methods":["GET","PUT","POST","DELETE","HEAD"],"allowed_headers":["*"],"expose_headers":["ETag","x-amz-request-id"],"max_age_seconds":3600}]'
```
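If you edit the rule document before applying it, a quick sanity check saves a failed round-trip. A small, hypothetical Python helper (not part of Shoebox) that parses the rule list from the command above and confirms the fields browsers need are present:

```python
import json

# The CORS rule document from the curl command above.
cors_rules = (
    '[{"allowed_origins":["*"],'
    '"allowed_methods":["GET","PUT","POST","DELETE","HEAD"],'
    '"allowed_headers":["*"],'
    '"expose_headers":["ETag","x-amz-request-id"],'
    '"max_age_seconds":3600}]'
)


def check_cors(doc: str) -> list[dict]:
    """Parse the rule list and confirm each rule carries the required fields."""
    rules = json.loads(doc)
    required = {"allowed_origins", "allowed_methods", "allowed_headers"}
    for rule in rules:
        missing = required - rule.keys()
        if missing:
            raise ValueError(f"rule missing fields: {sorted(missing)}")
    return rules
```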
## Who It's For
- Developers — test S3 integrations without cloud dependencies, work offline
- Home users — expose NAS storage to S3-compatible backup tools, find duplicates with a single query
- Archivists — verify file integrity with content hashes, detect bit rot
- Privacy-conscious users — keep files local, no account required, no telemetry
## Comparison
| Concern | Cloud S3 | MinIO | SeaweedFS | Garage | Shoebox |
|---|---|---|---|---|---|
| Primary strength | Scalability, AWS ecosystem | High performance, enterprise | Small files, high throughput | Simplicity, geo-replication | Existing files, zero config |
| Best for | Production workloads | AI/ML, large data (TB/PB) | Data lakes, file storage | Edge/distributed, low ops | Local dev, NAS, home lab |
| Architecture | Managed service | Specialized nodes | Master/volume servers | Homogeneous nodes | Single process |
| Setup | Account + IAM | Docker + config | Docker + config | Docker + config | Single command |
| Data location | Cloud | MinIO data dir | SeaweedFS volumes | Garage data dir | Your existing files |
| File visibility | S3 only | S3 only | S3, FUSE, WebDAV | S3 only | Filesystem + S3 |
| Offline use | No | Yes | Yes | Yes | Yes |
| Binary size | N/A | ~100MB | ~40MB | ~25MB | ~18MB |
| Duplicate detection | No | No | No | No | Built-in |
| Integrity checks | Yes (default checksums) | Yes (bitrot healing) | Limited (CRC) | Yes (scrub) | Built-in (scheduled) |
| Max recommended scale | Unlimited | Petabytes | Petabytes | Petabytes | ~10TB |
See docs/why-shoebox.md for the full story.
## When Not to Use Shoebox
See docs/when-not-to-use-shoebox.md for an honest assessment of limitations, including:
- Strong consistency requirements
- Distributed / multi-node storage
- >10TB of data
- Enterprise S3 features (object lock, lifecycle policies, versioning)
- High-throughput ingestion (thousands of files/second)
## Documentation
- Quickstart — Running in 5 minutes
- Installation — Docker, cargo install, from source
- User Guides — Configuration, credentials, S3 compatibility, and more
## Contributing
See CONTRIBUTING.md for development setup and guidelines.
## Security
See SECURITY.md for the security model and how to report vulnerabilities.
## License
MIT
## Disclaimer
Shoebox operates directly on your existing files — it does not copy data into a separate storage directory. S3 operations like DeleteObject and PutObject will modify or remove real files on disk. Back up anything irreplaceable before use. This is pre-1.0 software provided "as is" with no warranty. See LICENSE for details. The authors are not liable for any data loss.
## Background
I had 2TB of photos across 3 drives — backups of backups, originals I was afraid to delete. I set out to find duplicate photos and accidentally designed a local S3 server. If an object store knows the content hash of every file, duplicates are just a query. This is a personal project built in public — expect breaking changes before 1.0. If you have thoughts on the approach, open an issue or start a discussion.
