shoebox

A Rust application providing S3-compatible object storage backed by the local filesystem and SQLite metadata.

A local S3-compatible server for your files. Find duplicates, verify integrity, zero config.

Shoebox webapp — browsing a bucket

Install

Prerequisites: Docker must be installed for the recommended method. Check with docker --version.

# Docker (recommended)
docker pull ghcr.io/deepjoy/shoebox:latest

# Or via Cargo (no Docker needed)
cargo install shoebox

Quick Start

# Point Shoebox at a directory
shoebox ~/Photos

# Or with Docker
docker run -it --rm -p 9000:9000 -v ~/Photos:/photos ghcr.io/deepjoy/shoebox /photos

# Output:
# Serving 1 bucket on http://localhost:9000
#   photos → /home/user/Photos

Files already on disk appear in S3 immediately — no uploading required. Credentials are generated on first run and printed in the startup output. To enable browser access (CORS), follow the on-screen instructions (see the Webapp section below). To talk to the server with the AWS CLI:

# Configure credentials (printed on first run)
aws configure --profile shoebox

# List objects
aws --profile shoebox --endpoint-url http://localhost:9000 s3 ls s3://photos/
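
Writes go through the same path and land directly on disk. A short sketch using standard AWS CLI commands, assuming the photos bucket from the Quick Start (pre-signed URLs and multipart uploads are listed under Features below):

# Upload a file (written straight into ~/Photos on disk;
# the AWS CLI switches to multipart automatically for large files)
aws --profile shoebox --endpoint-url http://localhost:9000 s3 cp ./sunset.jpg s3://photos/sunset.jpg

# Generate a pre-signed URL, valid for one hour
aws --profile shoebox --endpoint-url http://localhost:9000 s3 presign s3://photos/sunset.jpg --expires-in 3600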

Features

  • S3-compatible API — works with AWS CLI, rclone, and any S3 SDK out of the box (rclone sketch after this list)
  • Zero-config startup — just point at directories, no cloud account or configuration needed
  • Duplicate detection — find and merge duplicate files and directories via content hashing
  • Integrity verification — scheduled checks to detect bit rot and data corruption
  • Filesystem sync — background scanning with move detection, real-time file watching
  • Authentication — AWS Signature V4, per-bucket credentials, pre-signed URLs
  • Multipart uploads — full support for large file uploads
  • CORS — browser-based clients work out of the box
  • Webhook notifications — get notified on object events (put, delete, copy)
  • Single binary, ~18MB — no runtime dependencies
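
As a sketch of the rclone route mentioned above: a minimal remote definition, assuming the credentials printed at startup (the remote name shoebox and the paths are illustrative):

# ~/.config/rclone/rclone.conf
[shoebox]
type = s3
provider = Other
access_key_id = <from startup output>
secret_access_key = <from startup output>
endpoint = http://localhost:9000

# Then use it like any S3 remote:
rclone ls shoebox:photos
rclone sync ~/Documents shoebox:photos/documents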

Duplicate Detection

Shoebox hashes every file (SHA-256) in the background. Finding duplicates is a query:

$ shoebox duplicates ~/Photos --format table

Duplicate groups (2 groups, 5 files, 3 duplicates):

  Hash (SHA-256)       Size   Files
  ─────────────────────────────────────────────
  a3f…c8d1             32 B   3 copies
    originals/sunset.txt
    backup/sunset.txt        ← duplicate
    edited/sunset-copy.txt   ← duplicate

  7b2e…f104            26 B   2 copies
    originals/mountain.txt
    backup/mountain.txt      ← duplicate
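
Because the hashes are plain SHA-256 over file contents, any flagged pair can be spot-checked independently of Shoebox with coreutils (paths here are the examples from the table above):

# Matching output confirms the files are byte-for-byte identical
sha256sum ~/Photos/originals/sunset.txt ~/Photos/backup/sunset.txt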

Webapp

A companion browser UI is available at https://deepjoy.github.io/shoebox-webapp/.

Browse buckets, view objects, and see duplicate groups visually — no CLI needed. The webapp talks directly to your local Shoebox server via the S3 API.

CORS setup (required for browser access) — Shoebox prints this command on startup; just copy and run it:

export AWS_ACCESS_KEY_ID='<from startup output>'
export AWS_SECRET_ACCESS_KEY='<from startup output>'
export BUCKET='photos'

curl -X PUT "http://localhost:9000/${BUCKET}?cors" \
  --aws-sigv4 "aws:amz:us-east-1:s3" \
  --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d '[{"allowed_origins":["*"],"allowed_methods":["GET","PUT","POST","DELETE","HEAD"],"allowed_headers":["*"],"expose_headers":["ETag","x-amz-request-id"],"max_age_seconds":3600}]'
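
To confirm the rule took effect, you can simulate a browser preflight from the shell. This relies only on standard CORS semantics; the exact response headers Shoebox returns are an assumption here:

# A CORS-enabled bucket should answer the preflight with
# Access-Control-Allow-* headers (the exact set may vary)
curl -i -X OPTIONS "http://localhost:9000/${BUCKET}/" \
  -H "Origin: https://deepjoy.github.io" \
  -H "Access-Control-Request-Method: GET"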

Who It's For

  • Developers — test S3 integrations without cloud dependencies, work offline
  • Home users — expose NAS storage to S3-compatible backup tools, find duplicates with a single query
  • Archivists — verify file integrity with content hashes, detect bit rot
  • Privacy-conscious users — keep files local, no account required, no telemetry

Comparison

Concern                 Cloud S3                    MinIO                         SeaweedFS                     Garage                       Shoebox
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Primary strength        Scalability, AWS ecosystem  High performance, enterprise  Small files, high throughput  Simplicity, geo-replication  Existing files, zero config
Best for                Production workloads        AI/ML, large data (TB/PB)     Data lakes, file storage      Edge/distributed, low ops    Local dev, NAS, home lab
Architecture            Managed service             Specialized nodes             Master/volume servers         Homogeneous nodes            Single process
Setup                   Account + IAM               Docker + config               Docker + config               Docker + config              Single command
Data location           Cloud                       MinIO data dir                SeaweedFS volumes             Garage data dir              Your existing files
File visibility         S3 only                     S3 only                       S3, FUSE, WebDAV              S3 only                      Filesystem + S3
Offline use             No                          Yes                           Yes                           Yes                          Yes
Binary size             N/A                         ~100MB                        ~40MB                         ~25MB                        ~18MB
Duplicate detection     No                          No                            No                            No                           Built-in
Integrity checks        Yes (default checksums)     Yes (bitrot healing)          Limited (CRC)                 Yes (scrub)                  Built-in (scheduled)
Max recommended scale   Unlimited                   Petabytes                     Petabytes                     Petabytes                    ~10TB

See docs/why-shoebox.md for the full story.

When Not to Use Shoebox

See docs/when-not-to-use-shoebox.md for an honest assessment of limitations, including:

  • Strong consistency requirements
  • Distributed / multi-node storage
  • >10TB of data
  • Enterprise S3 features (object lock, lifecycle policies, versioning)
  • High-throughput ingestion (thousands of files/second)

Documentation

Guides live in the docs/ directory, including docs/why-shoebox.md and docs/when-not-to-use-shoebox.md.

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Security

See SECURITY.md for the security model and how to report vulnerabilities.

License

MIT

Disclaimer

Shoebox operates directly on your existing files — it does not copy data into a separate storage directory. S3 operations like DeleteObject and PutObject will modify or remove real files on disk. Back up anything irreplaceable before use. This is pre-1.0 software provided "as is" with no warranty. See LICENSE for details. The authors are not liable for any data loss.

Background

I had 2TB of photos across 3 drives — backups of backups, originals I was afraid to delete. I set out to find duplicate photos and accidentally designed a local S3 server. If an object store knows the content hash of every file, duplicates are just a query. This is a personal project built in public — expect breaking changes before 1.0. If you have thoughts on the approach, open an issue or start a discussion.