Run any ML experiment script across GPUs on multiple cloud providers with a single command.
Autofoundry is a CLI companion to Karpathy's autoresearch. Point it at a shell script, pick your GPU configuration, and it handles the rest: provisioning instances, distributing experiment runs, streaming results live, and producing a final metrics report.
Supported Providers
- RunPod — Secure and Community cloud
- Vast.ai — Global GPU marketplace
- PRIME Intellect — Decentralized GPU network
- Lambda Labs — On-demand cloud GPUs
Quickstart
git clone https://github.com/autofoundry/autofoundry.git
cd autofoundry
uv tool install autofoundry
Then run:
autofoundry run
On first run, Autofoundry walks you through configuring provider API keys, SSH key path, minimum download bandwidth (default 5000 Mbps — filters out slow Vast.ai hosts), and HuggingFace token. Config is saved to ~/.config/autofoundry/config.toml.
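Because the config file holds provider API keys and a HuggingFace token, you may want to confirm that only your user can read it. The following is a minimal sketch, assuming the config path stated above. The `secure_config` function name is made up for illustration, and this section does not say whether Autofoundry already sets file permissions itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Autofoundry): restrict the config file
# to the current user. The default path is the one given in the text above.
secure_config() {
  local cfg="${1:-$HOME/.config/autofoundry/config.toml}"
  if [ ! -f "$cfg" ]; then
    echo "no config at $cfg (run 'autofoundry config' first)" >&2
    return 1
  fi
  chmod 600 "$cfg"   # owner read/write only
  echo "secured $cfg"
}
```

Calling `secure_config` with no argument targets the default path. Pass a different path as the first argument if your config lives elsewhere.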
Examples
Run experiments
# Interactive mode: walks you through everything
autofoundry run
# Run a specific script
autofoundry run scripts/run_autoresearch.sh
# Specific GPU with multiple experiment runs
autofoundry run train.sh --gpu H100 --num 4
# Auto-select cheapest datacenter GPU with 80GB+ VRAM
autofoundry run train.sh --segment datacenter --min-vram 80 --auto
# Target a specific provider
autofoundry run train.sh --segment datacenter --min-vram 80 --provider runpod --auto
autofoundry run train.sh --segment datacenter --provider lambdalabs --auto
autofoundry run train.sh --segment workstation --provider vastai --auto
autofoundry run train.sh --segment datacenter --provider primeintellect --auto
# Attach a network volume (RunPod, Lambda Labs)
autofoundry run train.sh --volume my-data --provider runpod
# Resume a previous session
autofoundry run --resume <session-id>
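The provider-specific commands above differ only in the `--provider` value and a few shared flags, so they can be generated rather than typed out. The sketch below only prints the commands (a dry run) so you can review them first. The provider names and flags are copied from the examples above. The `provider_commands` function is hypothetical, and applying the same flags to every provider is a simplification, since the examples above use a different segment for Vast.ai.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Autofoundry): print one `autofoundry run`
# command per provider. Nothing is executed; the output is for review.
provider_commands() {
  local script="$1"; shift
  local extra="$*"   # any extra flags, e.g. --segment datacenter --auto
  local provider
  for provider in runpod lambdalabs vastai primeintellect; do
    echo "autofoundry run ${script} --provider ${provider}${extra:+ $extra}"
  done
}

provider_commands train.sh --segment datacenter --auto
```

Once the printed commands look right, run the ones you want by hand.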
Browse GPU inventory
# Browse all available GPUs across providers
autofoundry inventory
# Filter by segment, VRAM, or GPU name
autofoundry inventory --segment datacenter --min-vram 80
autofoundry inventory --gpu A100
Configure
# Interactive setup for API keys, SSH key, and defaults
autofoundry config
Manage volumes
# List volumes across providers
autofoundry volumes list
# Create a new volume
autofoundry volumes create --name my-data --provider runpod
Monitor and manage sessions
# Show all sessions
autofoundry status
# Show a specific session
autofoundry status <session-id>
# View metrics from most recent run
autofoundry results
# Terminate instances for a session
autofoundry teardown <session-id>
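A common pattern is to check a session, collect its results, and then always tear it down so no instances are left running. The sketch below chains the three commands shown above. The `finish_session` wrapper is hypothetical, and it assumes the commands return a non-zero exit code on failure, which this section does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of Autofoundry): report status, show
# results, then tear the session down even if the earlier steps fail.
finish_session() {
  local session_id="${1:-}"
  if [ -z "$session_id" ]; then
    echo "usage: finish_session <session-id>" >&2
    return 2
  fi
  autofoundry status "$session_id" || true   # informational only
  # Per the text above, `results` reports the most recent run.
  autofoundry results || echo "warning: could not fetch results" >&2
  autofoundry teardown "$session_id"
}
```

Call it as `finish_session <session-id>`, using an ID listed by `autofoundry status`.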
See the full documentation for writing experiment scripts, network volumes, resuming sessions, custom images, CLI reference, and architecture details.
Requirements
- Python 3.11+
- SSH key pair (ed25519 or RSA)
- At least one provider API key (RunPod, Vast.ai, PRIME Intellect, or Lambda Labs)
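The first two requirements can be checked from a shell before installing. This is a rough sketch: the function names are made up, and the key check looks only for the default `ssh-keygen` file names (`id_ed25519`, `id_rsa`), whereas Autofoundry lets you configure a different SSH key path.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers (not part of Autofoundry).

# Succeeds if the given version string ("major.minor[.patch]") is >= 3.11.
python_ok() {
  local major="${1%%.*}"
  local rest="${1#*.}"
  local minor="${rest%%.*}"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 11 ]; }
}

# Succeeds if an ed25519 or RSA private key with a default name exists
# in the given directory (defaults to ~/.ssh).
ssh_key_ok() {
  local dir="${1:-$HOME/.ssh}"
  [ -f "$dir/id_ed25519" ] || [ -f "$dir/id_rsa" ]
}

# Example use (left commented out so nothing runs on load):
#   version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
#   python_ok "$version" || echo "Python 3.11+ required, found $version"
#   ssh_key_ok || ssh-keygen -t ed25519 -f "$HOME/.ssh/id_ed25519"
```

The provider API key is not checked here, since it is entered during `autofoundry config`.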
