utensils/mold: Local AI image generation CLI — FLUX, SD 1.5, SDXL & Z-Image diffusion models on your GPU



Generate images and short video clips on your own GPU. No cloud, no Python, no fuss.

Documentation | Getting Started | Models | API

mold run "a cat riding a motorcycle through neon-lit streets"

That's it. Mold auto-downloads the model on first run and saves the image to your current directory.

Install

curl -fsSL https://raw.githubusercontent.com/utensils/mold/main/install.sh | sh

This downloads the latest pre-built binary to ~/.local/bin/mold. On Linux, the installer auto-detects your NVIDIA GPU and picks the right binary (RTX 40-series or RTX 50-series). macOS builds include Metal support.
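The install script is the authoritative source for this detection; as a rough illustration, the Linux branch amounts to mapping the GPU's CUDA compute capability to an artifact name. The sketch below assumes nvidia-smi is on PATH and that artifact naming follows the mold-sm120 pattern visible in the Nix flake outputs — both assumptions, not a copy of install.sh.

```shell
# Sketch of the kind of detection install.sh performs. Assumptions:
# nvidia-smi is available, artifact names follow the mold-sm120
# pattern from the Nix flake, and the default build targets Ada.
pick_binary() {
  case "$1" in
    12.*) echo "mold-sm120" ;;  # Blackwell / RTX 50-series
    8.9)  echo "mold" ;;        # Ada / RTX 40-series (default build)
    *)    echo "mold" ;;        # fall back to the default binary
  esac
}

# Compute capability of GPU 0, e.g. "8.9" or "12.0"; default to Ada
# when nvidia-smi is absent so the sketch still runs.
cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n1)
pick_binary "${cap:-8.9}"
```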

Other install methods

Nix

nix run github:utensils/mold -- run "a cat"                   # Ada / RTX 40-series
nix run github:utensils/mold#mold-sm120 -- run "a cat"        # Blackwell / RTX 50-series

From source

cargo build --release -p mold-ai --features cuda    # Linux (NVIDIA)
cargo build --release -p mold-ai --features metal   # macOS (Apple Silicon)

Add preview, expand, discord, or tui to the features list as needed.

Manual download

Pre-built binaries on the releases page.

Usage

mold run "a sunset over mountains"                    # Generate with default model
mold run flux-dev:q4 "a turtle in the desert"         # Pick a model
mold run "a portrait" --width 768 --height 1024       # Custom size
mold run "a sunset" --batch 4 --seed 42               # Batch with reproducible seeds
mold run "oil painting" --image photo.png              # img2img
mold run qwen-image-edit-2511:q4 "make the chair red leather" --image chair.png --image swatch.png
mold run ltx-video-0.9.6-distilled:bf16 "a fox in the snow" --frames 25
mold run "a cat" --expand                              # LLM prompt expansion
mold run qwen-image:q2 "a poster" --qwen2-variant q6  # Qwen-Image quantized text encoder
mold run flux-dev:bf16 "portrait" --lora style.safetensors  # LoRA adapter
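Because run is an ordinary CLI command, it composes with shell scripting. A small illustrative batch driver — the prompts.txt file and the seed-per-line policy are this sketch's invention, not a mold convention:

```shell
# Illustrative batch driver: one generation per line of prompts.txt,
# each with its own reproducible seed. prompts.txt is hypothetical.
cat > prompts.txt <<'EOF'
a sunset over mountains
a fox curled up in the snow
EOF

seed=42
while IFS= read -r prompt; do
  # Guarded so the sketch is a no-op when mold is not installed.
  if command -v mold >/dev/null 2>&1; then
    mold run "$prompt" --seed "$seed"
  fi
  seed=$((seed + 1))
done < prompts.txt
```

Redirecting the file into the loop (rather than piping) keeps the seed counter in the current shell, so it is easy to resume a partial batch.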

Inline preview

Display generated images directly in the terminal (requires the preview feature):

mold run "a cat" --preview

Generating the mold logo with --preview in Ghostty

Piping

mold run "neon cityscape" | viu -                     # Pipe to image viewer
echo "a cat" | mold run flux-schnell                  # Pipe prompt from stdin

Terminal UI (beta)

The TUI Generate view with Kitty graphics protocol image preview in Ghostty

Model management

mold list                    # See what you have
mold pull flux-dev:q4        # Download a model
mold rm dreamshaper-v8       # Remove a model
mold stats                   # Disk usage overview
mold clean                   # Clean orphaned files (dry-run)
mold clean --force           # Actually delete

Remote rendering

# On your GPU server
mold serve

# From your laptop
MOLD_HOST=http://gpu-server:7680 mold run "a cat"
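The same binary thus acts as a thin client or a standalone renderer depending on whether a server is reachable. The dispatch decision can be pictured as a one-liner; the function below is an illustration of that rule, not mold's actual code:

```shell
# Mirror of mold's dispatch rule: render remotely when MOLD_HOST is
# set, otherwise load the model locally on this machine's GPU.
mold_target() {
  if [ -n "${MOLD_HOST:-}" ]; then
    echo "remote:${MOLD_HOST}"
  else
    echo "local"
  fi
}

MOLD_HOST=http://gpu-server:7680
mold_target    # → remote:http://gpu-server:7680
unset MOLD_HOST
mold_target    # → local
```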

See the full CLI reference, configuration guide, and model catalog in the documentation.

Models

Supports 10 model families with 80+ variants:

Family          | Models                     | Highlights
FLUX.1          | schnell, dev, + fine-tunes | Best quality, 4-25 steps, LoRA support
Flux.2 Klein    | 4B and 9B                  | Fast 4-step, low VRAM, default model
SDXL            | base, turbo, + fine-tunes  | Fast, flexible, negative prompts
SD 1.5          | base + fine-tunes          | Lightweight, ControlNet support
SD 3.5          | large, medium, turbo       | Triple encoder, high quality
Z-Image         | turbo                      | Fast 9-step, Qwen3 encoder
Qwen-Image      | base + 2512                | High resolution, CFG guidance, GGUF quant support
Qwen-Image-Edit | 2511                       | Multimodal image editing, repeatable --image, negative prompts
Wuerstchen      | v2                         | 42x latent compression
LTX Video       | 0.9.6, 0.9.8               | Text-to-video with APNG/GIF/WebP/MP4 output

Bare names auto-resolve: mold run flux-schnell "a cat" picks the best available variant.
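That resolution can be pictured as a preference walk over whatever variants are already on disk. The sketch below illustrates the idea; the specific ranking (bf16 ahead of the quantized tiers) is an assumption for illustration, not mold's documented policy:

```shell
# Illustrative resolver: given a bare model name plus the variants
# available locally, return the first match in a preference order.
# The order bf16 > q8 > q4 > q2 is an assumption, not mold's policy.
resolve_variant() {
  name="$1"; shift
  for v in bf16 q8 q4 q2; do
    for have in "$@"; do
      if [ "$have" = "$name:$v" ]; then
        echo "$name:$v"
        return 0
      fi
    done
  done
  return 1   # no variant of this model is available
}

resolve_variant flux-schnell flux-schnell:q4 flux-schnell:bf16
# → flux-schnell:bf16
```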

See the full model catalog for sizes, VRAM requirements, and recommended settings.

LTX Video

Currently supported LTX checkpoints:

  • ltx-video-0.9.6:bf16
  • ltx-video-0.9.6-distilled:bf16
  • ltx-video-0.9.8-2b-distilled:bf16
  • ltx-video-0.9.8-13b-dev:bf16
  • ltx-video-0.9.8-13b-distilled:bf16

Recommended default today: ltx-video-0.9.6-distilled:bf16.

The 0.9.8 models pull the required spatial-upscaler asset automatically and now run the full multiscale refinement path. mold keeps the shared T5 assets under shared/flux/..., stores the 0.9.8 spatial upscaler under shared/LTX-Video/..., and intentionally continues using the compatible LTX-Video-0.9.5 VAE source until the newer VAE layout is ported.

Features

  • txt2img, img2img, multimodal edit, inpainting — full generation pipeline
  • Image upscaling — Real-ESRGAN super-resolution (2x/4x) via mold upscale, server API, or TUI
  • LoRA adapters — FLUX BF16 and GGUF quantized
  • ControlNet — canny, depth, openpose (SD1.5)
  • Prompt expansion — local LLM (Qwen3-1.7B) enriches short prompts
  • Negative prompts — CFG-based models (SD1.5, SDXL, SD3, Wuerstchen)
  • Pipe-friendly — echo "a cat" | mold run | viu -
  • PNG metadata — embedded prompt, seed, model info
  • Terminal preview — Kitty, Sixel, iTerm2, halfblock
  • Smart VRAM — quantized encoders, block offloading, drop-and-reload
  • Qwen family encoder control — selectable Qwen2.5-VL variants for Qwen-Image and Qwen-Image-Edit, with quantized auto-fallback when BF16 would be too heavy
  • Shell completions — bash, zsh, fish, elvish, powershell
  • REST API — mold serve with SSE streaming, auth, rate limiting
  • Discord bot — slash commands with role permissions and quotas
  • Interactive TUI — generate, gallery, models, settings

Deployment

Method          | Guide
NixOS module    | Deployment: NixOS
Docker / RunPod | Deployment: Docker
Systemd         | Deployment: Overview

How it works

Single Rust binary built on candle — pure Rust ML, no Python, no libtorch.

mold run "a cat"
  │
  ├─ Server running? → send request over HTTP
  │
  └─ No server? → load model locally on GPU
       ├─ Encode prompt (T5/CLIP text encoders)
       ├─ Denoise latent (transformer/UNet)
       ├─ Decode pixels (VAE)
       └─ Save PNG

Requirements

  • NVIDIA GPU with CUDA or Apple Silicon with Metal
  • Models auto-download on first use (~2-30GB depending on model)