GitHub - alexander-turner/alt-text-llm: AI-powered alt text generation and labeling tools for markdown content

alt-text-llm

AI-powered alt text generation and labeling tools for markdown content. Originally developed for my website (repo).

Features

Intelligent scanning - Detects images/videos missing meaningful alt text (ignores empty alt="")
AI-powered generation - Uses LLM of your choice to create context-aware alt text suggestions
Interactive labeling - Manually review and edit LLM suggestions. Images display directly in your terminal
Automatic application - Apply approved captions back to your markdown files

Installation

From PyPI

Automated setup (includes system dependencies)

git clone https://github.com/alexander-turner/alt-text-llm.git
cd alt-text-llm
./setup.sh

Prerequisites

macOS:

brew install imagemagick ffmpeg imgcat
pip install llm

Linux:

sudo apt-get install imagemagick ffmpeg
pip install llm
# imgcat: curl -sL https://iterm2.com/utilities/imgcat -o ~/.local/bin/imgcat && chmod +x ~/.local/bin/imgcat

Usage

The tool provides four main commands: scan, generate, label, and apply.

1. Scan for missing alt text

Scan your markdown files to find images without meaningful alt text:

alt-text-llm scan --root /path/to/markdown/files

This creates asset_queue.json with all assets needing alt text.

2. Generate AI suggestions

Generate alt text suggestions using an LLM:

alt-text-llm generate \
  --root /path/to/markdown/files \
  --model gemini-2.5-flash \
  --suggestions-file suggested_alts.json

Available options:

--model (required) - LLM model to use (e.g., gemini-2.5-flash, gpt-4o-mini, claude-3-5-sonnet)
--max-chars - Maximum characters for alt text (default: 300)
--timeout - LLM timeout in seconds (default: 120)
--estimate-only - Only show cost estimate without generating
--process-existing - Also process assets that already have captions

Cost estimation:

alt-text-llm generate \
  --root /path/to/markdown/files \
  --model gemini-2.5-flash \
  --estimate-only

3. Label and approve suggestions

Interactively review and approve the AI-generated suggestions:

alt-text-llm label \
  --suggestions-file suggested_alts.json \
  --output asset_captions.json

Interactive commands:

Edit the suggested alt text (vim keybindings enabled)
Press Enter to accept the suggestion as-is
Submit undo or u to go back to the previous item
Images display in your terminal (requires imgcat)

4. Apply approved captions

Apply the approved captions back to your markdown files:

alt-text-llm apply \
  --captions-file asset_captions.json

Available options:

--captions-file - Path to the captions JSON file with final_alt populated (default: asset_captions.json)
--dry-run - Preview changes without modifying files

What it does:

Reads approved captions from the captions file
Locates corresponding images/videos in markdown files
Updates alt text for all supported formats:
- Markdown images: ![alt](path)
- HTML img tags: <img src="path" alt="alt">
- Wikilink images: ![[path|alt]]
Preserves file formatting and handles special characters

Example workflow

# 1. Scan markdown files for missing alt text
alt-text-llm scan --root ./content

# 2. Estimate the cost
alt-text-llm generate \
  --root ./content \
  --model gemini-2.5-flash \
  --estimate-only

# 3. Generate suggestions (if cost is acceptable)
alt-text-llm generate \
  --root ./content \
  --model gemini-2.5-flash

# 4. Review and approve suggestions
alt-text-llm label

# 5. Apply approved captions to markdown files
alt-text-llm apply

Configuration

LLM Integration

This tool uses the llm CLI tool to generate alt text. This provides access to many different AI models including:

Gemini (Google) via the llm-gemini plugin
Claude (Anthropic) via the llm-claude-3 plugin
And many more via plugins

Setting up your model

For Gemini models (default):

llm install llm-gemini
llm keys set gemini # enter API key
llm -m gemini-flash-latest "Hello, world!"

For other models:

Install the appropriate llm plugin (e.g., llm install llm-openai)
Configure your API key (e.g., llm keys set openai)
Use the model name with --model flag.

See the llm documentation for setup instructions and the plugin directory for available models.

Output files

asset_queue.json - Queue of assets needing alt text (from scan)
suggested_alts.json - AI-generated suggestions (from generate)
asset_captions.json - Approved final captions (from label)