GitHub - franzenzenhofer/debug-mode-skill: Debug Mode for Claude Code — hypothesis-driven, instrumented debugging with automatic bug reproduction and mandatory regression tests

Stop guessing. Start debugging.

A Claude Code skill that brings hypothesis-driven, instrumented debugging to the terminal. Instead of guessing and patching, Claude Code will systematically reproduce your bug, add tagged logs, analyze runtime behavior, fix the root cause, and write a regression test.

The Problem

AI coding assistants guess at fixes. They see an error, pattern-match to a solution, and hope it works. When it doesn't, they guess again. This is expensive, slow, and produces fragile code.

The Solution

Debug Mode forces a disciplined loop:

Hypothesize → Reproduce → Instrument → Analyze → Reset → Fix → Verify

Every fix is backed by evidence. Every bug gets a regression test. No guessing.

How It Works

The 9-Phase Loop

Phase	What Happens
0. Safety	Create git branch + stash uncommitted work
1. Discover	Scan project for test infra, scripts, recent changes
2. Hypothesize	Generate 3-5 ranked theories with evidence criteria
3. Reproduce	Write repro script/test and commit it
4. Instrument	Add `[DEBUG-MODE]` tagged logs (dirtying the working tree)
5. Analyze	Run reproduction, capture logs to file, grep intelligently
6. Reset	`git restore .` — perfectly removes all debug logs
7. Fix	Implement fix at root cause (not symptom)
8. Verify	Red-to-green check using the committed repro

The Commit-Then-Instrument Pattern

This is the key insight. LLMs are bad at surgically removing debug logs — they leave hanging brackets, break indentation, and corrupt syntax. Debug Mode solves this:

Write reproduction test/script
Commit it (git commit -m "chore: add failing reproduction")
Add [DEBUG-MODE] logs everywhere (dirty the working tree)
Analyze the logs
git restore . — instantly removes ALL instrumentation, perfectly
Your committed repro test survives. Your code is clean. Fix the bug.

No manual log deletion. No broken syntax. No cleanup mistakes.

Reproduction Strategies

Claude Code tries these automatically before asking you to do anything:

Priority	Strategy	When to Use
1	Standalone script	Fastest — bypasses test framework config
2	Failing unit test	Gold standard when framework is easy to use
3	curl / HTTP	API bugs
4	Loop 50x	Race conditions, flaky bugs
5	Manual	Last resort only

Hypothesis-Tagged Logging

All debug logs follow a strict format:

[DEBUG-MODE][H1-stale-cache][getUserProfile] cache_key=user:123 hit=true age=3600s
[DEBUG-MODE][H2-race-condition][updateUser] lock_acquired=false retry=1

[DEBUG-MODE] prefix makes git restore . cleanup bulletproof
[H<N>-name] lets you filter by hypothesis: grep '\[H1' debug.log
[function] tells you where you are

Terminal Safety

AI agents hang when they start servers without detaching:

# WRONG — hangs the terminal forever
npm run dev

# RIGHT — detached with PID tracking
nohup npm run dev > server.log 2>&1 &
SERVER_PID=$!
sleep 3
# ... do work ...
kill $SERVER_PID

Context Window Protection

AI agents blow their context window on massive log dumps:

# WRONG — 5MB of logs floods context
node repro.js

# RIGHT — capture to file, query surgically
node repro.js > debug.log 2>&1
grep '[DEBUG-MODE]' debug.log
grep -C 2 'Exception' debug.log | head -n 50

Red-to-Green Verification

Every debug session ends with proof:

# 1. Run repro — MUST PASS (green)
node repro.js

# 2. Stash the fix
git stash push -m "temp-verify"

# 3. Run repro — MUST FAIL (red)
node repro.js

# 4. Restore
git stash pop

If the test passes without the fix, it's a bad test. Rewrite it.

Install

One Command

mkdir -p ~/.claude/skills/debug-mode && \
curl -sL https://raw.githubusercontent.com/franzenzenhofer/debug-mode-skill/main/SKILL.md \
  -o ~/.claude/skills/debug-mode/SKILL.md

Manual

Copy SKILL.md to ~/.claude/skills/debug-mode/SKILL.md
That's it. Claude Code auto-discovers skills in ~/.claude/skills/.

Verify

Start Claude Code and describe a bug. You should see:

"Entering Debug Mode. I will hypothesize, instrument, reproduce, analyze, fix, and verify."

Bug-Type Cheat Sheet

Bug Type	Best Strategy
Logic error	Standalone script or failing test
API response wrong	curl
Race condition	Loop 50x
State corruption	Repro script
Auth/session	Repro script
Caching	Failing test
Flaky test	Loop 50x
CORS/headers	curl -v

Language Support

Debug instrumentation patterns for:

JavaScript/TypeScript — console.error('[DEBUG-MODE]...')
Python — print('[DEBUG-MODE]...', file=sys.stderr)
Go — fmt.Fprintf(os.Stderr, "[DEBUG-MODE]...")
Ruby — $stderr.puts "[DEBUG-MODE]..."
PHP — error_log("[DEBUG-MODE]...")

All use stderr so debug output doesn't interfere with application output.

Iron Laws

These cannot be bypassed:

No fixes without reproduction first
No guessing — instrument and observe
Never hang the terminal — detach background processes
Protect your context — logs to file, then grep
Reset before fixing — git restore ., never manual edits

License

MIT

Author

Franz Enzenhofer