Stop guessing. Start debugging.
A Claude Code skill that brings hypothesis-driven, instrumented debugging to the terminal. Instead of guessing and patching, Claude Code will systematically reproduce your bug, add tagged logs, analyze runtime behavior, fix the root cause, and write a regression test.
The Problem
AI coding assistants guess at fixes. They see an error, pattern-match to a solution, and hope it works. When it doesn't, they guess again. This is expensive, slow, and produces fragile code.
The Solution
Debug Mode forces a disciplined loop:
Hypothesize → Reproduce → Instrument → Analyze → Reset → Fix → Verify
Every fix is backed by evidence. Every bug gets a regression test. No guessing.
How It Works
The 9-Phase Loop
| Phase | What Happens |
|---|---|
| 0. Safety | Create git branch + stash uncommitted work |
| 1. Discover | Scan project for test infra, scripts, recent changes |
| 2. Hypothesize | Generate 3-5 ranked theories with evidence criteria |
| 3. Reproduce | Write repro script/test and commit it |
| 4. Instrument | Add [DEBUG-MODE] tagged logs (dirtying the working tree) |
| 5. Analyze | Run reproduction, capture logs to file, grep intelligently |
| 6. Reset | git restore . — perfectly removes all debug logs |
| 7. Fix | Implement fix at root cause (not symptom) |
| 8. Verify | Red-to-green check using the committed repro |
The Commit-Then-Instrument Pattern
This is the key insight. LLMs are bad at surgically removing debug logs — they leave hanging brackets, break indentation, and corrupt syntax. Debug Mode solves this:
- Write reproduction test/script
- Commit it (
git commit -m "chore: add failing reproduction") - Add
[DEBUG-MODE]logs everywhere (dirty the working tree) - Analyze the logs
git restore .— instantly removes ALL instrumentation, perfectly- Your committed repro test survives. Your code is clean. Fix the bug.
No manual log deletion. No broken syntax. No cleanup mistakes.
Reproduction Strategies
Claude Code tries these automatically before asking you to do anything:
| Priority | Strategy | When to Use |
|---|---|---|
| 1 | Standalone script | Fastest — bypasses test framework config |
| 2 | Failing unit test | Gold standard when framework is easy to use |
| 3 | curl / HTTP | API bugs |
| 4 | Loop 50x | Race conditions, flaky bugs |
| 5 | Manual | Last resort only |
Hypothesis-Tagged Logging
All debug logs follow a strict format:
[DEBUG-MODE][H1-stale-cache][getUserProfile] cache_key=user:123 hit=true age=3600s
[DEBUG-MODE][H2-race-condition][updateUser] lock_acquired=false retry=1
[DEBUG-MODE]prefix makesgit restore .cleanup bulletproof[H<N>-name]lets you filter by hypothesis:grep '\[H1' debug.log[function]tells you where you are
Terminal Safety
AI agents hang when they start servers without detaching:
# WRONG — hangs the terminal forever npm run dev # RIGHT — detached with PID tracking nohup npm run dev > server.log 2>&1 & SERVER_PID=$! sleep 3 # ... do work ... kill $SERVER_PID
Context Window Protection
AI agents blow their context window on massive log dumps:
# WRONG — 5MB of logs floods context node repro.js # RIGHT — capture to file, query surgically node repro.js > debug.log 2>&1 grep '[DEBUG-MODE]' debug.log grep -C 2 'Exception' debug.log | head -n 50
Red-to-Green Verification
Every debug session ends with proof:
# 1. Run repro — MUST PASS (green) node repro.js # 2. Stash the fix git stash push -m "temp-verify" # 3. Run repro — MUST FAIL (red) node repro.js # 4. Restore git stash pop
If the test passes without the fix, it's a bad test. Rewrite it.
Install
One Command
mkdir -p ~/.claude/skills/debug-mode && \ curl -sL https://raw.githubusercontent.com/franzenzenhofer/debug-mode-skill/main/SKILL.md \ -o ~/.claude/skills/debug-mode/SKILL.md
Manual
- Copy
SKILL.mdto~/.claude/skills/debug-mode/SKILL.md - That's it. Claude Code auto-discovers skills in
~/.claude/skills/.
Verify
Start Claude Code and describe a bug. You should see:
"Entering Debug Mode. I will hypothesize, instrument, reproduce, analyze, fix, and verify."
Bug-Type Cheat Sheet
| Bug Type | Best Strategy |
|---|---|
| Logic error | Standalone script or failing test |
| API response wrong | curl |
| Race condition | Loop 50x |
| State corruption | Repro script |
| Auth/session | Repro script |
| Caching | Failing test |
| Flaky test | Loop 50x |
| CORS/headers | curl -v |
Language Support
Debug instrumentation patterns for:
- JavaScript/TypeScript —
console.error('[DEBUG-MODE]...') - Python —
print('[DEBUG-MODE]...', file=sys.stderr) - Go —
fmt.Fprintf(os.Stderr, "[DEBUG-MODE]...") - Ruby —
$stderr.puts "[DEBUG-MODE]..." - PHP —
error_log("[DEBUG-MODE]...")
All use stderr so debug output doesn't interfere with application output.
Iron Laws
These cannot be bypassed:
- No fixes without reproduction first
- No guessing — instrument and observe
- Never hang the terminal — detach background processes
- Protect your context — logs to file, then grep
- Reset before fixing —
git restore ., never manual edits
License
MIT