████████╗██████╗ ██╗██████╗ ██╗ ███████╗ ██████╗██╗ ██╗███████╗ ██████╗██╗ ██╗
╚══██╔══╝██╔══██╗██║██╔══██╗██║ ██╔════╝██╔════╝██║ ██║██╔════╝██╔════╝██║ ██╔╝
██║ ██████╔╝██║██████╔╝██║ █████╗ ██║ ███████║█████╗ ██║ █████╔╝
██║ ██╔══██╗██║██╔═══╝ ██║ ██╔══╝ ██║ ██╔══██║██╔══╝ ██║ ██╔═██╗
██║ ██║ ██║██║██║ ███████╗███████╗╚██████╗██║ ██║███████╗╚██████╗██║ ██╗
╚═╝ ╚═╝ ╚═╝╚═╝╚═╝ ╚══════╝╚══════╝ ╚═════╝╚═╝ ╚═╝╚══════╝ ╚═════╝╚═╝ ╚═╝
Three AI agents review your code — for free, on your hardware.
Deep review, not shallow lint. Local LLMs = unlimited passes, zero cost.
Comparison · Quick Start · How It Works · Configuration · Roadmap
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ │ │ │ │ │ │ │
│ REVIEWER │──────► │ CODER │──────► │ TESTS │──────► │ JUDGE │
│ │ │ │ │ │ │ │
│ finds │ │ fixes │ │ verifies │ │ scores │
│ bugs │ │ code │ │ nothing │ │ quality │
│ │ │ │ │ broke │ │ 0 — 10 │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
│ │
│ ◄── loop until clean ── │
└───────────────────────────────────────┘
| | Feature | Description |
|---|---|---|
| 💰 | $0 API cost | Run Qwen, DeepSeek, or Llama locally via vLLM / Ollama / LM Studio |
| 🔀 | Mix any models | Local Qwen for reviewer + cloud Claude for judge — any combination |
| 🔧 | Real fixes, not lint | Produces actual code patches, applied and verified each round |
| 📦 | Scan entire repos | Auto-splits large codebases into review units, prioritizes by complexity |
| 🌐 | Language agnostic | Python, Go, Rust, TypeScript, Java, and more |
| 🧠 | Multi-pass voting | N review passes × different angles → vote to filter noise |
| 🔌 | Any LLM backend | vLLM, Ollama, LM Studio, OpenRouter, DeepSeek, OpenAI, Claude |
🏆 Comparison
How triplecheck stacks up against popular AI code review tools:
| | triplecheck | CodeRabbit | PR-Agent (Qodo) | Sourcery | Ellipsis |
|---|---|---|---|---|---|
| Open source | ✅ MIT | ❌ SaaS | ✅ Apache-2.0 | ❌ Freemium | ❌ SaaS |
| Run locally / self-host | ✅ | ❌ | ✅ | ❌ | ❌ |
| Use your own models | ✅ Any LLM | ❌ Fixed backend | ❌ Fixed | ❌ Fixed | |
| $0 with local LLMs | ✅ | ❌ $24/mo | ❌ | ❌ $36/mo | |
| Auto-fix code | ✅ Coder agent writes patches | ❌ Suggestions only | ❌ Suggestions only | ✅ Implements fixes | |
| Review → Fix → Test loop | ✅ Multi-round | ❌ Single pass | ❌ Single pass | ❌ Single pass | ❌ Single pass |
| Judge / scoring | ✅ 0–10 verdict | ❌ | ❌ | ❌ | ❌ |
| Multi-pass voting | ✅ N passes, deduplicate | ❌ | ❌ | ❌ | ❌ |
| Layered review | ✅ arch/interface/logic/security | ❌ | ❌ | ❌ | ❌ |
| CI test gate | ✅ Auto-runs tests | ❌ | ❌ | ❌ | ❌ |
| Repo-wide scan | ✅ Auto-split + resume | ❌ PR-scoped | ❌ PR-scoped | ❌ PR-scoped | ❌ PR-scoped |
| Tree-sitter dep graph | ✅ Smart batching | ❌ | ❌ | ❌ | ❌ |
| GitHub PR integration | 🔜 Roadmap | ✅ | ✅ | ✅ | ✅ |
| Incremental (diff-only) | 🔜 Roadmap | ✅ | ✅ | ✅ | ✅ |
| PR summary | 🔜 Roadmap | ✅ | ✅ | ✅ | ✅ |
| IDE extension | 🔜 Roadmap | ✅ VS Code | ❌ | ✅ VS Code | ❌ |
| In-PR chat | ❌ | ✅ @coderabbit | ✅ /ask | ❌ | ❌ |
| SAST integrations | ✅ 40+ tools | ❌ | ❌ | | |
| Learning from feedback | ❌ | ✅ | ❌ | ✅ | ❌ |
TL;DR — triplecheck has the deepest review engine (multi-round fix loop, voting, layered review, test gate, judge scoring) and is the only tool that runs 100% free on your own hardware. The gap is GitHub integration — coming soon.
🚀 Quick Start
Install
pip install triplecheck
# Optional: smart file grouping via tree-sitter dependency graph
pip install triplecheck[graph]

3 ways to run
🏠 Local (free)

# Start any OpenAI-compatible server
vllm serve Qwen/Qwen3-Coder

# Review!
triplecheck \
  --target ./my-project \
  --skip-ci

No API keys. No cost. Unlimited.

☁️ Cloud API

export DEEPSEEK_API_KEY=sk-...

triplecheck \
  --target ./my-project \
  --skip-ci

Fast setup, pay per token.

🔀 Hybrid (recommended)

# Local finds + fixes, cloud judges
triplecheck \
  --target ./my-project \
  --reviewer qwen-local \
  --coder qwen-local \
  --judge claude-opus \
  --skip-ci

Best quality at minimal cost.
See examples/ for complete config files: config.local.yml · config.hybrid.yml · config.cloud.yml
⚙️ How It Works
┌─────────────────────────────────────────────────────────────┐
Your Code │ Review Pipeline │
│ │ │
▼ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
┌─────────┐ │ │ Reviewer │───▶│ Coder │───▶│ Tests │──▶ Round N │
│ Discover│──▶ Batch ───▶│ │ (LLM) │ │ (LLM) │ │ (local) │ │ │
│ Files │ │ └──────────┘ └──────────┘ └──────────┘ │ │
└─────────┘ │ ▲ │ │ │
│ └──────── more findings? ◀───────┘ │ │
│ converged? │
│ │ │
│ ┌──────────┐ │
│ │ Judge │ │
│ │ (LLM) │ │
│ └────┬─────┘ │
└───────────────────────────────────────────────────┼────────┘
│
▼
📄 Report (JSON + MD)
Score: 8.5/10 ✅
The Loop
- Reviewer reads your code in batches, outputs structured findings (file, line, severity, fix suggestion)
- Coder receives each finding, writes the actual fix (full file output), or rejects false positives with reasoning
- Tests run automatically (pytest, go test, npm test, cargo test) — if they fail, the round stops
- Repeat until no new findings or max rounds reached
- Judge evaluates the entire session history and scores 0–10
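The loop above can be sketched in a few lines of Python. This is a minimal illustration, not triplecheck's internals: `review`, `fix`, `run_tests`, and `judge` are injected stand-ins for the real LLM agents and local test runner.

```python
def review_session(files, review, fix, run_tests, judge, max_rounds=4):
    """Sketch of the Reviewer -> Coder -> Tests -> Judge loop.

    review/fix/run_tests/judge are placeholder callables standing in
    for the real agents; the actual pipeline is more involved.
    """
    history = []
    for round_no in range(1, max_rounds + 1):
        findings = review(files)          # Reviewer emits structured findings
        if not findings:                  # converged: nothing new to fix
            break
        files = fix(files, findings)      # Coder applies patches
        history.append((round_no, findings))
        if not run_tests(files):          # test gate: a failure stops the round
            break
    return judge(history)                 # Judge scores the whole session


# Toy run: one finding in round 1, none afterwards, so the loop converges.
issues = [{"file": "app.py", "line": 3, "severity": "error"}]
verdict = review_session(
    files={"app.py": "code"},
    review=lambda f: issues if "fixed" not in f["app.py"] else [],
    fix=lambda f, fs: {"app.py": f["app.py"] + " fixed"},
    run_tests=lambda f: True,
    judge=lambda h: {"rounds": len(h), "score": 8.5},
)
```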
Concepts
| Concept | What it is |
|---|---|
| Finding | A single issue: file, line, severity, suggested fix |
| Batch | A group of related files sent to the Reviewer in one call |
| Round | One full Reviewer → Coder → Tests cycle |
| Session | A complete review (multiple rounds until convergence) |
| Unit | A logical module in scan mode (package/directory) |
| Scan | Full repo review — splits into units, runs sessions, aggregates |
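For illustration, a Finding can be modeled as a small dataclass whose fields mirror the table above (the real internal types may differ):

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One issue reported by the Reviewer (illustrative; real fields may differ)."""
    file: str
    line: int
    severity: str        # "error" | "warning" | "suggestion"
    message: str
    suggested_fix: str


f = Finding(
    file="auth.py",
    line=42,
    severity="error",
    message="password compared with ==",
    suggested_fix="use hmac.compare_digest",
)
```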
🔧 Configuration
All config lives in config.yml — three sections:
# ── 1. Define available models ──────────────────────────────────────
models:
  qwen-local:
    provider: openai-compat
    base_url: http://localhost:8000   # vLLM / Ollama / LM Studio
    model: Qwen/Qwen3-Coder
    max_tokens: 16384                 # coder needs room for full files
    temperature: 0.1
  claude-opus:
    provider: claude-cli
    model: opus
  deepseek:
    provider: openai-compat
    base_url: https://api.deepseek.com
    model: deepseek-coder
    api_key_env: DEEPSEEK_API_KEY     # reads from environment variable

# ── 2. Assign roles → models (swap these freely) ───────────────────
assignments:
  reviewer: qwen-local   # fast local model finds issues
  coder: qwen-local      # fast local model writes fixes
  judge: claude-opus     # strong model scores quality

# ── 3. Pipeline behavior ─────────────────────────────────────────
pipeline:
  max_rounds: 4                  # max review-fix iterations
  batch_max_lines: 800           # lines per review batch
  severity_threshold: warning    # error | warning | suggestion
  auto_style_fix: true           # run ruff/black after fixes
Providers
| Provider | Config key | Compatible with |
|---|---|---|
| OpenAI-compatible | openai-compat | vLLM, Ollama, LM Studio, DeepSeek, OpenRouter, OpenAI, any /v1/chat/completions |
| Claude CLI | claude-cli | Claude Opus, Sonnet, Haiku via claude CLI |
| Codex CLI | codex-cli | OpenAI Codex via codex CLI |
📦 Scan Mode
Review entire repositories — triplecheck auto-splits into logical units:
# Preview the plan (instant, no LLM calls)
triplecheck --target ~/work/big-repo --scan --plan-only --include "**/*.go"

# Review top 5 most complex modules
triplecheck --target ~/work/big-repo --scan --max-units 5 --skip-ci \
  --include "**/*.go" --exclude "vendor/*"

# Resume a crashed scan
triplecheck --target ~/work/big-repo --scan --resume <scan_id> --skip-ci
🎯 Review Modes
Three mutually exclusive strategies — pick one:
Single Pass (default)

One pass. Fast.

pipeline:
  review_passes: 1

Best for small projects.

Multi-Pass + Vote

N passes × different angles. Findings voted on. Noise filtered.

pipeline:
  review_passes: 3
  review_min_votes: 2
  only_high_confidence: true

Best with free local models — run 5 passes, let votes filter noise.

Layered Review

Each layer sees only relevant context. Non-overlapping coverage.

pipeline:
  review_layers:
    - architecture
    - interface
    - logic
    - security

Best for large codebases.
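Conceptually, multi-pass voting keeps only findings that at least review_min_votes independent passes agree on. A minimal sketch, where keying findings by (file, line, severity) is an assumed deduplication strategy, not necessarily triplecheck's:

```python
from collections import Counter


def vote_filter(passes, min_votes=2):
    """Keep findings reported by at least min_votes independent review passes.

    passes: list of passes, each a list of (file, line, severity) tuples.
    Keying on (file, line, severity) is an illustrative dedup choice.
    """
    counts = Counter(f for findings in passes for f in set(findings))
    return [f for f, n in counts.items() if n >= min_votes]


# Three passes: the auth.py finding appears twice, the others once each.
passes = [
    [("auth.py", 10, "error"), ("util.py", 5, "suggestion")],
    [("auth.py", 10, "error")],
    [("db.py", 77, "warning")],
]
kept = vote_filter(passes, min_votes=2)
```

Only the finding reported by two passes survives; single-pass noise is dropped.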
Stackable Enhancements
Enable on top of any review mode:
| | Feature | Config | What it does |
|---|---|---|---|
| 🔍 | Static Analysis | static_analysis: true | Pre-screens with ruff / golint / eslint |
| 🧠 | Cross-Round Knowledge | knowledge_accumulation: true | Extracts recurring patterns → injects into next round |
| 🌳 | Smart Grouping | smart_grouping: true | Tree-sitter dep graph batches related files together |
🤖 Supported Models
| Model | Provider | Speed | Quality | Notes |
|---|---|---|---|---|
| Qwen3-Coder | openai-compat | ⚡⚡⚡ | ★★★★ | Best free option. Set max_tokens ≥ 16384. |
| DeepSeek Coder | openai-compat | ⚡⚡⚡ | ★★★★ | Cloud API, very cheap |
| Llama 3.3 70B | openai-compat | ⚡⚡ | ★★★ | Needs ~40GB VRAM |
| Claude Opus | claude-cli | ⚡ | ★★★★★ | Best as judge in hybrid setups |
| Claude Sonnet | claude-cli | ⚡⚡ | ★★★★ | Good all-rounder |
| GPT-4o | openai-compat | ⚡⚡ | ★★★★ | Via OpenAI or OpenRouter |
Tip: Any model that serves /v1/chat/completions works. The table above is just what we've tested.
💡 Local LLM Tips
| Tip | Details |
|---|---|
| Recommended model | Qwen3-Coder or DeepSeek-Coder for reviewer/coder roles |
| Token budget | Set max_tokens ≥ 16384 for coder — it outputs full file contents |
| Thinking tags | If your model emits <think>...</think>, triplecheck auto-strips them |
| NL fallback | If JSON parsing fails, findings are extracted from natural language |
| vLLM flags | --max-model-len 32768 --enable-prefix-caching for best throughput |
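The thinking-tag stripping mentioned above amounts to deleting <think>...</think> spans before parsing the model's output. A minimal regex sketch (not triplecheck's actual implementation):

```python
import re

# DOTALL so the pattern spans multi-line thinking blocks
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_thinking(text: str) -> str:
    """Drop <think>...</think> blocks that some local models emit before the answer."""
    return THINK_RE.sub("", text).strip()


raw = "<think>user wants JSON...\nlet me check line 3</think>{\"findings\": []}"
clean = strip_thinking(raw)
```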
📋 CLI Reference
| Flag | Description |
|---|---|
| --target PATH | Project directory to review (required) |
| --config PATH | Config file path (default: ./config.yml) |
| --reviewer MODEL | Override reviewer model |
| --coder MODEL | Override coder model |
| --judge MODEL | Override judge model |
| --max-rounds N | Max review rounds |
| --include PATTERN | File glob include (repeatable) |
| --exclude PATTERN | File glob exclude (repeatable) |
| --skip-tests | Exclude test files from review |
| --ci-cmd COMMAND | Custom test command |
| --skip-ci | Skip test gate entirely |
| --batch-max-lines N | Max lines per review batch |
| --output PATH | Report output directory (default: ./reports/) |
| --scan | Split repo into units and review each |
| --plan-only | Show scan plan only, no LLM calls |
| --max-units N | Review top N units by priority |
| --resume SCAN_ID | Resume a previous scan |
📄 Output
Reports are saved to ./reports/:
| File | Contents |
|---|---|
| <session_id>.json | Full session state — all rounds, findings, fixes, verdict |
| <session_id>.md | Human-readable report with findings table, fixes, test results, judge verdict |
| scan_<id>.json | All unit sessions combined (scan mode) |
| scan_<id>.md | Overview table, per-unit summaries, aggregate score (scan mode) |
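Because reports are plain JSON, they are easy to post-process. A hedged sketch: the "verdict" and "score" keys below are assumed field names, so inspect a real <session_id>.json to confirm the schema before relying on it.

```python
import json
from pathlib import Path


def session_score(report_path):
    """Pull the judge score out of a session report.

    The "verdict"/"score" keys are assumptions about the schema;
    check an actual report file for the real field names.
    """
    data = json.loads(Path(report_path).read_text())
    return data.get("verdict", {}).get("score")


# Toy report standing in for ./reports/<session_id>.json
Path("demo_session.json").write_text(json.dumps({"verdict": {"score": 8.5}}))
score = session_score("demo_session.json")
```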
🔌 Adding a Provider
# triplecheck/providers/my_provider.py
from triplecheck.providers.base import BaseProvider

class MyProvider(BaseProvider):
    def review(self, files, prompt, **kwargs): ...    # → list[Finding]
    def fix(self, file, findings, prompt, **kwargs): ...    # → FixResult
    def judge(self, session, prompt, **kwargs): ...    # → Verdict
Then register in triplecheck/roles.py → PROVIDER_MAP and add model entries in config.yml.
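The registration step might look like the following. This is a sketch: the exact shape of PROVIDER_MAP in triplecheck/roles.py (keys, values) is an assumption, and the built-in provider class names here are placeholders.

```python
class OpenAICompatProvider: ...   # stand-ins for the built-in providers
class ClaudeCLIProvider: ...
class MyProvider: ...             # your provider from the snippet above


# Sketch of the triplecheck/roles.py registration; the real map may
# differ, so mirror whatever entries already exist there.
PROVIDER_MAP = {
    "openai-compat": OpenAICompatProvider,
    "claude-cli": ClaudeCLIProvider,
    "my-provider": MyProvider,    # config "provider: my-provider" resolves here
}
```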
🗺️ Roadmap
P0 — Next Up
- GitHub PR integration — GitHub Action + post review comments via gh api, line-by-line annotations
- Incremental diff-only review — parse git diff, send only changed lines + context to the LLM (saves tokens, more precise)
- PR summary / walkthrough — auto-generate a changelog-style summary for each review session
P1 — On Deck
- GitHub Action template — drop-in .github/workflows/triplecheck.yml for any repo
- SARIF output — --format sarif for GitHub Code Scanning / Security tab integration
- Repo-level config — .triplecheck.yml auto-discovered in repo root
- Ignore rules — .triplecheck-ignore to suppress known false positives by pattern
P2 — Future
- VS Code extension — trigger review from IDE, show findings inline
- Web report viewer — interactive HTML report with filtering and navigation
- GitLab / Bitbucket support — platform-agnostic PR integration
- Semgrep integration — custom SAST rules alongside LLM review
- Learning from feedback — track dismissed findings, auto-suppress recurring false positives
Have an idea? Open an issue or send a PR.