This project is a port of defending-code-reference-harness — Anthropic's Claude Code reference harness for autonomous vulnerability discovery and remediation — adapted to run on GitHub Copilot. It keeps the original's design: a set of interactive skills plus an autonomous, sandboxed pipeline that finds, verifies, reports, and patches memory-safety bugs in C/C++ — but the agent driver is the Copilot CLI (copilot -p), the skills are Copilot agent skills, and auth/egress/tooling target GitHub Copilot instead of the Anthropic API.
Attribution. Ported from anthropics/defending-code-reference-harness, used under the Apache-2.0 license. See NOTICE for the required attribution and a summary of changes, and LICENSE for the license. The upstream project demonstrates the methodology with Claude Code; the original write-up and learnings are in docs/blog-post.md. For the complete Claude Code → GitHub Copilot mapping and the as-built record of this port, see PORTING-PLAN.md.
This is a reference, not a product: the general shape, prompts, and sandboxing are reusable, but the harness will not work on every codebase out of the box. Run /customize to port it to your language, detector, or vuln class.
Prerequisites
- GitHub Copilot CLI installed and authenticated:

  ```bash
  npm install -g @github/copilot@1.0.60   # pinned version; see harness/agent_image.py
  copilot login                           # or set COPILOT_GITHUB_TOKEN / GH_TOKEN / GITHUB_TOKEN
  ```

  A GitHub Copilot subscription is required. For headless/automation use, set COPILOT_GITHUB_TOKEN (a fine-grained PAT with the "Copilot Requests" permission, or a Copilot/gh OAuth token). Classic PATs (ghp_) are not supported.
- Docker (for the autonomous pipeline) and, on Linux, gVisor (runsc) for the agent sandbox. scripts/setup_sandbox.sh installs gVisor and builds the images.
- Python 3.11+ for the vuln-pipeline orchestrator.
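For headless use, the prerequisites above can be checked up front. The following is a minimal sketch, not part of the harness: `preflight` and `find_token` are hypothetical helpers, and the order in which the three token variables are consulted is an assumption (the README only says one of them must be set).

```python
import shutil
import sys

# Token variables named in the prerequisites. The lookup order is an assumption.
TOKEN_VARS = ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")


def find_token(env):
    """Return (variable name, value) for the first token variable that is set."""
    for name in TOKEN_VARS:
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None


def preflight(env, which=shutil.which, version=sys.version_info, platform=sys.platform):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if tuple(version[:2]) < (3, 11):
        problems.append("Python 3.11+ is required for the orchestrator")
    tools = ["copilot", "docker"]
    if platform.startswith("linux"):
        tools.append("runsc")  # gVisor runtime, needed on Linux only
    for tool in tools:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    token = find_token(env)
    if token is None:
        problems.append("no token set: " + " / ".join(TOKEN_VARS))
    elif token[1].startswith("ghp_"):
        problems.append(f"{token[0]} looks like a classic PAT (ghp_), which is not supported")
    return problems
```

Calling `preflight(os.environ)` (after `import os`) returns the list of gaps for the current machine. The `which`, `version`, and `platform` parameters exist only so the check can be exercised without touching the real system.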
Contents
- GitHub Copilot agent skills (.github/skills/): /quickstart, /threat-model, /vuln-scan, /triage, /patch, /customize. These cover interactive scoping, scanning, triage, and patching. Open this repo with the Copilot CLI and run /quickstart to get oriented. (They're slash-only: each skill sets disable-model-invocation: true, so Copilot won't auto-fire them.)
- harness/: the autonomous reference pipeline (recon → find → verify → report → patch), configured for finding C/C++ memory vulnerabilities using Docker and ASAN. Each agent is a headless copilot -p session inside its own gVisor container.
⚠️ Security: /quickstart, /threat-model, /vuln-scan, and /triage only read and write files. Running /patch on static findings (TRIAGE.json or VULN-FINDINGS.json) is likewise read- and write-only. /customize edits the harness code and runs validation commands. All of these skills are safe to run unsandboxed, as long as you review and approve each tool use in the Copilot CLI. The autonomous reference pipeline (including /patch on pipeline results) executes target code, so it refuses to run outside of a gVisor sandbox unless explicitly overridden. To get set up, run scripts/setup_sandbox.sh once, then invoke the pipeline via bin/vp-sandboxed. See docs/security.md and docs/agent-sandbox.md.
Getting Started
```bash
git clone <your-repo-url>   # the repository you pushed this port to
cd defending-code-reference-harness
copilot
# 30-sec intro + guided first run on the canary target
> /quickstart
> /quickstart how do I port the pipeline to Java?
> /quickstart how do I triage all these bugs?
```
Further Reading
- Porting plan · The Claude Code → GitHub Copilot mapping + as-built record
- Agent guide · Operator guide (Copilot loads it as custom instructions)
- Blog Post · The original project's write-up: learnings + best practices (provenance)
- Pipeline · How it works: diagram, stages, CLI flags
- Security · Sandboxing, what not to mount
- Agent sandbox · gVisor isolation + egress allowlist for every agent
- Customize · Port to my stack; which files change and why
- Patching · Generate and verify fixes for verified crashes
- Troubleshooting · Duplicates, rate limits, subagent model pinning
Ramp Up
The teams that succeed with this methodology are the ones that get hands-on the fastest. It's tempting to spend months designing the perfect pipeline; instead, start small on Day 1 and build from there as learnings come. The steps below follow that pattern and set an ambitious but reasonable pace.
| Step | Timeline | Goal |
|---|---|---|
| Step 1 | Day 1 | Build a threat model and run your first static scan + triage |
| Step 2 | Day 2 | Run the reference pipeline on a C/C++ library |
| Step 3 | Days 3-5 | Customize the pipeline for your target |
| Step 4 | Week 2 | Start autonomous scanning, triage, and patching |
Step 1 (Day 1): Build a threat model and run your first static scan + triage
Day 1 is focused on seeing the whole loop end-to-end. Using only the interactive skills, you'll build a threat model, run a static scan scoped by it, triage what comes back, and draft candidate fixes. You'll finish the day with a threat model, a ranked list of static findings, and candidate patches.
The relevant skills only read and write files in your repo. As long as you run the Copilot CLI interactively and approve each tool use, no sandbox is needed.
```bash
# Optional: pin the session model (skill subagents inherit it).
export COPILOT_MODEL=claude-sonnet-4.6   # or gpt-5.4, gpt-5.3-codex, gemini-3.1-pro-preview, …

copilot
# 0. intro + guided first run
> /quickstart
# 1. Build a threat model (aim before you shoot)
> /threat-model bootstrap targets/canary
# 2. Run a static scan, scoped by that threat model
> /vuln-scan targets/canary
# 3. Verify, dedupe, and rank what came back
> /triage targets/canary/VULN-FINDINGS.json
# 4. Generate candidate fixes for the verified findings
> /patch ./TRIAGE.json --repo targets/canary
```
This flow produces THREAT_MODEL.md, VULN-FINDINGS.{json,md},
TRIAGE.{json,md}, and PATCHES/.
The vulnerability candidates produced in Step 1 come from the model's static review of the source (nothing is built or run), so expect more false positives on any non-canary targets. In Step 2, you'll produce execution-verified findings.
Note: on the canary target, /triage may dismiss the scan's findings as false positives. entry.c announces itself as deliberately vulnerable demo code, and /triage correctly excludes bugs in test / fixture code. To see the full confirm / dedupe / false positive flow, run it on the curated fixture instead (/triage .github/skills/triage/fixtures/canary-findings.json --repo targets/canary) or point the Step 1 skills at your own code.
Step 2 (Day 2): Run the reference pipeline on a C/C++ library
On Day 2, you'll move from interactive skills to your first autonomous run using the reference pipeline. You'll run the full recon → find → verify → report loop in your environment on a known-vulnerable open-source library, then generate a candidate patch for what it finds. You'll finish with a set of reproducible crashes, exploitability reports, and candidate patches, along with a feel for how the pipeline works.
Running the pipeline is simple:
```bash
# One-time setup
python3 -m venv .venv && .venv/bin/pip install -e .
./scripts/setup_sandbox.sh   # installs gVisor, builds the agent images, and verifies isolation; needs Docker (Linux host)
export COPILOT_GITHUB_TOKEN=github_pat_...   # or GH_TOKEN / GITHUB_TOKEN; the pipeline requires one in env

# Run the recon → find → verify → report loop
bin/vp-sandboxed run drlibs --model claude-sonnet-4.6 --runs 3 --parallel --stream --auto-focus

# Generate a candidate patch for each finding
bin/vp-sandboxed patch results/drlibs/<timestamp>/ --model claude-sonnet-4.6

# Or, ask the Copilot CLI to launch the pipeline and watch the run for you
copilot
> run the pipeline on drlibs and explain findings as they come
```
Results from the loop land in a results/drlibs/<timestamp>/ directory. With
the --stream flag, the first report will appear in minutes under reports/bug_NN/.
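To keep an eye on a streaming run from a script, the results layout described above can be walked directly. This is a small sketch with hypothetical helper names (`latest_run`, `list_reports`). It assumes the timestamp directory names sort chronologically as strings, which this README does not specify, and it relies only on the `results/<target>/<timestamp>/reports/bug_NN/` layout mentioned above.

```python
from pathlib import Path


def latest_run(results_root, target):
    """Return the newest run directory under <results_root>/<target>/, or None.

    Assumption: timestamp directory names sort chronologically as strings.
    """
    target_dir = Path(results_root) / target
    if not target_dir.is_dir():
        return None
    runs = sorted(p for p in target_dir.iterdir() if p.is_dir())
    return runs[-1] if runs else None


def list_reports(run_dir):
    """Return the bug_NN report directories of one run, sorted by name."""
    reports_dir = Path(run_dir) / "reports"
    if not reports_dir.is_dir():
        return []
    return sorted(p for p in reports_dir.glob("bug_*") if p.is_dir())
```

For example, `list_reports(latest_run("results", "drlibs"))` would list the reports written so far by the most recent drlibs run, provided at least one run directory exists.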
⚠️ run spawns autonomous agents. The pipeline runs each agent inside a gVisor container with egress restricted to the GitHub/Copilot API. Agent-spawning subcommands refuse to start outside it unless explicitly overridden. For more information, see docs/security.md and docs/agent-sandbox.md.
Under the hood, the pipeline walks through seven stages:
- Build: Compiles the target into a Docker image with ASAN (the memory error detector for C and C++). The pipeline builds this image automatically on first run using the target's Dockerfile.
- Recon: A lightweight agent reads the source inside a network-isolated container and proposes a partition, i.e., "here are N distinct input-parsing subsystems worth attacking separately", so that parallel find agents explore different areas instead of converging on the same bug. Without the --auto-focus flag, the pipeline uses the focus_areas list from the target's config.yaml.
- Find: N agents run in parallel, each in its own isolated container. Each agent reads the source, crafts malformed inputs, and runs the ASAN binary until a given input produces a crash 3 out of 3 times.
- Verify: A separate grader agent reproduces each crash in a fresh container that the find agent hasn't touched. The only thing that crosses over from the find agent to the grader is the proof of concept it produced.
- Dedupe: A judge agent compares verified crashes against bugs already reported and decides whether each is a new bug, a better example of a known bug, or a duplicate to skip.
- Report: A report agent writes a structured exploitability analysis per unique bug, including details on primitive class, reachability, escalation path, and severity.
- Patch (the separate patch command above): A patch agent writes a proposed fix, and a grader agent confirms that the new code builds, that the original proof of concept input no longer crashes, that the target's test suite still passes, and that a fresh find agent can't find a way around the fix.
For more details, see docs/pipeline.md.
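The Find stage's "3 out of 3" rule can be illustrated with a short sketch. This is not the harness's implementation: `run_once` and `reproduces` are hypothetical helpers, and the sketch assumes the target binary takes the input file path as its only argument. The marker string is the prefix AddressSanitizer prints at the start of an error report.

```python
import subprocess

# Prefix that AddressSanitizer prints at the start of an error report.
ASAN_MARKER = "ERROR: AddressSanitizer"


def run_once(binary, poc_path, timeout=30):
    """Run the target once on the input and return its stderr text.

    Assumption: the binary takes the input file path as its only argument.
    """
    try:
        proc = subprocess.run(
            [binary, poc_path],
            capture_output=True,
            text=True,
            errors="replace",
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return ""  # a hang is not counted as an ASAN crash in this sketch
    return proc.stderr


def reproduces(binary, poc_path, attempts=3, runner=run_once):
    """Return True only if every attempt yields an ASAN report (3 of 3 by default)."""
    for _ in range(attempts):
        if ASAN_MARKER not in runner(binary, poc_path):
            return False
    return True
```

Requiring every attempt to crash filters out flaky inputs before they reach the Verify stage, at the cost of dropping crashes that only trigger intermittently.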
Step 3 (Days 3-5): Customize the pipeline for your target
On Days 3-5, you'll customize the harness for your own target. First, you'll
point the Step 1 skills at your code, then you'll use /customize to port the
pipeline to your stack. By the end of the week, you'll have a targets/<your-service>/
directory that the pipeline can run against, validated with a single smoke run
of the pipeline, and ready to scale up in Step 4.
While the reference pipeline is designed for finding memory vulnerabilities in C and C++ code, its shape is generic. Porting it to a new vuln class or language just means answering the following questions for your target stack:
| Question | C/C++ Reference | Your target (examples) |
|---|---|---|
| What signals a finding? | ASAN crash signature | exception / canary file / DNS callback |
| What does a proof of concept look like? | crashing input file | HTTP request sequence / tx list / test harness |
| How is the target built and run? | Dockerfile (using clang + ASAN) | your language's build in a container |
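One way to think about the three questions is as a small record per stack. The sketch below is purely illustrative: `TargetProfile` and both detector functions are hypothetical and do not correspond to types in the harness, and the Java row is just one possible answer taken from the examples column.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TargetProfile:
    """Answers to the three porting questions for one stack (hypothetical type)."""
    name: str
    is_finding: Callable[[str], bool]  # What signals a finding?
    poc_kind: str                      # What does a proof of concept look like?
    build: str                         # How is the target built and run?


def asan_crash(output: str) -> bool:
    """C/C++ reference signal: an AddressSanitizer error report in the output."""
    return "ERROR: AddressSanitizer" in output


def uncaught_java_exception(output: str) -> bool:
    """Illustrative JVM signal: an uncaught exception in the main thread."""
    return re.search(r'^Exception in thread "main" ', output, re.MULTILINE) is not None


C_REFERENCE = TargetProfile(
    "c-cpp", asan_crash, "crashing input file", "Dockerfile (clang + ASAN)"
)
JAVA_EXAMPLE = TargetProfile(
    "java", uncaught_java_exception, "HTTP request sequence",
    "your language's build in a container",
)
```

Writing the answers down in this form before running /customize makes it easier to spot which one is still vague, typically the finding signal.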
Before customizing, point the Step 1 skills at your own code. As a reminder, they're read- and write-only, so they can run unsandboxed.
```bash
copilot
> /quickstart how do I customize this for ~/code/my-service?
> /threat-model bootstrap-then-interview ~/code/my-service
> /vuln-scan ~/code/my-service
> /triage ~/code/my-service/VULN-FINDINGS.json --repo ~/code/my-service
```
Then, use the artifacts produced by those skills in the /customize skill,
which modifies the harness for your codebase.
```bash
> /customize use ~/code/my-service/{THREAT_MODEL.md,VULN-FINDINGS.json} and ./TRIAGE.md
```
When /customize is done, you'll have a targets/my-service/ directory
set up. Validate it with a smoke run of the pipeline before scaling up.
```bash
bin/vp-sandboxed run my-service --model claude-sonnet-4.6 --runs 1
```
For more details, see docs/customizing.md.
Step 4 (Week 2): Start autonomous scanning, triage, and patching
In Week 2, you'll use the pipeline you customized in Step 3 on your own targets, adding an outer loop around the inner pipeline loop: run multiple pipeline scans, triage the findings from across those runs, patch based on prioritization, and repeat.
```bash
# Scan - run a wave of parallel runs against your target
bin/vp-sandboxed run my-service --model claude-sonnet-4.6 --runs 5 --parallel --stream --auto-focus

# Triage - dedupe and rank every finding across all waves using your threat model
> /triage results/my-service/ --repo ~/code/my-service --auto --votes 5

# Patch - generate and validate fixes, starting with what triage ranked the highest
> /patch results/my-service/<timestamp>/ --model claude-sonnet-4.6
```
⚠️ Follow the same sandboxing guidelines as in Step 2
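The scan half of the outer loop is easy to script. The sketch below is hypothetical: it only assembles the `bin/vp-sandboxed run` command with flags that appear in this README, and it assumes the wrapper exits non-zero when a wave fails, which is not documented here. Triage and patching stay in the Copilot CLI as interactive skills.

```python
import subprocess


def scan_command(target, model, runs, wrapper="bin/vp-sandboxed"):
    """Build the argv for one scan wave, using only flags shown in this README."""
    return [
        wrapper, "run", target,
        "--model", model,
        "--runs", str(runs),
        "--parallel", "--stream", "--auto-focus",
    ]


def run_waves(target, model, waves, runs_per_wave, execute=subprocess.run):
    """Run scan waves one after another and return how many completed.

    Assumption: a non-zero exit code means the wave failed, so the loop stops.
    """
    completed = 0
    for _ in range(waves):
        result = execute(scan_command(target, model, runs_per_wave))
        if result.returncode != 0:
            break
        completed += 1
    return completed
```

Running waves sequentially leaves room to triage and patch between them, which is what keeps later waves from rediscovering the same bugs.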
A given pipeline run already verifies and deduplicates its own findings.
/triage works across many pipeline runs. When pointed at the results/
directory, it collapses duplicates across all runs (and any static findings
from /vuln-scan if present), recalibrates severity ratings against your
threat model, and attempts to route every finding to the component owner.
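The cross-run collapsing idea can be shown with a toy example. /triage is an agent skill that makes these calls with model judgment. The sketch below only illustrates the shape of the operation with an exact-match key, and the `signature`, `severity`, and `run` fields are hypothetical rather than the harness's finding schema.

```python
# Hypothetical severity scale, lowest to highest.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def collapse(findings):
    """Keep one finding per signature, preferring the highest severity seen."""
    best = {}
    for finding in findings:
        key = finding["signature"]
        current = best.get(key)
        if current is None or (
            SEVERITY_ORDER[finding["severity"]] > SEVERITY_ORDER[current["severity"]]
        ):
            best[key] = finding
    # Highest severity first. sorted() is stable, so ties keep first-seen order.
    return sorted(
        best.values(),
        key=lambda f: SEVERITY_ORDER[f["severity"]],
        reverse=True,
    )
```

A real triage pass also has to recognise the same root cause behind different crash signatures, which is where exact-match keys fall short and judgment comes in.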
Patching findings quickly, where you can, keeps the outer loop productive.
Once a finding is fixed, the model can't re-find it and will instead surface
net-new, typically deeper issues. As you run more pipeline waves, the number
of findings will likely go down while their complexity goes up. If quick
patching isn't possible, even just recording prior findings in the target's
known_bugs can help steer future runs toward newer bugs.
Autonomous triage and patching are still open problems, and this reference
harness doesn't fully solve them. The verification strategies in /patch
help raise the bar, but severity and prioritization are ultimately
judgments about your environment, and verified patches are not always
upstreamable. Budget real engineering time for these steps.
Looking Forward
After the initial ramp up, common directions to invest in:
- Reviewing all internal repos and key open-source dependencies, ranking which are the most important to scan (e.g., based on exposure, history of CVEs, business-criticality), then working through the list in priority order.
- Setting up bespoke infrastructure for scanning to move scans off of laptops or one-off VMs. Resist the urge to build the perfect scanning platform before scaling up.
- Incorporating scans into the SDLC. Some teams set up recurring scans (e.g., daily, weekly) or add scanning into their CI pipelines.
- Testing and experimenting with the available models (Copilot offers Claude, GPT, and Gemini families) to find what works best for each target.