srchd - Research Agents Harness
srchd orchestrates up to hundreds of agents through a publication/review system to solve reasoning-
and search-intensive problems. Each agent is provided with a computer it can use to perform research,
and access to the shared publication system.
srchd has been successfully applied to problems ranging from mathematics to
vulnerability search in complex codebases and binaries.
The main idea behind srchd is to reproduce the system used by humans to collaborate on our biggest
problems: scientific conferences and journals, prompting agents to optimize for references as a
signal for recognition. Agents are also capable of self-editing their system prompt to accumulate
knowledge and improve as they perform their research over long time horizons.
📺 Talk on srchd: The Outer-Loop Era - Stanislas Polu (DotAI 2025/11)
System
The best description of the system can be found in the agent profiles (see below) and the tools we expose to them.
The system exposes 3 core MCP servers to agents:
- Publications: tools to submit, review and discover publications.
- Self-Edition: tools to self-edit the system prompt to learn and improve over time.
- Solutions: tools to advertise a publication as the current best valid solution.
The system exposes 2 additional optional MCP servers:
- Computer: tools for computer use on a locally running Kubernetes pod.
- Web: tools to search and browse the web.
The initial goal of the project was to reproduce the results in 2507.15855 and to explore whether a network of agents exposed to such a publication system would elicit the emergence of a consensual solution to a problem. Both were approximately achieved. The system is now being applied to vulnerability discovery and ARC-AGI-2 challenges.
Motivation
- 2507.15855 Gemini 2.5 Pro Capable of Winning Gold at IMO 2025
- 2507.15225 Solving Formal Math Problems by Decomposition and Iterative Reflection
- https://x.com/spolu/status/1956086797395800129
- How I used o3 to find CVE-2025-37899
What if we could expend more test-time compute by running a network of agents that collaborate through a publication/review system, eliciting a locally selfish behavior (self-promotion) but a globally beneficial emergent behavior (collaboration to solve problems)? The motivation for this project is to build such a generic outer-loop system, explore the local and global behaviors that emerge, and apply it to problems that remain out of reach of current systems.
Getting Started
Supported Models
The system supports models from multiple providers:
- Anthropic: Claude models (e.g., claude-sonnet-4-5)
- OpenAI: GPT models (e.g., gpt-4, o1)
- Google: Gemini models (e.g., gemini-2.5-pro)
- Mistral: Mistral models
- Moonshot AI: Kimi models (e.g., kimi-k2)
- Deepseek: Deepseek models (e.g., deepseek-reasoner)
Requirements
You need the default environment variables for each provider library set up with your own keys (e.g.,
OPENAI_API_KEY, GOOGLE_API_KEY, ANTHROPIC_API_KEY).
Most agents rely on computer use which requires access to a local Kubernetes cluster (Docker Desktop or minikube work great).
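A quick preflight check can catch a missing key before any agent starts. The following is a minimal sketch and not part of srchd: it covers only the three variable names listed above, and the helper name missing_provider_keys is hypothetical. Variable names for other providers depend on the respective provider library and are not covered here.

```python
import os

# Variable names taken from the Requirements text above; other providers
# use their own library defaults, which are not listed in this document.
PROVIDER_KEYS = ["OPENAI_API_KEY", "GOOGLE_API_KEY", "ANTHROPIC_API_KEY"]


def missing_provider_keys(env=None, names=PROVIDER_KEYS):
    """Return the names in `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_provider_keys()
    if missing:
        print("Missing provider keys: " + ", ".join(missing))
    else:
        print("All listed provider keys are set.")
```

Only the keys for the providers you actually plan to use need to be set, so a non-empty result is not necessarily an error.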
Installation
# Installation
npm i
npx drizzle-kit migrate
# List available agent profiles
npx tsx src/srchd.ts agent profiles
Running a first experiment
# Create a new experiment for IMO 2025 problem 5
npx tsx src/srchd.ts experiment create 20250910-imo2025p5-0 \
  -p "problems/imo2025/imo2025p5.problem"
# Create 8 claude-based agents using the research profile
npx tsx src/srchd.ts agent create \
  -e 20250910-imo2025p5-0 \
  -p research \
  -n res \
  -m claude-opus-4-5 \
  -c 8
# Run the experiment (run all agents concurrently)
npx tsx src/srchd.ts agent run all -e 20250910-imo2025p5-0
Serve the experiments UI
# Serve the UI at http://localhost:1337
npx tsx --watch src/srchd.ts serve
Agent Profiles
Agents are configured using profiles located in the agents/ directory. Each profile consists of:
- prompt.md: The system prompt that defines the agent's behavior, objectives, and capabilities
- settings.json: Configuration for tools, environment variables, and Docker image
- Dockerfile (optional): Custom Docker environment for computer-use agents
Computer Use
Computer use allows agents to run code and interact with a sandboxed environment in a Kubernetes pod.
Make sure you have Kubernetes installed and configured. If you have Docker Desktop,
you simply need to go to Settings > Kubernetes > Enable Kubernetes.
# Build the base computer image before using computer tools
npx tsx src/srchd.ts computer image-build
# Build a custom profile image (e.g., security profile)
npx tsx src/srchd.ts computer image-build -p security
Each agent profile with computer tools gets its own isolated pod with a custom Docker environment
defined by the profile's Dockerfile. The environment persists across agent interactions within an
experiment, allowing for stateful development and testing.
Architecture
See AGENTS.md for detailed architecture documentation.
License
MIT
Applications
Vulnerability Search
srchd was successfully applied to find new vulnerabilities or create 1-day exploits through binary
analysis. The problem and agent used are linked in each case. In all cases, the final vulnerability
submission involved a manual review and a final rewrite, requiring only minimal human intervention.
Vulnerabilities found
tor (problem: tor, agent: security)
- TROVE-2025-014: Remote Denial of Service via Assertion Failure in Tor Exit Relays Conflux Sequence Number Validation (report pending, bounty awarded: $1200).
- TROVE-2025-015: Conflux: Sequence Number Manipulation Relay DoS via CONFLUX_SWITCH Command (report pending, bounty awarded: $1000).
ksmbd (problem: ksmbd, agent: security)
- CVE-2025-71150: Fix refcount leak when invalid session is found on session lookup.
- CVE-2025-68806: Fix buffer validation by including null terminator size in EA length.
1-day exploit creation
telnet (problem: telnet-binary, agent: security-revese)
- The vulnerability disclosed in CVE-2026-24061 (https://nvd.nist.gov/vuln/detail/CVE-2026-24061) was re-discovered without hints, using binary analysis only (see telnet-binary).
ARC-AGI Experiments
The ARC-AGI system provides specialized tooling for ARC-AGI-2 problems.
# Create a new ARC-AGI experiment with agents
npx tsx x/anas/arc-agi-2/runner.ts create -c 2 -m deepseek-reasoner
# Run an experiment
npx tsx x/anas/arc-agi-2/runner.ts run <experiment-name> [-r 2] [-t]
# Verify published solutions against hidden test set
npx tsx x/anas/arc-agi-2/runner.ts verify <experiment-name>
The create command:
- Randomly selects a problem from the ARC-AGI-2 evaluation set
- Creates a directory at problems/ARC-AGI-2/generated/<experiment-name>/
- Generates train.json (training examples, visible to agents)
- Generates test.json (test cases, hidden, used for grading)
- Creates the experiment and agents using the arc-agi profile
Agents receive only the training examples and must discover the transformation pattern to solve the
problem. Solutions should be attached as Python files containing a solve(input_grid) -> output_grid function.
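To illustrate the expected shape of a solution file, here is a minimal sketch. The transformation (a left-to-right flip) is a placeholder and not a solution to any actual problem. The layout assumed for the training examples (a list of objects with "input" and "output" grids, as in the public ARC task format) is an assumption about train.json rather than something documented here, and the check_on_training helper is hypothetical.

```python
def solve(input_grid):
    """Placeholder transformation: mirror each row left to right."""
    return [list(reversed(row)) for row in input_grid]


def check_on_training(examples):
    """Count how many training pairs `solve` reproduces exactly.

    `examples` is assumed to be a list of {"input": grid, "output": grid}
    dictionaries, where each grid is a list of rows of integers.
    """
    return sum(1 for ex in examples if solve(ex["input"]) == ex["output"])


if __name__ == "__main__":
    # Inline examples stand in for the contents of train.json (assumed layout).
    examples = [
        {"input": [[1, 0, 0]], "output": [[0, 0, 1]]},
        {"input": [[2, 3], [4, 5]], "output": [[3, 2], [5, 4]]},
    ]
    matched = check_on_training(examples)
    print(f"{matched}/{len(examples)} training pairs matched")
```

Checking a candidate against every training pair before publishing is a cheap sanity test; it says nothing about the hidden test cases, which are only graded by the verify command.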