BrennerBot - NFHN Reader

What's Inside

A research toolkit for applying Brenner's epistemology to your own scientific questions.

Core Workflow

From Question to Conclusion: The Brenner Loop

Research sessions follow a rigorous, reproducible path. Every step is tracked, auditable, and reversible.

Undo / Redo

Every action is reversible. Explore without fear.
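
One common way to make every action reversible is to keep undo and redo stacks of state snapshots. The sketch below illustrates that general technique only; the class name `SnapshotHistory` and its methods are hypothetical and are not taken from BrennerBot's code.

```typescript
// Hypothetical sketch of undo/redo with snapshot stacks (not BrennerBot's real API).
class SnapshotHistory<T> {
  private past: T[] = [];
  private future: T[] = [];
  private present: T;

  constructor(initial: T) {
    this.present = initial;
  }

  current(): T {
    return this.present;
  }

  // Record a new state. Committing discards any pending redo branch.
  commit(next: T): void {
    this.past.push(this.present);
    this.present = next;
    this.future = [];
  }

  // Step back one state. Returns false when there is nothing to undo.
  undo(): boolean {
    const previous = this.past.pop();
    if (previous === undefined) return false;
    this.future.push(this.present);
    this.present = previous;
    return true;
  }

  // Step forward one state. Returns false when there is nothing to redo.
  redo(): boolean {
    const next = this.future.pop();
    if (next === undefined) return false;
    this.past.push(this.present);
    this.present = next;
    return true;
  }
}

// Illustrative usage: two commits, then one undo.
const sessionHistory = new SnapshotHistory<string[]>([]);
sessionHistory.commit(["H1: morphogen gradient"]);
sessionHistory.commit(["H1: morphogen gradient", "H2: timing mechanism"]);
sessionHistory.undo(); // back to the single-hypothesis state
```

Storing whole snapshots keeps the logic simple; a real tool might store inverse operations instead to save memory.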

Session Replay

Reproduce any session exactly for audit and learning.

Error Recovery

Graceful checkpoints when things go wrong.

Multi-Agent Orchestration

Your Research Team: AI Agents That Debate, Challenge, and Synthesize

Each agent has a precise mandate. Together they sharpen hypotheses, design lethal tests, and merge evidence into auditable artifacts - without surrendering control.

"What if you could have Claude, GPT, and Gemini debate your hypothesis - challenging each other until only the strongest ideas survive?"

Debate Mode

Proposition vs opposition with a judge

Best for: Testing hypothesis strength

Probing questions to surface hidden assumptions

Best for: Finding weak links fast

Steelman Contest

Build the strongest case, then dismantle it

Best for: Exploring the hypothesis space

# Start a debate session
brenner session start --thread-id RS-20260105 \
  --format oxford \
  --question "Does the morphogen gradient model explain cell fate?"

# Watch agents debate in real-time
brenner session status --thread-id RS-20260105 --watch

# See the merged artifact
brenner session compile --thread-id RS-20260105

Coordination Visualization

Deterministic Merge

Thread ID: RS-20260106-001 / Ack tracking enabled

Kickoff

Threaded prompt goes to each agent role

Deltas

Structured responses return with citations

Merge

Deterministic compiler reconciles evidence

Human

You decide what ships and what dies
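
A merge is deterministic when the same set of deltas always produces the same artifact, regardless of the order in which agents respond. The sketch below shows one simple way to get that property: sort by a canonical key before assembling. The `Delta` shape and `mergeDeltas` are assumptions made for illustration; the page does not show the real schema or compiler.

```typescript
// Hypothetical delta shape; the real artifact schema is not shown on this page.
interface Delta {
  id: string;    // stable target identifier, e.g. "H1"
  agent: string; // agent role that produced the delta
  text: string;
}

// Plain code-unit comparison, so ordering does not depend on the runtime locale.
function compareStrings(x: string, y: string): number {
  return x < y ? -1 : x > y ? 1 : 0;
}

// Sort by (id, agent, text) so the output is independent of arrival order.
function mergeDeltas(deltas: Delta[]): Delta[] {
  return deltas.slice().sort(
    (a, b) =>
      compareStrings(a.id, b.id) ||
      compareStrings(a.agent, b.agent) ||
      compareStrings(a.text, b.text)
  );
}

// The same three deltas arriving in two different orders.
const arrivalA: Delta[] = [
  { id: "H2", agent: "critic", text: "Timing gate lacks a potency control" },
  { id: "H1", agent: "proposer", text: "Gradient sets fate thresholds" },
  { id: "H1", agent: "critic", text: "Needs a third alternative" },
];
const arrivalB: Delta[] = arrivalA.slice().reverse();

const mergedA = JSON.stringify(mergeDeltas(arrivalA));
const mergedB = JSON.stringify(mergeDeltas(arrivalB));
```

Here `mergedA` and `mergedB` are identical strings, which is the property a human reviewer relies on when comparing compiled artifacts.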

Coordination Without Chaos

Agent Mail keeps every exchange auditable

Every message lands in a thread, every response is acknowledged, and every delta is preserved. You stay in the loop with human approval gates at every step.

Built on Agent Mail with thread IDs, ack receipts, and merge-safe deltas.

Kickoff sent: 3 agents live

Deltas merged: 1 artifact ready

Human approval: Required

Research Hygiene

Built-In Guardrails for Rigorous Science

The system blocks common failure modes: hindsight bias, unfalsifiable hypotheses, ignored confounds, and overconfidence. Rigor is enforced before you waste a week.

Coach Mode

Guided checkpoints, inline explanations, and Brenner quotes as you work.

Beginner → Expert / Contextual feedback

Prediction Lock

Lock outcomes before results arrive to eliminate hindsight bias.

Immutable predictions / Audit trail
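
The idea of a prediction lock can be sketched as a frozen record with a timestamp, plus a check that refuses results observed before the lock. This is a minimal illustration under assumed names (`LockedPrediction`, `lockPrediction`, `recordResult`); the prediction text and dates are invented for the example.

```typescript
// Hypothetical record shape; field names are illustrative only.
interface LockedPrediction {
  readonly hypothesisId: string;
  readonly predictedOutcome: string;
  readonly lockedAt: string; // ISO 8601 timestamp
}

interface ScoredResult {
  hypothesisId: string;
  predictedOutcome: string;
  observedOutcome: string;
  matched: boolean;
}

// Freeze the record so the prediction cannot be edited after the fact.
function lockPrediction(hypothesisId: string, predictedOutcome: string, now: Date): LockedPrediction {
  return Object.freeze({ hypothesisId, predictedOutcome, lockedAt: now.toISOString() });
}

// Reject any result that was observed before the prediction was locked.
function recordResult(lock: LockedPrediction, observedOutcome: string, observedAt: Date): ScoredResult {
  if (observedAt.getTime() < new Date(lock.lockedAt).getTime()) {
    throw new Error("Result predates the locked prediction");
  }
  return {
    hypothesisId: lock.hypothesisId,
    predictedOutcome: lock.predictedOutcome,
    observedOutcome,
    matched: observedOutcome === lock.predictedOutcome,
  };
}

// Illustrative usage with made-up values.
const lockedPrediction = lockPrediction("H1", "fate boundary shifts", new Date("2026-01-05T10:00:00Z"));
const scoredResult = recordResult(lockedPrediction, "fate boundary shifts", new Date("2026-01-05T12:00:00Z"));
```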

Calibration Tracking

Brier score, overconfidence bias, and domain-level accuracy trends.

Confidence scorecard / Bias alerts
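
For reference, the Brier score of N binary forecasts is the mean of (p_i - o_i)^2, where p_i is the stated probability and o_i is 1 if the event occurred and 0 otherwise; lower is better. The sketch below computes it together with one simple overconfidence measure (mean stated probability minus observed frequency). The function names and the sample forecasts are assumptions for illustration; how BrennerBot defines its bias metric is not stated here.

```typescript
interface Forecast {
  probability: number; // stated probability that the prediction holds, in [0, 1]
  occurred: boolean;   // whether it actually held
}

// Brier score: mean squared difference between probability and outcome (0 or 1).
function brierScore(forecasts: Forecast[]): number {
  if (forecasts.length === 0) throw new Error("Need at least one forecast");
  const total = forecasts.reduce((sum, f) => {
    const diff = f.probability - (f.occurred ? 1 : 0);
    return sum + diff * diff;
  }, 0);
  return total / forecasts.length;
}

// One simple bias measure: mean stated probability minus observed hit rate.
// Positive values mean the forecaster claimed more than reality delivered.
function overconfidence(forecasts: Forecast[]): number {
  if (forecasts.length === 0) throw new Error("Need at least one forecast");
  const meanProbability = forecasts.reduce((s, f) => s + f.probability, 0) / forecasts.length;
  const hitRate = forecasts.filter((f) => f.occurred).length / forecasts.length;
  return meanProbability - hitRate;
}

// Made-up sample data.
const sampleForecasts: Forecast[] = [
  { probability: 0.9, occurred: true },
  { probability: 0.8, occurred: false },
  { probability: 0.7, occurred: true },
  { probability: 0.6, occurred: false },
];
const sampleBrier = brierScore(sampleForecasts);    // about 0.275
const sampleBias = overconfidence(sampleForecasts); // about 0.25
```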

Confound Detection

Domain-specific confounds flagged with targeted prompting questions.

8 research domains / Automatic prompts

Artifact Linting

50+ rules enforcing third alternatives, potency controls, and citation hygiene.

Structural checks / Citation validation

Prediction Lock Timeline

No hindsight

Confound Detection

8 domains

Psychology, Epidemiology, Economics, Biology, Sociology, Neuroscience, Computer Science, General

Selection bias detected - how will you ensure random sampling?

Reverse causation possible - can you establish temporal order?

Calibration + Linting

Scorecard

Calibration curve (last 10 tests)

Third alternative present: Pass

Potency control defined: Pass

Citation anchors: Review

Discovery & Intelligence

Intelligence Built In: Search, Simulate, Score

Connect to prior work instantly, model evidence impact before you test, and track which hypotheses survive pressure. This is research intelligence, not a chat log.

Hypothesis Similarity Search

Find related work across sessions with offline embeddings and clusters.

Client-side only / Duplicate detection

What-If Scenarios

Simulate outcomes before running tests and prioritize high-impact experiments.

Info gain ranked / Scenario builder

Robustness Scoring

Evidence-weighted survival scores reveal fragile vs battle-tested ideas.

Support vs challenge / Robustness meter

Anomaly Detection

Track contradictions and spawn new hypotheses instead of burying them.

Anomaly register / Paradigm alerts

Query: "morphogen gradient cell fate"

Morphogen gradient (RS-20251230): 82%

Statement 0.8 / Mechanism 0.6 / Domain 0.9

Timing gate model (RS-20250112): 71%

Statement 0.7 / Mechanism 0.5 / Domain 0.8

Signal relay chain (RS-20241018): 64%

Statement 0.6 / Mechanism 0.4 / Domain 0.9

Runs entirely client-side - your hypotheses never leave your machine.
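
Embedding-based search of this kind is commonly ranked by cosine similarity between vectors, which needs no server round trip. The sketch below shows that general technique with toy three-dimensional vectors. It does not reproduce the percentages above, and how the Statement, Mechanism, and Domain scores are combined is not stated on this page; all names and numbers here are illustrative.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  if (normA === 0 || normB === 0) return 0;
  return dot / (normA * normB);
}

// Toy embeddings standing in for locally computed hypothesis vectors.
const queryVector = [0.9, 0.1, 0.3];
const storedHypotheses = [
  { id: "B", vector: [0.1, 0.9, 0.2] },
  { id: "A", vector: [0.8, 0.2, 0.4] },
];

// Score every stored hypothesis and sort from most to least similar.
const rankedMatches = storedHypotheses
  .map((h) => ({ id: h.id, score: cosineSimilarity(queryVector, h.vector) }))
  .sort((x, y) => y.score - x.score);
```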

What-If Scenario

Info gain

Starting confidence: 60%

Expected information gain: 0.42

Best next test: Perturb gradient + checkpoint timing
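
One standard way to rank candidate tests is expected information gain: the prior entropy of the hypothesis minus the expected entropy after seeing the test outcome. The sketch below works this out for a single binary hypothesis and a test with two outcomes. It is an assumption about the kind of calculation involved, not the tool's actual model, and it does not attempt to reproduce the 0.42 figure above; the likelihood values are invented.

```typescript
// Entropy in bits of a binary variable with probability p.
function binaryEntropy(p: number): number {
  if (p <= 0 || p >= 1) return 0;
  return -(p * Math.log(p) + (1 - p) * Math.log(1 - p)) / Math.LN2;
}

// Expected information gain (bits) about H from a test with a positive/negative outcome.
//   prior:      P(H)
//   pPosIfTrue: P(positive | H)
//   pPosIfFalse: P(positive | not H)
function expectedInfoGain(prior: number, pPosIfTrue: number, pPosIfFalse: number): number {
  const pPos = prior * pPosIfTrue + (1 - prior) * pPosIfFalse;
  const pNeg = 1 - pPos;
  // Bayes' rule for each outcome; fall back to the prior if an outcome is impossible.
  const posteriorIfPos = pPos > 0 ? (prior * pPosIfTrue) / pPos : prior;
  const posteriorIfNeg = pNeg > 0 ? (prior * (1 - pPosIfTrue)) / pNeg : prior;
  const expectedPosteriorEntropy =
    pPos * binaryEntropy(posteriorIfPos) + pNeg * binaryEntropy(posteriorIfNeg);
  return binaryEntropy(prior) - expectedPosteriorEntropy;
}

// Starting confidence of 60% with two hypothetical tests.
const sharpTestGain = expectedInfoGain(0.6, 0.9, 0.2);   // outcome depends strongly on H
const uselessTestGain = expectedInfoGain(0.6, 0.5, 0.5); // outcome is independent of H
```

A test whose outcome is equally likely whether or not the hypothesis is true yields zero expected gain, which is why discriminating tests rank higher.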

H1: Morphogen gradient (72%)

3 supporting / 1 challenging (survived)

H2: Timing mechanism (35%)

1 supporting / 2 inconclusive

Anomaly Register

Quarantine

X-001 (Active)

Oscillating fate markers

Conflicts with H1 + H2

X-014 (Deferred)

Late-stage inversion

Waiting on potency control

Deep Dive

The Operator Algebra: Brenner's Methods as Executable Code

Sydney Brenner's breakthrough wasn't just his discoveries - it was his method. We've encoded his cognitive patterns as composable operators that you can apply systematically.

The Brenner Method in 4 Steps

Split the levels

Separate the 'what' from the 'how'

Design killing tests

Find experiments that eliminate possibilities

Choose your system

Pick the easiest organism/model to test with

Check the physics

Make sure it's physically possible
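
As a concrete instance of the last step, a physics check often reduces to comparing a characteristic timescale against the time the process actually has. The sketch below uses the one-dimensional diffusion estimate t ≈ L² / (2D), which follows from the mean squared displacement ⟨x²⟩ = 2Dt. The function names and the numbers (a 100 µm field, D = 10 µm²/s, a one-hour window) are illustrative assumptions, not measurements or part of the tool.

```typescript
// Time (seconds) for diffusion to spread over a distance, 1D estimate: t = L^2 / (2D).
//   distanceMicrons: length scale L in micrometres
//   diffusionCoeff:  D in square micrometres per second
function diffusionTimeSeconds(distanceMicrons: number, diffusionCoeff: number): number {
  if (distanceMicrons < 0 || diffusionCoeff <= 0) {
    throw new Error("Distance must be non-negative and D must be positive");
  }
  return (distanceMicrons * distanceMicrons) / (2 * diffusionCoeff);
}

// A mechanism passes this check if diffusion can act within the available window.
function isFastEnough(distanceMicrons: number, diffusionCoeff: number, windowSeconds: number): boolean {
  return diffusionTimeSeconds(distanceMicrons, diffusionCoeff) <= windowSeconds;
}

// Illustrative values only.
const spreadTime = diffusionTimeSeconds(100, 10);   // 100^2 / (2 * 10) = 500 s
const plausibleAtTissueScale = isFastEnough(100, 10, 3600);
const plausibleAtLargeScale = isFastEnough(1000, 10, 3600); // 50,000 s needed
```

Because the time grows with the square of the distance, a tenfold larger field needs a hundredfold longer, which is the kind of constraint this step is meant to surface early.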

Want the precise notation? See the operators below.

⊘

Level-Split

"Separate program from interpreter"

Message vs machine, genotype vs phenotype. Includes the 'chastity vs impotence' diagnostic.

Template

"What is the information? What is the mechanism?"

✂

Exclusion-Test

"Design tests that eliminate, not confirm"

Forbidden patterns: what cannot occur if H is true. Rated by discriminative power.

Template

"If H1 is true, we should NEVER see..."

⟂

Object-Transpose

"Change the system until the test is easy"

Choose organism or model strategically. The experimental object is a design variable.

Template

"What system would make this test cheap and unambiguous?"

⊞

Scale-Check

"Stay imprisoned in physics"

Validate against physical constraints. Calculate timescales, length scales, energy scales.

Template

"Is this physically possible at the relevant scale?"

The Core Composition

(⌂ ∘ ✂ ∘ ≡ ∘ ⊘) powered by (↑ ∘ ⟂ ∘ 🔧) constrained by (⊞) kept honest by (ΔE ∘ †)

- Start from a paradox (◊), split levels (⊘), extract invariants (≡)

- Design exclusion tests (✂), materialize as decision procedure (⌂)

- Power by amplification (↑) in well-chosen system (⟂) you build yourself (🔧)

- Constrain by physics (⊞), keep honest with exception handling (ΔE) and theory killing (†)

Extended Operators (6 more patterns)

↑

Amplify

Use selection, dominance, regime switches

◊

Paradox-hunt

Use contradictions as beacons

⊕

Cross-domain

Import tools from other fields

∿

Dephase

Work out of phase with fashion

†

Theory-kill

Drop hypotheses when the world says no

⌂

Materialize

What would I see if this were true?

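import { pipe } from "@/lib/brenner-loop/operators/framework";

// levelSplit, invariantExtract, exclusionTest, and materialize are the operator
// functions; their import lines are not shown in this snippet.
const brennerPipeline = pipe(
  levelSplit,        // ⊘ Separate levels
  invariantExtract,  // ≡ Find what survives
  exclusionTest,     // ✂ Design killing experiments
  materialize,       // ⌂ Compile to decision procedure
);

// Operators run in the order listed, starting with levelSplit.
const result = brennerPipeline(hypothesis, context);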
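
To make the ordering concrete, here is a minimal sketch of how a `pipe` helper of this shape could work. It is an assumption for illustration, not the implementation in the framework module; `pipeSketch`, `DraftArtifact`, `RunContext`, and the stub operators are hypothetical names.

```typescript
// An operator takes the current state plus shared context and returns a new state.
type Operator<S, C> = (state: S, context: C) => S;

// Left-to-right composition: the first operator listed runs first.
function pipeSketch<S, C>(...operators: Operator<S, C>[]): Operator<S, C> {
  return (initial: S, context: C): S =>
    operators.reduce<S>((state, operator) => operator(state, context), initial);
}

interface DraftArtifact {
  steps: string[];
}
interface RunContext {
  domain: string;
}

// Stub operator that only records that it ran, without mutating its input.
const stubOperator = (label: string): Operator<DraftArtifact, RunContext> =>
  (draft, context) => ({ steps: draft.steps.concat(label + "@" + context.domain) });

const sketchPipeline = pipeSketch(
  stubOperator("levelSplit"),
  stubOperator("invariantExtract"),
  stubOperator("exclusionTest"),
  stubOperator("materialize"),
);

const sketchResult = sketchPipeline({ steps: [] }, { domain: "biology" });
```

Reading the composition (⌂ ∘ ✂ ∘ ≡ ∘ ⊘) right to left gives the same order as the pipeline arguments read left to right: level-split first, materialize last.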

“I think many fields of science could do a great deal better if they went back to the classical approach of studying a problem, rather than following the latest fashion.”
