Give it a URL. It maps every user journey.
graph TB
subgraph Input
URL[/"URL: abc.com"/]
end
subgraph BrowserLayer["Browser Layer (Decoupled)"]
direction TB
BF["Browser Factory"]
PW["Playwright<br/>(default)"]
CF["Camoufox<br/>(stealth)"]
RC["Real Chrome<br/>(persistent)"]
BF --> PW & CF & RC
end
subgraph Phase1["PHASE 1: DISCOVERY (LLM-Powered)"]
direction TB
subgraph PageAnalysis["Page Analysis Pipeline"]
direction TB
SS["Screenshot + DOM + A11y Tree"]
NM["Network Monitor<br/>(AJAX/Fetch calls)"]
VLM["Claude Vision Analysis"]
AP["Action Plan:<br/>Semantic actions + patterns"]
SS & NM --> VLM --> AP
end
subgraph ActionClassifier["Action Classifier"]
direction TB
NAV["Navigation"]
INT["Interaction"]
PAT["Pattern Instance<br/>(sample 1 of N)"]
end
subgraph Orchestrator["Orchestrator"]
direction TB
JQ["Journey Queue (BFS)"]
FP["State Fingerprinting"]
DD["Dedup / Loop Detection"]
JG["Journey Graph Builder"]
JQ --> FP --> DD --> JG
end
AP --> ActionClassifier
ActionClassifier --> JQ
end
subgraph Persistence["Journey Persistence"]
direction TB
EX["Graph to Replay Exporter"]
JF["Journey JSON Files<br/>(one per journey)"]
DM["discovery-meta.json"]
EX --> JF & DM
end
subgraph Phase2["PHASE 2: REPLAY (LLM-Free)"]
direction TB
RE["Replay Executor<br/>(mechanical)"]
SC["Screenshot Capture<br/>(every step)"]
RR["Replay Results<br/>(pass/fail + metadata)"]
RE --> SC --> RR
end
subgraph Output["Output"]
direction TB
JT["Journey Tree / Graph"]
MERM["Visual Journey Map<br/>(Mermaid)"]
NAR["Natural Language<br/>Journey Descriptions"]
JSON["Structured JSON Export"]
end
URL --> BF
BF -->|"page object"| SS
BF -->|"page object"| NM
JG --> EX
JG --> Output
JF -->|"consumed by"| RE
BF -->|"page object"| RE
style Phase1 fill:#1a1a2e,stroke:#e94560,color:#fff
style Phase2 fill:#1a1a2e,stroke:#00b894,color:#fff
style BrowserLayer fill:#1a1a2e,stroke:#0f3460,color:#fff
style Persistence fill:#1a1a2e,stroke:#6c5ce7,color:#fff
style Output fill:#1a1a2e,stroke:#e94560,color:#fff
style PageAnalysis fill:#16213e,stroke:#0f3460,color:#fff
style ActionClassifier fill:#16213e,stroke:#16213e,color:#fff
style Orchestrator fill:#16213e,stroke:#533483,color:#fff
What It Does
- Point at a URL — the agent explores the site using Claude Vision, understanding each page like a human would
- Discovers full user journeys — Homepage → Product Listing → Product Detail → Cart → Checkout, all found automatically
- Replays journeys mechanically — no LLM needed for replay, making it fast and cheap for regression testing
- Handles the messy web — cookie banners, shadow DOM, CAPTCHAs, pattern collapse (50 product cards → 1 journey step)
- Captures every API call — during replay, records all XHR/fetch requests per step, flags analytics beacons, and outputs a network-summary.json per journey
- Structured output — journey graphs, Mermaid diagrams, replay-ready JSON files
Quick Start
    git clone https://github.com/apexkid/web-scout-ai && cd web-scout-ai
    pip install -e . && playwright install
    export ANTHROPIC_API_KEY=your-key
    python cli.py auto https://example.com --depth 3
Using conda instead
    conda create -p .conda python=3.11 -y
    .conda/bin/pip install -e . && .conda/bin/playwright install
    export ANTHROPIC_API_KEY=your-key
    .conda/bin/python cli.py auto https://example.com --depth 3
How It Works
Two-phase architecture. Discovery uses an LLM (Claude) to explore the site — it takes screenshots, reads the accessibility tree, and decides what to click. Every action builds a journey graph. Once discovery finishes, the graph is exported as standalone JSON files that can be replayed mechanically, with zero LLM calls.
The Element Reference Pattern. Instead of asking the LLM to generate CSS selectors (which it's bad at), we pre-extract every visible interactive element, number each one, and let the model pick by reference. The LLM says "click element 7", not "click button.sc-1x2f3y4". This eliminates an entire class of selector hallucination bugs.
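A minimal sketch of the idea, not the project's actual code: the `Element` class, the menu format, and the `resolve_reference` helper are all hypothetical names chosen for illustration. The model only ever sees numbers, and the selector is looked up from a table that was built before the prompt was sent.

```python
import re
from dataclasses import dataclass


@dataclass
class Element:
    ref: int        # number shown to the model
    role: str       # e.g. "button", "link"
    label: str      # visible text or accessible name
    selector: str   # deterministic selector computed before prompting


def build_element_menu(elements: list[Element]) -> str:
    """Render the numbered list that would be placed in the prompt."""
    return "\n".join(f"[{e.ref}] {e.role}: {e.label}" for e in elements)


def resolve_reference(reply: str, elements: list[Element]) -> Element:
    """Map a reply such as 'click element 7' back to a known element."""
    match = re.search(r"\belement\s+(\d+)\b", reply, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"no element reference in reply: {reply!r}")
    by_ref = {e.ref: e for e in elements}
    ref = int(match.group(1))
    if ref not in by_ref:
        # The model cannot invent a selector; it can only pick an invalid number.
        raise ValueError(f"element {ref} was never offered to the model")
    return by_ref[ref]


elements = [
    Element(1, "link", "Home", "a[data-testid='nav-home']"),
    Element(2, "button", "Add to cart", "button[aria-label='Add to cart']"),
]
menu = build_element_menu(elements)
chosen = resolve_reference("click element 2", elements)
```

With this shape, a bad model reply fails loudly at lookup time instead of producing a selector that silently matches nothing.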
12-strategy selector waterfall. Each numbered element gets a deterministic CSS selector through a waterfall of 12 strategies — data-testid, ARIA labels, structural paths, and more. If the page has 50 identical product cards, the agent detects the pattern, samples one, and collapses the rest into a single journey step.
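The waterfall and the pattern collapse can be sketched roughly as follows. This is an illustration under assumptions: only three of the twelve strategies are shown, and the element dictionaries, the `path` key, the grouping key (tag plus class), and the `min_repeats` threshold are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Optional

Attrs = dict[str, str]
Strategy = Callable[[Attrs], Optional[str]]


def by_test_id(el: Attrs) -> Optional[str]:
    value = el.get("data-testid")
    return f'[data-testid="{value}"]' if value else None


def by_aria_label(el: Attrs) -> Optional[str]:
    value = el.get("aria-label")
    return f'{el["tag"]}[aria-label="{value}"]' if value else None


def by_structural_path(el: Attrs) -> Optional[str]:
    # Assumed to be precomputed, e.g. "main > ul > li:nth-of-type(3) > a".
    return el.get("path")


# Strategies are tried in order; the first one that yields a selector wins.
WATERFALL: list[Strategy] = [by_test_id, by_aria_label, by_structural_path]


def pick_selector(el: Attrs) -> str:
    for strategy in WATERFALL:
        selector = strategy(el)
        if selector:
            return selector
    raise LookupError("no strategy produced a selector")


def collapse_patterns(elements: list[Attrs], min_repeats: int = 3) -> list[Attrs]:
    """Keep one sample from each group of structurally identical elements."""
    groups: dict[tuple[str, str], list[Attrs]] = defaultdict(list)
    for el in elements:
        groups[(el["tag"], el.get("class", ""))].append(el)
    kept: list[Attrs] = []
    for group in groups.values():
        kept.extend(group[:1] if len(group) >= min_repeats else group)
    return kept
```

For example, 50 elements sharing `tag="a"` and `class="product-card"` plus one checkout button would collapse to two entries: one sampled card and the button.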
Use Cases
- QA engineers — auto-discover every reachable user journey, then replay them as regression tests
- Product managers — visualize the actual journey graph of your site, spot dead ends and loops
- Developers — catch broken flows after deploys without writing a single test script
- Security audits — map all reachable states and transitions from an entry point
CLI Reference
Four commands, each with a one-line example followed by its flag table.
capture — Snapshot a single page
python cli.py capture https://example.com
Flags
| Flag | Default | Description |
|---|---|---|
| --engine | playwright | Browser engine to use |
| --no-dismiss-overlays | off | Disable automatic overlay/popup dismissal |
| --dismiss-overlays-llm | off | Use Claude to detect overlays when heuristics fail |
explore — Interactive step-by-step
Human-in-the-loop mode. The agent shows discovered actions, you pick which to take.
python cli.py explore https://example.com
Flags
| Flag | Default | Description |
|---|---|---|
| --engine | playwright | Browser engine to use |
| --no-dismiss-overlays | off | Disable automatic overlay/popup dismissal |
| --dismiss-overlays-llm | off | Use Claude to detect overlays when heuristics fail |
auto — Fully automated BFS discovery
The main command. Explores breadth-first with a live progress tree.
python cli.py auto https://example.com --depth 4 --branches 3 --mermaid
Flags
| Flag | Default | Description |
|---|---|---|
| --depth | 3 | Max BFS depth |
| --branches | 5 | Max branches explored per page |
| --secondary | off | Also explore secondary-priority actions |
| --mermaid | off | Generate a Mermaid flowchart journey map |
| --output PATH | (none) | Additional output path for results.json |
| --engine | playwright | Browser engine to use |
| --no-dismiss-overlays | off | Disable automatic overlay/popup dismissal |
| --dismiss-overlays-llm | off | Use Claude to detect overlays when heuristics fail |
| --resume RUN_DIR | (none) | Resume an interrupted exploration from a previous run directory |
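To make the effect of --depth and --branches concrete, here is a toy breadth-first explorer with fingerprint-based dedup. It is a sketch under assumptions rather than the project's orchestrator: the site is modelled as a plain dictionary of links, and the `fingerprint` function (URL plus sorted link labels) is a hypothetical stand-in for whatever state fingerprinting the agent really uses.

```python
import hashlib
from collections import deque


def fingerprint(url: str, element_labels: list[str]) -> str:
    """Hash the URL plus sorted labels (an assumed notion of page state)."""
    payload = url + "|" + "|".join(sorted(element_labels))
    return hashlib.sha256(payload.encode()).hexdigest()[:16]


def explore(start: str, site: dict[str, list[str]],
            depth: int = 3, branches: int = 5) -> list[tuple[str, str]]:
    """BFS over a toy site map {url: [linked urls]}; returns discovered edges."""
    seen = {fingerprint(start, site.get(start, []))}
    queue = deque([(start, 0)])
    edges: list[tuple[str, str]] = []
    while queue:
        url, level = queue.popleft()
        if level >= depth:
            continue  # depth limit reached: do not expand this page
        for target in site.get(url, [])[:branches]:  # cap branches per page
            edges.append((url, target))
            fp = fingerprint(target, site.get(target, []))
            if fp in seen:
                continue  # known state: keep the edge, skip re-exploring
            seen.add(fp)
            queue.append((target, level + 1))
    return edges


site = {"/": ["/a", "/b", "/c"], "/a": ["/"], "/b": ["/a"]}
edges = explore("/", site, depth=2, branches=2)
```

In this toy run the third link on the home page is never followed because of the branch cap, and the links back to already-seen pages are recorded as edges without being queued again, which is what prevents loops.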
replay — Re-execute saved journeys
No LLM needed. Replays journeys through a real browser and reports pass/fail per step. By default, every XHR/fetch request is captured and written to a network-summary.json per journey — useful for verifying analytics fires, monitoring third-party API calls, and catching broken endpoints after deploys.
    # Replay all journeys for a site
    python cli.py replay all example.com --headed

    # Replay a single journey
    python cli.py replay one output/example.com/auto_20260210_182142/journeys/journey-a-checkout.json
Flags
| Flag | Default | Description |
|---|---|---|
| --headed | off | Run browser in headed (visible) mode |
| --engine | playwright | Browser engine to use |
| --output-dir | output/ | Custom output root for replay results |
| --wait | 1000 | Wait time in ms between steps |
| --viewport | 1280x800 | Viewport size as WIDTHxHEIGHT |
| --parallel N | 1 | Run up to N journeys concurrently |
| --dismiss-overlays | off | Enable overlay dismissal during replay |
| --dismiss-overlays-llm | off | Use Claude to detect overlays during replay |
| --step-retries N | 0 | Retry failed steps up to N times |
| --no-capture-network | off | Disable network (XHR/fetch) capture |
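The per-journey network summary could be assembled along these lines. Everything here is an assumption made for illustration: the shape of the captured records, the output keys, and the `ANALYTICS_HOSTS` list are hypothetical, and the real network-summary.json schema may differ.

```python
from urllib.parse import urlparse

# Hypothetical beacon hosts, for illustration only.
ANALYTICS_HOSTS = {"www.google-analytics.com", "analytics.example.com"}


def summarize_requests(requests: list[dict]) -> dict:
    """Group captured XHR/fetch records by step and flag analytics beacons."""
    steps: dict[int, list[dict]] = {}
    for req in requests:
        if req["resource_type"] not in ("xhr", "fetch"):
            continue  # ignore documents, images, stylesheets, and so on
        host = urlparse(req["url"]).hostname or ""
        steps.setdefault(req["step"], []).append({
            "method": req["method"],
            "url": req["url"],
            "status": req["status"],
            "is_analytics": host in ANALYTICS_HOSTS,
        })
    all_calls = [call for calls in steps.values() for call in calls]
    return {
        "total_requests": len(all_calls),
        "analytics_requests": sum(c["is_analytics"] for c in all_calls),
        "failed_requests": sum(c["status"] >= 400 for c in all_calls),
        "steps": steps,
    }


captured = [
    {"step": 1, "resource_type": "fetch", "method": "GET",
     "url": "https://example.com/api/cart", "status": 200},
    {"step": 1, "resource_type": "image", "method": "GET",
     "url": "https://example.com/logo.png", "status": 200},
    {"step": 2, "resource_type": "xhr", "method": "POST",
     "url": "https://analytics.example.com/collect", "status": 204},
    {"step": 2, "resource_type": "fetch", "method": "POST",
     "url": "https://example.com/api/checkout", "status": 500},
]
summary = summarize_requests(captured)
```

A summary like this makes two of the checks mentioned above mechanical: whether the expected analytics beacon fired on a given step, and whether any endpoint started returning errors after a deploy.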
Output Structure
output/
└── <domain>/
├── auto_YYYYMMDD_HHMMSS/ # Auto-exploration run
│ ├── graph.json
│ ├── results.json
│ ├── journey_map.md
│ ├── checkpoint.json # Deleted on clean completion
│ ├── screenshots/
│ └── journeys/ # Replay-ready journey files
│ ├── discovery-meta.json
│ └── journey-a-*.json
│
├── capture_YYYYMMDD_HHMMSS/ # Page capture
│ ├── meta.json
│ ├── elements.json
│ ├── elements.tsv
│ ├── a11y_tree.txt
│ └── screenshot.png
│
├── explore_YYYYMMDD_HHMMSS/ # Interactive exploration
│ └── (screenshots + session artifacts)
│
└── replay_YYYYMMDD_HHMMSS/ # Replay results
├── replay-summary.json
└── journey-a-*/
├── replay-result.json
├── network-summary.json
└── step-01-*.png
Advanced Topics
Overlay Dismissal
Cookie consent banners, newsletter popups, chat widgets — the agent auto-dismisses them after every navigation. Two tiers:
| Tier | Method | Cost |
|---|---|---|
| Heuristic | Known CMP selectors (OneTrust, CookieBot, etc.), text-matched buttons, high-z-index close buttons | Free |
| LLM-assisted | Claude Haiku analyzes a screenshot to find dismiss targets | ~$0.002/call |
Heuristic is on by default for discovery, off for replay. LLM tier is opt-in via --dismiss-overlays-llm.
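The two tiers amount to a cheap-first fallback, which might look like the sketch below. The function and parameter names are hypothetical, and the two finders are passed in as plain callables so that no real browser or API call is implied.

```python
from typing import Callable, Optional

# A finder returns a selector to click, or None if it found nothing.
Finder = Callable[[], Optional[str]]


def find_dismiss_target(heuristic: Finder, llm: Finder, *,
                        heuristics_enabled: bool = True,
                        llm_enabled: bool = False) -> tuple[Optional[str], str]:
    """Try the free heuristic tier first; fall back to the paid tier only if opted in."""
    if heuristics_enabled:
        target = heuristic()
        if target:
            return target, "heuristic"
    if llm_enabled:
        target = llm()
        if target:
            return target, "llm"
    return None, "none"
```

The defaults mirror discovery mode as described above: heuristics on, LLM tier off. Because the LLM finder is only invoked after the heuristic returns nothing, pages with a well-known consent banner never incur the per-call cost.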
Error Recovery
- LLM API calls: Retried with exponential backoff (up to 4 attempts) on rate limits and server errors
- Page navigation: Retried twice on timeouts and transient network errors
- Click execution: Retried twice on stale elements
- CAPTCHA detection: Cloudflare, reCAPTCHA, hCaptcha, and Turnstile are detected and skipped
- Redirect loop protection: Aborts after 10 redirects
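The LLM retry policy above can be sketched as a small helper. This is an assumed shape, not the project's retry utility: `TransientError` stands in for rate-limit and server errors, the one-second base delay is a guess, and the `sleep` parameter exists only so the helper can be exercised without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a rate-limit or 5xx response."""


def call_with_backoff(fn: Callable[[], T], attempts: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on TransientError with delays of 1x, 2x, 4x base_delay."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Only errors marked as transient are retried; anything else propagates immediately, so a genuine bug is not hidden behind several seconds of backoff.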
Browser Engines
| Engine | --engine value | Description |
|---|---|---|
| Playwright Chromium | playwright | Default headless Chromium |
| Camoufox | camoufox | Stealth Firefox (pip install camoufox) |
| Real Chrome | real_chrome | Persistent profile, always headed; retains cookies and logins |
Checkpoint & Resume
During auto exploration, a checkpoint is saved every 5 nodes. If a run is interrupted, resume it from its run directory:
python cli.py auto --resume output/example.com/auto_YYYYMMDD_HHMMSS/
Exploration continues exactly where it left off. The checkpoint is deleted on clean completion.
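The checkpoint lifecycle can be illustrated with a simplified loop. This sketch makes several assumptions: the checkpoint holds only a list of visited nodes, `run_exploration` and `Interrupted` are hypothetical names, and `fail_at` exists purely to simulate an interruption. Note that this simplified version resumes from the last saved checkpoint, so up to four nodes may be revisited; the real tool may persist more state.

```python
import json
from pathlib import Path
from typing import Optional

CHECKPOINT_EVERY = 5  # interval stated in the text above


class Interrupted(Exception):
    """Simulated interruption, used only for demonstration."""


def run_exploration(nodes: list[str], run_dir: Path,
                    fail_at: Optional[int] = None) -> list[str]:
    """Visit nodes in order, resuming from run_dir/checkpoint.json if present."""
    checkpoint = run_dir / "checkpoint.json"
    if checkpoint.exists():
        visited = json.loads(checkpoint.read_text())["visited"]
    else:
        visited = []
    for node in nodes[len(visited):]:
        if fail_at is not None and len(visited) == fail_at:
            raise Interrupted(f"stopped after {fail_at} nodes")
        visited.append(node)
        if len(visited) % CHECKPOINT_EVERY == 0:
            checkpoint.write_text(json.dumps({"visited": visited}))
    checkpoint.unlink(missing_ok=True)  # clean completion removes the checkpoint
    return visited
```

Calling `run_exploration` a second time with the same `run_dir` after an interruption picks up from the saved list, and the checkpoint file is gone once the run finishes.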
Deep Dive
The core of this agent is the Element Reference Pattern — a technique for grounding vision-language models in the DOM without asking them to generate selectors. It eliminates an entire class of failures that plague browser agents.
Project Structure
src/
analyzer/ # LLM-powered page analysis (screenshot + a11y → actions)
browser/ # Browser engine factory, page capture, action execution
cli/ # Rich live-progress display for the terminal
commands/ # CLI command handlers (capture, explore, auto, replay)
display.py # Display helpers for exploration output
llm/ # Anthropic API client wrapper
models/ # Pydantic data models (actions, graph, journeys)
orchestrator/ # BFS explorer and interactive explorer logic
output/ # JSON/Mermaid export and journey persistence
replay/ # Journey replay executor and output helpers
utils/ # Logging setup and retry helpers
License
MIT