AgentPitch · LLM Soccer Simulation

Open Source · Python 3.11+ · Apache 2.0

LLM-powered soccer simulation where every player on the field is an AI agent running a decide() callback — generated, sandboxed, and evolved by large language models.

Python · JavaScript · Rust · OpenAI · Anthropic · Gemini · Custom

What AgentPitch is

A complete soccer simulation engine

LLM Provider Types

Native

OpenAI

gpt-4o · o1 · o3 and latest

Native

Anthropic

claude-3.5 · claude-4 series

Native

Gemini

gemini-1.5 · gemini-2 · flash

OpenAI-Compatible

Custom

DeepSeek · OpenRouter · Ollama · any compatible endpoint

Strategy Language + Sandbox (each language ships its own runtime)

Sandbox

RestrictedPython

Built-in · no install · whitelist builtins

Sandbox

QuickJS

Embedded C engine · pip install .[js]

Sandbox

Wasmtime

Compiles Rust → WASM · pip install .[wasm]

Formation Config (fully configurable)

5 v 5 (default) · 11 v 11 (preset)

Any N v N — set players_per_team: N in YAML config.
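A minimal sketch of how a loader might validate this setting. Only the `players_per_team` key comes from the text above; the `MatchConfig` class, the `load_match_config` helper, and the fallback of 5 (read off the "5 v 5 (default)" label) are illustrative assumptions, not AgentPitch's actual config schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchConfig:
    """Hypothetical container for the formation setting."""
    players_per_team: int = 5  # assumed default, based on the 5 v 5 label

    def __post_init__(self) -> None:
        if not isinstance(self.players_per_team, int) or self.players_per_team < 1:
            raise ValueError("players_per_team must be a positive integer")


def load_match_config(raw: dict) -> MatchConfig:
    """Build a MatchConfig from an already-parsed YAML mapping."""
    return MatchConfig(players_per_team=raw.get("players_per_team", 5))


config = load_match_config({"players_per_team": 11})
print(config.players_per_team)
```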

Core Design · The decide() Interface

Every player is a single function

The decide() interface

Inputs (per tick)

game_state

Field positions · Ball location
Scores · Phase · Tick number

player_state

Position · Speed · Role
Stamina · Player ID

history

Recent actions · Last 5 decisions
Reward signal from PMEP

LLM-Generated Code

decide(game_state, player_state, history)

Python · JavaScript · Rust

Runs in sandbox · 100ms timeout

Compiled + cached between ticks

Returns (Action)

Move → dx, dy vector

Pass → target_player_id

Shoot → goal direction

Tackle → opponent_id

Hold → stay in place

def decide(game_state, player_state, history) -> Action:  # Generated by LLM · evolved by PMEP after each match
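To make the interface concrete, here is a hand-written sketch of what a generated strategy could look like. The signature matches the one above; everything else is an assumption for illustration: the `Action` dataclass, the dictionary keys (`ball`, `ball_owner`, `id`, `position`), and the choice to return a unit direction vector are hypothetical, not AgentPitch's real schema.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical action record: a kind plus kind-specific parameters."""
    kind: str  # "move", "pass", "shoot", "tackle" or "hold"
    params: dict = field(default_factory=dict)


def decide(game_state: dict, player_state: dict, history: list) -> Action:
    """Shoot when in possession, otherwise run toward the ball."""
    if game_state.get("ball_owner") == player_state["id"]:
        return Action("shoot")

    px, py = player_state["position"]
    bx, by = game_state["ball"]
    dx, dy = bx - px, by - py
    distance = math.hypot(dx, dy)
    if distance < 1e-9:
        return Action("hold")  # already on the ball's position

    # Unit direction vector; speed limits are left to the engine's validation.
    return Action("move", {"dx": dx / distance, "dy": dy / distance})


action = decide(
    {"ball": (3.0, 4.0), "ball_owner": None},
    {"id": 7, "position": (0.0, 0.0)},
    [],
)
print(action)
```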

How a match runs

Config → code → sandbox → play → evolve

01 · Config

YAML / API

Teams · LLM providers · Match settings

02 · CGP

Code Gen Pipeline

LLM writes decide() · Jinja2 prompt · 3 compile retries

03 · Sandbox

Compile + Cache

RestrictedPy · QuickJS · Wasmtime · 100ms timeout

04 · TickEngine

Match Simulation

Snapshot → Execute → Resolve → Physics → Log

05 · PMEP

Post-Match Evolution

LLM improves strategy from match log · top 5 events

evolved strategy fed back into next match
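The "3 compile retries" step in stage 02 can be sketched as a small loop. This is an illustration under stated assumptions: `ask_llm` is a hypothetical callable standing in for the provider call, the retry count is read as one initial attempt plus three retries, and the check uses Python's built-in `compile()` for syntax only. The real pipeline's prompts and error feedback are not shown in the source.

```python
from typing import Callable

MAX_COMPILE_RETRIES = 3  # from the "3 compile retries" note


def generate_strategy(ask_llm: Callable[[str], str], prompt: str) -> str:
    """Ask for decide() source until it compiles, or give up after the retries."""
    feedback = ""
    for _attempt in range(1 + MAX_COMPILE_RETRIES):
        source = ask_llm(prompt + feedback)
        try:
            compile(source, "<strategy>", "exec")  # syntax check only; nothing runs
        except SyntaxError as exc:
            # Assumed behaviour: feed the compiler error back into the next prompt.
            feedback = f"\n\nPrevious attempt failed to compile: {exc.msg}"
            continue
        return source
    raise RuntimeError("no compilable strategy after retries")


# Stub "LLM" that fails once, then returns valid code.
replies = iter([
    "def decide(:",
    "def decide(game_state, player_state, history):\n    return None\n",
])
source = generate_strategy(lambda _prompt: next(replies), "Write decide().")
print(source)
```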

AgentPitch · Architecture

Act II

Under the Hood

Four clean layers. Hard boundaries between the API surface and the simulation engine. Real sandboxes that protect the host from arbitrary LLM-generated code.

Layer Architecture

Four layers · dependency: foundation → api · upper imports lower

04
TOP

API Layer

FastAPI (HTTP server) · React (browser UI) · SSE (server-sent events, live stream) · Pydantic (independent API models)

03

Orchestration Layer

TE (TickEngine) · ARE (ActionResolutionEngine) · CLI (command-line runners: season / cup / league)

02

Core Layer

GSM (GameStateManager) · PMS (PlayerMovementSystem) · BPS (BallPhysicsSystem) · MLS (MatchLogSystem)

01
BASE

Foundation Layer

PAL (Provider Abstraction Layer) · SF (SandboxFactory) · CGP (Code Generation Pipeline) · PMEP (Post-Match Evolution Pipeline) · ARE (ActionResolutionEngine) · GSS (Game State Schema)

Upper layers import lower — not the reverse
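The import rule can be expressed as a simple rank check. The sketch below is hypothetical tooling, not part of AgentPitch: it ranks the four layers as numbered above and flags any dependency where a lower layer imports a higher one. Same-layer imports are assumed to be allowed.

```python
# Layer numbers as shown above: 01 Foundation up to 04 API.
LAYER_RANK = {"foundation": 1, "core": 2, "orchestration": 3, "api": 4}


def illegal_imports(imports: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return (importer, imported) pairs where a lower layer imports a higher one."""
    return [(src, dst) for src, dst in imports if LAYER_RANK[src] < LAYER_RANK[dst]]


edges = [
    ("api", "orchestration"),   # allowed: upper imports lower
    ("orchestration", "core"),  # allowed
    ("core", "api"),            # violation: lower imports upper
]
print(illegal_imports(edges))
```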

TickEngine · Per-Tick Resolution Pipeline

What happens inside every single game tick

7 tick phases

Snapshot Collection

Capture full game state from GSM — positions, scores, ball, phase, tick index.

GSM

Action Generation

Invoke each player's decide() in sandbox. Failures routed to FallbackHandler.

Sandbox

Validation & Normalization

Cooldown checks, Move speed clamping, Pass/Shoot power capping, Tackle target validation.

Validate

Player Movement

Compute-all-then-commit. PMS resolves Move actions, player separation, dribble contests.

PMS

Ball Actions

Pass & Shoot resolution. Set ball velocity, landing zone, skill-based deviation, transfer possession.

Ball

Tackle Resolution

Range check, possession verification, strength-based success probability per Tackle action.

Contest

Ball Physics & Goal

BPS advances ball. Goal-line crossing, scoring, and goalkeeper save attempts resolved.

BPS · Goal

Phases execute sequentially · results merged into action_records dict · logged to MatchLogSystem · ~100ms timeout per decide()
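Three of the phases above (action generation with fallback, Move speed clamping, and compute-all-then-commit movement) can be sketched in a few functions. All names and values here are assumptions for illustration: the `Action` shape, `MAX_SPEED`, and a fallback that simply holds position are hypothetical and not taken from the engine's source.

```python
import math
from dataclasses import dataclass, field

MAX_SPEED = 1.0  # hypothetical per-tick movement cap


@dataclass
class Action:
    kind: str
    params: dict = field(default_factory=dict)


def generate_action(decide, game_state, player_state, history) -> Action:
    """Action Generation: call decide(); any failure falls back to Hold."""
    try:
        action = decide(game_state, player_state, history)
    except Exception:
        return Action("hold")  # stand-in for FallbackHandler
    return action if isinstance(action, Action) else Action("hold")


def normalize(action: Action) -> Action:
    """Validation & Normalization: clamp a Move vector to MAX_SPEED."""
    if action.kind != "move":
        return action
    dx, dy = action.params.get("dx", 0.0), action.params.get("dy", 0.0)
    speed = math.hypot(dx, dy)
    if speed > MAX_SPEED:
        scale = MAX_SPEED / speed
        dx, dy = dx * scale, dy * scale
    return Action("move", {"dx": dx, "dy": dy})


def commit_moves(positions: dict, actions: dict) -> dict:
    """Player Movement: compute every new position from the old snapshot,
    then return them together so no player sees a half-updated field."""
    pending = {}
    for pid, (x, y) in positions.items():
        act = actions[pid]
        if act.kind == "move":
            pending[pid] = (x + act.params["dx"], y + act.params["dy"])
        else:
            pending[pid] = (x, y)
    return pending


def broken_decide(game_state, player_state, history):
    raise RuntimeError("strategy crashed")


actions = {
    1: normalize(Action("move", {"dx": 3.0, "dy": 4.0})),
    2: generate_action(broken_decide, {}, {}, []),
}
print(commit_moves({1: (0.0, 0.0), 2: (5.0, 5.0)}, actions))
```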

Strategy Runtime · Language & Provider Support

Write code · run in a sandbox · evolved after every match

Three strategy languages

Language | Sandbox Backend | Security Model
Python | RestrictedPython | Whitelist builtins · no imports · exec isolated
JavaScript | QuickJS (embedded C engine) | Isolated JS runtime · no Node APIs
Rust → WASM | Wasmtime | WASM sandbox · memory-isolated · AOT compiled

LLM Providers (via PAL — Provider Abstraction Layer)

openai/

OpenAI

gpt-4o · o1 · o3

anthropic/

Anthropic

claude-3.5 · claude-4

google/

Gemini

gemini-1.5 · gemini-2

deepseek/

DeepSeek

deepseek-v3 · v4

openrouter/

OpenRouter

Any hosted model

local/

Ollama

Fully offline · no API key
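Model identifiers elsewhere on this page take the form `provider/model` (for example `openai/gpt-4o`). A minimal sketch of how such an identifier might be split before it is handed to a provider client; the `split_model_id` helper and its error handling are assumptions, not PAL's actual API.

```python
# Prefixes as listed above.
KNOWN_PREFIXES = {"openai", "anthropic", "google", "deepseek", "openrouter", "local"}


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model' on the first slash and validate the prefix."""
    provider, sep, model = model_id.partition("/")
    if not sep or not model or provider not in KNOWN_PREFIXES:
        raise ValueError(f"expected '<provider>/<model>', got {model_id!r}")
    return provider, model


print(split_model_id("openai/gpt-4o"))
print(split_model_id("deepseek/deepseek-v4-flash"))
```

Splitting on the first slash only keeps any further slashes inside the model name, which matters for aggregators whose model names are themselves namespaced.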

Safety Guarantee

LLM-generated code never touches the host filesystem, network, or OS — it lives entirely inside the sandbox with a 100ms execution budget per tick.
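The "whitelist builtins · no imports" idea can be illustrated with plain `exec`. This is a teaching sketch only and is not a secure sandbox: restricting `__builtins__` alone is known to be escapable, it enforces no time budget, and it is not how RestrictedPython works internally. The `SAFE_BUILTINS` set and the `load_decide` helper are hypothetical.

```python
# Hypothetical whitelist; a real sandbox would choose this set carefully.
SAFE_BUILTINS = {"abs": abs, "min": min, "max": max, "len": len, "range": range, "sum": sum}


def load_decide(source: str):
    """Run strategy source with only whitelisted builtins and return decide()."""
    namespace = {"__builtins__": SAFE_BUILTINS}
    exec(compile(source, "<strategy>", "exec"), namespace)
    return namespace["decide"]


strategy = "def decide(game_state, player_state, history):\n    return max(0, len(history))\n"
decide = load_decide(strategy)
print(decide({}, {}, [1, 2, 3]))

# With __import__ missing from the builtins, an import statement fails.
blocked = "import os\ndef decide(game_state, player_state, history):\n    return os.getcwd()\n"
try:
    load_decide(blocked)
except ImportError as exc:
    print("blocked:", exc)
```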

Browser UI · Live Match Viewer

Real-time simulation · browser on port 8765

Live match viewer

Field · Live View · SVG canvas · real-time

5v5 field in SVG. Player positions and ball updated every tick via SSE stream.

Event Feed · Color-coded by type

Goal · Shot · Pass · Tackle events in scrollable log. Click any event to scrub to that tick.

Stats Panel · Post-match breakdown

Possession · shots · passes · tackles per player. Tabular monospace numerics.
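The viewer is described as receiving per-tick updates over SSE. The sketch below shows only the standard `text/event-stream` framing for one message; the event name `tick` and the payload fields are hypothetical, since the source does not show the actual stream schema.

```python
import json


def sse_event(event: str, payload: dict) -> str:
    """Frame one message in the text/event-stream wire format."""
    data = json.dumps(payload, separators=(",", ":"))
    # An event ends with a blank line; 'event:' names it, 'data:' carries the body.
    return f"event: {event}\ndata: {data}\n\n"


frame = sse_event("tick", {"tick": 42, "ball": [10.5, 3.0]})
print(frame, end="")
```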

AgentPitch · Tournament Modes

Act III

Tournaments

Three structured formats for AI competition. Every match feeds back into strategy evolution — the longer the tournament, the smarter the agents become.

Arena Mode · Head-to-Head Evolution

Arena · two LLMs · configurable match series · default 3 matches · one evolving rivalry

LLM A vs LLM B · evolving across matches

Team A

deepseek/deepseek-v4-flash · Python strategy

Team B

openai/gpt-4o · Python strategy

Start · CGP

Initial Strategies

LLM generates decide() for both

Match 1

First encounter

A 2 — 1 B

Strategy v1

PMEP · Evolve

Post-Match Evolution

Both strategies improved from log

Match 2

Evolved play

A 3 — 1 B

Strategy v2

Match 3

Final — evolved²

A 2 — 0 B

Strategy v3
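The arena flow above (generate, play, evolve, repeat) can be sketched as a short loop. The three callables are hypothetical stand-ins for CGP, the match engine, and PMEP; the default of 3 matches comes from the text, while skipping evolution after the final match is an assumption of this sketch.

```python
def run_arena(generate, play, evolve, matches: int = 3) -> list:
    """Play a series in which both strategies evolve between matches."""
    strategies = {"A": generate("A"), "B": generate("B")}
    scores = []
    for index in range(matches):
        score, log = play(strategies["A"], strategies["B"])
        scores.append(score)
        if index < matches - 1:  # assumed: nothing to evolve for after the last match
            strategies = {team: evolve(code, log) for team, code in strategies.items()}
    return scores


# Stubs that only track strategy versions.
def generate(team):
    return {"team": team, "version": 1}


def evolve(strategy, log):
    return {**strategy, "version": strategy["version"] + 1}


versions_played = []


def play(a, b):
    versions_played.append((a["version"], b["version"]))
    return (0, 0), ["kickoff"]


scores = run_arena(generate, play, evolve)
print(versions_played)
```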

Tournament Formats · Cup & League

Cup Mode · Single-Elimination Bracket

Team A · OpenAI gpt-4o
Team B · Anthropic claude
Team C · Gemini 1.5
Team D · DeepSeek v4

Semi-final 1 · A wins 2-1
Semi-final 2 · C wins 3-2
Final · A champion

CLI: agent-pitch cup-run
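A single-elimination bracket reduces to pairing adjacent teams and advancing the winners. The sketch below is illustrative only: `run_cup` and the `play` callable (which returns the winner's name) are hypothetical, and it assumes a power-of-two team count with no byes. It does not describe what `agent-pitch cup-run` does internally.

```python
def run_cup(teams: list[str], play) -> str:
    """Advance winners round by round until one team remains."""
    if len(teams) < 2 or len(teams) & (len(teams) - 1):
        raise ValueError("this sketch needs a power-of-two number of teams")
    round_teams = list(teams)
    while len(round_teams) > 1:
        round_teams = [
            play(round_teams[i], round_teams[i + 1])
            for i in range(0, len(round_teams), 2)
        ]
    return round_teams[0]


# Stub match: the alphabetically first team always wins.
champion = run_cup(["Team A", "Team B", "Team C", "Team D"], lambda a, b: min(a, b))
print(champion)
```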

League Mode · Round-Robin Standings

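# | Team | P | W | D | L | GF | GA | Pts
1 | Alpha | 6 | 4 | 1 | 1 | 14 | 7 | 13
2 | Bravo | 6 | 3 | 2 | 1 | 11 | 8 | 11
3 | Charlie | 6 | 2 | 1 | 3 | 9 | 13 | 7
4 | Delta | 6 | 1 | 0 | 5 | 5 | 11 | 3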

CLI: agent-pitch league-run
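The standings columns can be derived from match results with 3 points for a win and 1 for a draw, which is consistent with the sample table (for instance 4 wins and 1 draw giving 13 points). The tie-break order used below (goal difference, then goals scored, then name) is an assumption; the source does not state one.

```python
from collections import defaultdict


def standings(results: list[tuple[str, int, str, int]]) -> list[tuple[str, dict]]:
    """Build a sorted table from (home, home_goals, away, away_goals) results."""
    table = defaultdict(lambda: {"P": 0, "W": 0, "D": 0, "L": 0, "GF": 0, "GA": 0, "Pts": 0})
    for home, home_goals, away, away_goals in results:
        for team, scored, conceded in (
            (home, home_goals, away_goals),
            (away, away_goals, home_goals),
        ):
            row = table[team]
            row["P"] += 1
            row["GF"] += scored
            row["GA"] += conceded
            if scored > conceded:
                row["W"] += 1
                row["Pts"] += 3
            elif scored == conceded:
                row["D"] += 1
                row["Pts"] += 1
            else:
                row["L"] += 1
    # Assumed tie-breaks: points, goal difference, goals scored, then name.
    return sorted(
        table.items(),
        key=lambda item: (
            -item[1]["Pts"],
            -(item[1]["GF"] - item[1]["GA"]),
            -item[1]["GF"],
            item[0],
        ),
    )


table = standings([
    ("Alpha", 2, "Bravo", 1),
    ("Bravo", 0, "Charlie", 0),
    ("Charlie", 1, "Alpha", 3),
])
for name, row in table:
    print(name, row)
```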

Quick start · links · star us

Get started in 30 seconds

Quick Run · Docker

$ docker run -p 8756:8756 ghcr.io/gangtao/agentpitch:0.1.0

Opens the browser UI on http://localhost:8756 — no install required.