rcarmo/go-joker: A personal twist on the original Clojure interpreter and linter, slightly mad, Go-ing places

An optimized fork of Joker (Clojure-like Lisp interpreter) for inclusion in gi, a self-hosted coding agent.

Performance

[Chart: Cross-language benchmark matrix]

[Chart: vs. original Joker]

Highlights

What	Result
Arithmetic loop via WASM	~0.26 ms — matches Bun/JSC-class speed, >700× faster than original
Recursive fib	~0.96 ms — WASM/IR path, >500× faster than original
Map update loop	~0.90 ms — IR + transient maps, ~19× faster than the previous IR path
Word frequency	~7.7 ms — IR + maps, ~36× faster than original
Joker beats Goja on	arithmetic, tail recursion, recursive fib, pidigits, regex-redux, map-update-style core workloads

What's different from upstream Joker

IR bytecode interpreter (26 opcodes)

Hot loops and functions compile to a flat bytecode that runs in a stack-machine interpreter, avoiding the overhead of tree-walking evaluation, interface dispatch, and per-call allocation.
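As a rough illustration of what "flat bytecode on a stack machine" means, the sketch below runs a small summing loop as a slice of instructions. The opcode names, the `instr` layout, and `sumProgram` are all hypothetical and chosen for this example; they are not the fork's actual 26-opcode set.

```go
package main

import "fmt"

// opcode values are hypothetical; they are not go-joker's actual instruction set.
type opcode byte

const (
	opPush      opcode = iota // push the immediate operand
	opLoad                    // push locals[arg]
	opStore                   // pop into locals[arg]
	opAdd                     // pop b, pop a, push a+b
	opLess                    // pop b, pop a, push 1 if a < b, else 0
	opJumpFalse               // pop; jump to arg if the popped value is 0
	opJump                    // jump to arg unconditionally
	opHalt                    // return the top of the stack
)

type instr struct {
	op  opcode
	arg int64
}

// run executes a flat instruction slice using one operand stack and a
// fixed set of local slots, so the loop body allocates nothing per iteration.
func run(code []instr, nlocals int) int64 {
	locals := make([]int64, nlocals)
	stack := make([]int64, 0, 16)
	for pc := 0; pc < len(code); pc++ {
		in := code[pc]
		n := len(stack)
		switch in.op {
		case opPush:
			stack = append(stack, in.arg)
		case opLoad:
			stack = append(stack, locals[in.arg])
		case opStore:
			locals[in.arg] = stack[n-1]
			stack = stack[:n-1]
		case opAdd:
			stack[n-2] += stack[n-1]
			stack = stack[:n-1]
		case opLess:
			if stack[n-2] < stack[n-1] {
				stack[n-2] = 1
			} else {
				stack[n-2] = 0
			}
			stack = stack[:n-1]
		case opJumpFalse:
			v := stack[n-1]
			stack = stack[:n-1]
			if v == 0 {
				pc = int(in.arg) - 1 // the for loop adds 1 back
			}
		case opJump:
			pc = int(in.arg) - 1
		case opHalt:
			return stack[n-1]
		}
	}
	return 0
}

// sumProgram is bytecode for: i = 0; acc = 0; while i < n { acc += i; i++ }; acc
// Local slot 0 holds i and slot 1 holds acc; both start at zero.
func sumProgram(n int64) []instr {
	return []instr{
		{opLoad, 0},       // 0: push i
		{opPush, n},       // 1: push n
		{opLess, 0},       // 2: i < n ?
		{opJumpFalse, 13}, // 3: leave the loop when false
		{opLoad, 1},       // 4: push acc
		{opLoad, 0},       // 5: push i
		{opAdd, 0},        // 6: acc + i
		{opStore, 1},      // 7: acc = acc + i
		{opLoad, 0},       // 8: push i
		{opPush, 1},       // 9: push 1
		{opAdd, 0},        // 10: i + 1
		{opStore, 0},      // 11: i = i + 1
		{opJump, 0},       // 12: back to the loop head
		{opLoad, 1},       // 13: push acc
		{opHalt, 0},       // 14: return acc
	}
}

func main() {
	fmt.Println(run(sumProgram(10), 2)) // 45
}
```

The point of the flat layout is that the dispatch loop touches only a slice index and a `switch`, rather than walking AST nodes through interface calls.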

WASM/wazero native compilation

Pure numeric loops compile further to WASM bytecode and execute through wazero's compiler, which translates the module to native machine code. This reaches JIT-class performance (matching Bun/JSC) with zero cgo dependencies.

Generic tail-call optimization

Self-recursive calls in tail position are automatically rewritten to recur at parse time, so they no longer grow the stack. A runtime trampoline handles cases the rewriter can't catch.
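The two mechanisms can be sketched in plain Go. `sumToLoop` shows what a tail-call-to-recur rewrite amounts to (rebind the arguments and jump), and `trampoline` shows how a driver loop can run tail calls without stack growth. All names here are hypothetical, and mutual recursion is used only as an example of a tail call that a self-recursion rewrite would not cover; which cases the fork's trampoline actually handles is not stated above.

```go
package main

import "fmt"

// sumToRecursive calls itself in tail position; without a rewrite,
// every call adds a stack frame.
func sumToRecursive(n, acc int64) int64 {
	if n == 0 {
		return acc
	}
	return sumToRecursive(n-1, acc+n)
}

// sumToLoop is the same function after the tail call is turned into
// "rebind the parameters and jump back to the top".
func sumToLoop(n, acc int64) int64 {
	for n != 0 {
		n, acc = n-1, acc+n // right-hand side is evaluated before assignment
	}
	return acc
}

// thunk is one deferred step: it yields either a final value (next == nil)
// or the next step to run.
type thunk func() (value bool, next thunk)

// trampoline runs steps in a loop, so the call depth stays constant.
func trampoline(t thunk) bool {
	for {
		v, next := t()
		if next == nil {
			return v
		}
		t = next
	}
}

func isEven(n int) thunk {
	return func() (bool, thunk) {
		if n == 0 {
			return true, nil
		}
		return false, isOdd(n - 1)
	}
}

func isOdd(n int) thunk {
	return func() (bool, thunk) {
		if n == 0 {
			return false, nil
		}
		return false, isEven(n - 1)
	}
}

func main() {
	fmt.Println(sumToRecursive(100, 0), sumToLoop(100, 0)) // 5050 5050
	fmt.Println(trampoline(isEven(1000000)))               // true
}
```

The rewrite costs nothing at runtime, which is presumably why it is applied first; the trampoline pays for one closure per step but works for calls that cannot be turned into a local jump.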

Transient vectors and maps

Loops that update non-escaping vectors or maps via assoc automatically use in-place mutation (Clojure-style transients), eliminating persistent copy/update overhead while preserving persistent results at loop return.
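A toy sketch of the difference, assuming a flat slice in place of the real persistent data structures: the persistent `Assoc` copies on every update, while the transient variant copies once, mutates in place inside the loop, and is frozen back into a persistent value at the end. The `Vec` and `TransientVec` types and their methods are hypothetical and only illustrate the idea.

```go
package main

import "fmt"

// Vec is a toy persistent vector: Assoc never mutates the receiver.
type Vec struct{ items []int }

// Assoc returns a new Vec with index i set to x, copying the whole backing slice.
func (v Vec) Assoc(i, x int) Vec {
	cp := make([]int, len(v.items))
	copy(cp, v.items)
	cp[i] = x
	return Vec{cp}
}

// TransientVec owns a private buffer, so it can be updated in place.
type TransientVec struct{ items []int }

// Transient makes one private copy up front.
func (v Vec) Transient() *TransientVec {
	cp := make([]int, len(v.items))
	copy(cp, v.items)
	return &TransientVec{cp}
}

// AssocBang mutates the private buffer directly.
func (t *TransientVec) AssocBang(i, x int) { t.items[i] = x }

// Persistent hands the buffer over as an immutable Vec and invalidates
// the transient so it cannot be mutated afterwards.
func (t *TransientVec) Persistent() Vec {
	v := Vec{t.items}
	t.items = nil
	return v
}

// squaresPersistent copies the vector once per iteration.
func squaresPersistent(v Vec) Vec {
	for i := range v.items {
		v = v.Assoc(i, i*i)
	}
	return v
}

// squaresTransient copies once, then updates in place.
func squaresTransient(v Vec) Vec {
	t := v.Transient()
	for i := range t.items {
		t.AssocBang(i, i*i)
	}
	return t.Persistent()
}

func main() {
	in := Vec{[]int{7, 7, 7, 7}}
	fmt.Println(squaresPersistent(in).items) // [0 1 4 9]
	fmt.Println(squaresTransient(in).items)  // [0 1 4 9]
	fmt.Println(in.items)                    // [7 7 7 7]
}
```

This is only safe when the intermediate value does not escape the loop, which matches the "non-escaping" condition stated above: no other code can observe the in-place updates before the result is frozen.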

Evaluator fast paths

Numeric operations, binding resolution, and function dispatch all have type-specialized fast paths that avoid the generic Joker evaluation machinery.
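The general shape of a type-specialized fast path can be sketched as follows: check for the most common concrete types first and fall back to a generic routine otherwise. The `Object`, `Int`, and `Double` types and both functions are hypothetical stand-ins, not the fork's actual value types.

```go
package main

import "fmt"

// Object stands in for an interpreter's boxed value interface (hypothetical).
type Object interface{}

type Int struct{ V int64 }
type Double struct{ V float64 }

// toFloat converts any supported numeric value to float64.
func toFloat(o Object) float64 {
	switch x := o.(type) {
	case Int:
		return float64(x.V)
	case Double:
		return x.V
	}
	panic("not a number")
}

// addGeneric is the slow path: it handles every numeric combination
// by promoting both operands to float64.
func addGeneric(a, b Object) Object {
	return Double{toFloat(a) + toFloat(b)}
}

// add tries the Int+Int case with two cheap type assertions and only
// falls back to the generic machinery when that fails.
func add(a, b Object) Object {
	if x, ok := a.(Int); ok {
		if y, ok := b.(Int); ok {
			return Int{x.V + y.V}
		}
	}
	return addGeneric(a, b)
}

func main() {
	fmt.Println(add(Int{2}, Int{3}))      // {5}
	fmt.Println(add(Int{1}, Double{0.5})) // {1.5}
}
```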

Architecture

Joker Source → Reader + Parser → AST
                                  ↓
                           tco_rewrite (parse-time tail-call → recur)
                                  ↓
                              Eval() type switch
                                  ↓
                    ┌─────────────┼─────────────┐
                    ↓             ↓             ↓
              WASM/wazero    IR bytecode    Tree-walker
              (native)       (irExec)      (evalLoop)
              0.32ms ⚡       28ms           190ms
                    ↑             ↑
                    └──fallback───┘

WASM path: pure integer/float loops → wazero compiler → native code
IR path: loops with collections, fn calls, let bindings → bytecode interpreter
Tree-walker: everything else (macros, special forms, I/O)
gi bridge: hooks, tools, state access — callable from IR via irCallSlot
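The tier selection with fallback shown in the diagram can be sketched as a list of backends tried in order, where each one either accepts a form or reports that it cannot handle it. This is a simplified model under stated assumptions: the `tier` type, `errUnsupported`, and the string "kinds" are hypothetical placeholders for whatever checks the real compiler stages perform.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupported signals that a tier cannot handle a form and the next
// tier should be tried (hypothetical; for illustration only).
var errUnsupported = errors.New("form not supported by this tier")

type tier struct {
	name    string
	compile func(kind string) error
}

// only builds a compile function that accepts just the listed kinds.
func only(kinds ...string) func(string) error {
	return func(kind string) error {
		for _, k := range kinds {
			if k == kind {
				return nil
			}
		}
		return errUnsupported
	}
}

// defaultTiers lists the fastest backend first and ends with one that
// accepts everything.
var defaultTiers = []tier{
	{"wasm", only("numeric-loop")},
	{"ir", only("numeric-loop", "collection-loop")},
	{"tree-walker", func(string) error { return nil }},
}

// pick returns the name of the first tier that accepts the form.
// Errors other than errUnsupported stop the search.
func pick(tiers []tier, kind string) (string, error) {
	for _, t := range tiers {
		err := t.compile(kind)
		if err == nil {
			return t.name, nil
		}
		if !errors.Is(err, errUnsupported) {
			return "", err
		}
	}
	return "", errUnsupported
}

func main() {
	for _, kind := range []string{"numeric-loop", "collection-loop", "macro"} {
		name, _ := pick(defaultTiers, kind)
		fmt.Println(kind, "->", name)
	}
}
```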

Building & testing

go test ./core              # run all tests
go test ./core -bench .     # run all benchmarks

Benchmarks

Note: The CLBG (Computer Language Benchmarks Game) programs were chosen as a starting point for optimizing the IR and WASM compilation pipeline, not because they represent realistic workloads. They stress specific interpreter bottlenecks (arithmetic loops, recursion, allocation, string processing) that guided the optimization work. Real-world gi scripts will have different profiles: the gains here show that the execution machinery works, not that every Joker program runs 500× faster.

# Full CLBG suite + micro benchmarks
go test ./core -run '^$' -bench 'BenchmarkCLBG|BenchmarkEval|BenchmarkWasm' -benchmem -benchtime=5x

# Cross-language comparison
python3 benchmarks/cross_lang_bench.py
bun benchmarks/cross_lang_bench.js

# Regenerate charts
go run ./benchmarks/generate_svg.go ./benchmarks

Documentation

docs/OPTIMIZATION_REPORT.md — full technical report (phases, trade-offs, outcomes, suggested git history)
benchmarks/README.md — benchmark data and chart regeneration
PERFORMANCE_PLAN.md — optimization roadmap and milestones

Upstream

Based on candid82/joker v1.7.1.
Original README preserved as ORIGINAL_README.md.

License

Same as upstream Joker (EPL-1.0).