CPNs, LLMs, and Distributed Applications


A big theme in LLM-enabled software development is that verifiable correctness makes it much easier to take bigger leaps with LLMs: tests, compilers, state machines, and so on. While researching for databuild, I recently came across colored Petri nets, and instantly saw an opportunity.

Colored Petri nets (CPNs) are an extension of Petri nets. Petri nets are essentially directed bipartite graphs where places can contain tokens, and places are connected by transitions (where the side effects happen). In a plain Petri net, a token carries no data; it represents only an identity-less token's location in the net. Importantly, a Petri net whose transitions each have exactly one input and one output, and which holds a single token globally, is equivalent to a finite state machine. Colored Petri nets extend this model by allowing individual tokens to carry data. This lets CPNs map closely onto the Rust typestate pattern, and suggests Rust may be able to implement CPN semantics easily.
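A minimal sketch of that correspondence, with invented place names: each place becomes a type, a token's color is the data the value carries, and a transition is a function that consumes the token by value and returns it in its new place.

```rust
// Hypothetical places in a tiny net: Submitted -> Validated -> Archived.
// Each place is a distinct type; the token's "color" is the data it carries.
struct Submitted { id: u32 }
struct Validated { id: u32 }
struct Archived { id: u32 }

// Because transitions take ownership, the compiler enforces that a token
// occupies exactly one place: reusing a moved token is a compile error.
fn validate(token: Submitted) -> Validated {
    Validated { id: token.id }
}

fn archive(token: Validated) -> Archived {
    Archived { id: token.id }
}

fn main() {
    let token = Submitted { id: 7 };
    let token = validate(token);
    // validate(token); // would not compile: token was already moved
    let done = archive(token);
    assert_eq!(done.id, 7);
}
```

The move semantics are doing the real work here: "double-spending" a token is ruled out at compile time rather than caught at runtime.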

CPNs are of particular interest because a) it is still hard to write concurrent applications and b) they offer the potential for formally verifying concurrent programs at build time. Also, if we can implement a high-performance datastore underneath, a CPN framework might be able to handle the hard parts of concurrent applications: state synchronization, conflict detection, deadlock avoidance, and coordinating access to shared resources. These are enabled by two other features of CPNs: guards and multi-token consumption/production.

A guard is a set of boolean conditions attached to a transition: things that must be true for a token to take that transition. For instance, there must be more than 0 connections available in the connection pool to claim one. Multi-token consumption/production is just what it sounds like: a fork in the network via a transition, where P1 -> T1 -> (P2, P3), so T1 consuming a token from P1 produces a token in each of P2 and P3 simultaneously. Conversely, a join in the network via a transition, where (P1, P2) -> T1 -> P3, requires both P1 and P2 to have tokens present that can transition via T1 into P3, which also happens simultaneously.
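Both mechanics can be sketched in a few lines of plain Rust; this is an illustrative toy, not a real CPN engine, and all names are invented.

```rust
use std::collections::VecDeque;

// A place holds colored tokens; here the "color" is just the token's type.
struct Place<T> {
    tokens: VecDeque<T>,
}

impl<T> Place<T> {
    fn new() -> Self {
        Place { tokens: VecDeque::new() }
    }
}

// A guarded transition: fires only if the guard (pool non-empty) holds.
// Returns true if the transition fired and a token moved.
fn claim_connection(pool: &mut Place<u32>, in_use: &mut Place<u32>) -> bool {
    // Guard: more than 0 connections available in the pool.
    match pool.tokens.pop_front() {
        Some(conn) => {
            in_use.tokens.push_back(conn);
            true
        }
        None => false,
    }
}

// A join transition, (P1, P2) -> T1 -> P3: consumes one token from each
// input place and produces one combined token, all-or-nothing.
fn join<A, B>(p1: &mut Place<A>, p2: &mut Place<B>, p3: &mut Place<(A, B)>) -> bool {
    if p1.tokens.is_empty() || p2.tokens.is_empty() {
        return false; // not enabled: a join needs a token in every input place
    }
    let a = p1.tokens.pop_front().unwrap();
    let b = p2.tokens.pop_front().unwrap();
    p3.tokens.push_back((a, b));
    true
}

fn main() {
    let mut pool = Place::new();
    let mut in_use = Place::new();
    pool.tokens.push_back(42u32);

    assert!(claim_connection(&mut pool, &mut in_use)); // guard holds: fires
    assert!(!claim_connection(&mut pool, &mut in_use)); // pool empty: blocked

    let mut p1 = Place::new();
    let mut p2 = Place::new();
    let mut p3 = Place::new();
    p1.tokens.push_back("proxy");
    p2.tokens.push_back("target");
    assert!(join(&mut p1, &mut p2, &mut p3));
    assert_eq!(p3.tokens.len(), 1);
}
```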

One application with light concurrency is web scraping with leased proxies and scrape targets. You have a limited number of proxies to route your requests through, and you need to rate-limit your usage of all of them to ensure they don't over-request a given target. You also want to avoid requesting the same target multiple times concurrently, and to avoid hitting the same domain too frequently, to be a responsible user. Traditionally, this is solved by leasing resources through a central database, implemented with SELECT FOR UPDATE-style semantics. With CPN semantics, we can imagine the scrape_target transition as the join of the available_proxies and prioritized_targets places, so a scrape starts only when both a proxy and a prioritized target are available. You can imagine implementing other complex distributed scraper features with CPN semantics as well.
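One way that scrape_target join might look, with the rate limit expressed as a colored guard (a guard that inspects the token's data). Everything here is illustrative; none of these names come from a real scraper.

```rust
use std::collections::VecDeque;

// Token colors for the scraper net (all names are illustrative).
#[derive(Debug)]
struct Proxy {
    id: u32,
}

#[derive(Debug)]
struct Target {
    url: String,
    domain: String,
}

#[derive(Debug)]
struct Scrape {
    proxy: Proxy,
    target: Target,
}

// scrape_target is a join of available_proxies and prioritized_targets,
// with a colored guard: the next target's domain must not be on cooldown.
fn scrape_target(
    available_proxies: &mut VecDeque<Proxy>,
    prioritized_targets: &mut VecDeque<Target>,
    domains_on_cooldown: &[String],
) -> Option<Scrape> {
    let guard_holds = prioritized_targets
        .front()
        .map(|t| !domains_on_cooldown.contains(&t.domain))
        .unwrap_or(false);
    if available_proxies.is_empty() || !guard_holds {
        return None; // transition not enabled; no tokens move
    }
    Some(Scrape {
        proxy: available_proxies.pop_front().unwrap(),
        target: prioritized_targets.pop_front().unwrap(),
    })
}

fn main() {
    let mut proxies = VecDeque::from([Proxy { id: 1 }]);
    let mut targets = VecDeque::from([Target {
        url: "https://example.com/page".to_string(),
        domain: "example.com".to_string(),
    }]);

    // Guard fails while example.com is on cooldown: nothing fires.
    let cooldown = vec!["example.com".to_string()];
    assert!(scrape_target(&mut proxies, &mut targets, &cooldown).is_none());

    // Guard holds once the cooldown clears: the join consumes both tokens.
    let scrape = scrape_target(&mut proxies, &mut targets, &[]).unwrap();
    assert_eq!(scrape.proxy.id, 1);
    assert!(proxies.is_empty() && targets.is_empty());
}
```

Note the all-or-nothing shape: if the guard fails or either place is empty, no tokens are consumed, which is exactly the property the SELECT FOR UPDATE dance buys you in a database.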

Another application is databuild, where the network is made up of partitions (which are effectively leased to job runs), wants, and job runs, and we would benefit from some self-organizing dynamic process to propagate the data deps and solve user-stated partition wants in a safe, efficient, and fast manner. But more on that guy later.

I'm still figuring this part out, but it seems like there are a few sensible or interesting strategies for implementing CPNs:

  1. As a traditional app + Postgres combo, with transactions implementing simultaneous token movement and state storage, and SELECT FOR UPDATE enabling the claiming of tokens for transitions to prevent double moves.
  2. As a single-process Rust binary, with all CPN state stored in memory, using move semantics and the typestate pattern to implement the CPN semantics (this could be extremely fast, but I need to figure out persistence - maybe a snapshotted event log?)
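For the second strategy, the snapshotted event log could work roughly like this sketch (a toy under assumed names, not a committed design): every token move is appended to a log, and replaying the log on top of a snapshot rebuilds the net state after a crash.

```rust
use std::collections::HashMap;

// Every token move is recorded as an event; a transition firing appends
// its consumes and produces atomically.
#[derive(Clone, Debug)]
enum Event {
    Produced { place: &'static str, token: u32 },
    Consumed { place: &'static str, token: u32 },
}

#[derive(Clone, Debug, Default, PartialEq)]
struct NetState {
    places: HashMap<&'static str, Vec<u32>>, // place name -> tokens there
}

impl NetState {
    fn apply(&mut self, event: &Event) {
        match event {
            Event::Produced { place, token } => {
                self.places.entry(place).or_default().push(*token);
            }
            Event::Consumed { place, token } => {
                let tokens = self.places.entry(place).or_default();
                if let Some(i) = tokens.iter().position(|t| t == token) {
                    tokens.remove(i);
                }
            }
        }
    }
}

// Recovery: start from the last snapshot, replay the log tail.
fn replay(snapshot: NetState, log: &[Event]) -> NetState {
    let mut state = snapshot;
    for event in log {
        state.apply(event);
    }
    state
}

fn main() {
    // A fork P1 -> T1 -> (P2, P3) firing is one consume plus two produces.
    let log = vec![
        Event::Produced { place: "P1", token: 7 },
        Event::Consumed { place: "P1", token: 7 },
        Event::Produced { place: "P2", token: 7 },
        Event::Produced { place: "P3", token: 7 },
    ];

    // Live state and replayed-from-empty-snapshot state must agree.
    let mut live = NetState::default();
    for event in &log {
        live.apply(event);
    }
    assert_eq!(live, replay(NetState::default(), &log));
    assert!(live.places["P1"].is_empty());
}
```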

One thing I'm trying to figure out is how to solve the partitioning problem: if app state grows too large for one server's memory, is there a way to automatically partition the network/tokens to enable fearless concurrency and horizontal scalability? So far the answers seem to be either a) solving it in the application, with an archive place/transition being part of the network, or b) solving it at the database level. Or maybe something even crazier, where the whole network is made up of multiple CPN apps, which themselves expose query/consume interfaces?

Tying it all back, if we can figure out a CPN app framework with sensible persistence, we could impose more compile-time constraints on agent-authored code (good for dev velocity) while also providing correctness guarantees and simulation capabilities out of the box that unconstrained computation apps don't provide. Exciting!


Appendix

Random Thoughts

Potential Bets (LLM Written)

To validate this, we need a real hill to climb: reimplement something with CPN semantics and compare. The core question isn't "can CPNs be fast/correct"—of course they can. It's: does the paradigm make writing simple, correct, and fast concurrent programs easier? Does the formalism pay for itself in reduced bugs and less bespoke coordination code?

The bet: scraper scheduler (spider-rs)

Implement the decision-making core of a scraper—URL prioritization, domain rate limiting, proxy assignment, retry scheduling—using CPNs. Mock the HTTP layer; benchmark decisions/sec and compare coordination code complexity against the original.

Why scrapers? Traditional implementations are full of ad-hoc coordination: rate limit maps, cooldown trackers, retry counters, domain queues. This is exactly what CPNs formalize. Correctness matters (politeness violations get you banned), and timed transitions are central, not incidental. spider-rs is already Rust, so we can fork and compare directly.

Why not the others first?

| Option | Why not |
| --- | --- |
| Job queue (Faktory) | Coordination is incidental; well-understood pattern where CPN overhead may not pay off |
| Connection pooler (pgcat) | Too small: might not exercise enough CPN capabilities to be compelling |
| CI runner (Buildkite) | Too large: incidental complexity (artifacts, logs, shell execution) obscures the coordination story |

Implementation approach

In-memory Rust with SQLite snapshots: single-threaded, move semantics apply, extremely fast. Partitioning = separate binaries with disjoint token ownership (enforceable at net-definition time via partition key requirements). Simulation = same CPN execution with mocked effects, enabling correctness testing, failure injection, and time fast-forwarding.
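The time fast-forwarding piece might be as simple as putting a clock trait behind the timed guards, so the same net definition runs against wall time in production and a jumpable clock in simulation. A sketch under those assumptions, with invented names:

```rust
use std::cell::Cell;

// Timed guards read a Clock trait instead of the system clock directly,
// so simulation can substitute a fast-forwardable implementation.
trait Clock {
    fn now_ms(&self) -> u64;
}

struct SimClock {
    now: Cell<u64>,
}

impl SimClock {
    fn new() -> Self {
        SimClock { now: Cell::new(0) }
    }

    // Jump simulated time forward instantly instead of sleeping.
    fn fast_forward(&self, ms: u64) {
        self.now.set(self.now.get() + ms);
    }
}

impl Clock for SimClock {
    fn now_ms(&self) -> u64 {
        self.now.get()
    }
}

// A timed guard: the transition is enabled only once the cooldown elapses.
fn cooldown_elapsed(clock: &dyn Clock, ready_at_ms: u64) -> bool {
    clock.now_ms() >= ready_at_ms
}

fn main() {
    let clock = SimClock::new();
    assert!(!cooldown_elapsed(&clock, 500)); // still cooling down at t=0
    clock.fast_forward(1_000); // simulation skips the wait entirely
    assert!(cooldown_elapsed(&clock, 500));
}
```

A production implementation would add a second `Clock` backed by `std::time::Instant`; the net logic wouldn't change.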