Self-Evolving Cellular Automata — Kai Saksela


We start with a grid of cells that update in discrete steps.

All cells are computed in parallel, but we'll look at the progression from the perspective of a single focal cell (index 1). Each cell interacts with its 8 immediate neighbors, forming a group of 9, indexed as in the image to the left.

At each step $t$ the focal cell has a non-negative energy scalar $E_1^{(t)}\in\mathbb{R}_+$, a transfer matrix $\mathbf{W}_1^{(t)}\in\mathbb{R}^{9\times 9}$, and a bias vector $\mathbf{b}_1^{(t)}\in\mathbb{R}^{9}$. Both $\mathbf{W}$ and $\mathbf{b}$ flow with the energy and get updated by incoming energy.

To compute how energy spreads, stack the 9 neighborhood energies into $\mathbf{e}^{(t)}$. We use an affine transformation followed by a nonlinearity to produce a non-negative spread:

$$\begin{aligned} \mathbf{z}_1^{(t)} &= \mathbf{W}_1^{(t)}\,\mathbf{e}^{(t)} + \mathbf{b}_1^{(t)}, \\ \mathbf{s}_1^{(t)} &= f\!\left(\mathbf{z}_1^{(t)}\right) + \epsilon. \end{aligned}$$

We use the rectified linear function, $f(x)=\max(0,x)$, so the spread is non-negative without constraining $\mathbf{W}$. The small $\epsilon$ prevents the all-zero case.

$$\mathbf{p}^{(t)} = \frac{\mathbf{s}_1^{(t)}}{\sum_{j=1}^{9} s_j^{(t)}},\qquad \mathbf{e}_1^{(t+1)} = E_1^{(t)}\,\mathbf{p}^{(t)}.$$

This is done in parallel for all cells, accounting for walls and blocked cells, and the incoming contributions are summed to get the energy grid at the next step.
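As a concrete sketch, the spread computation for a single focal cell can be written in NumPy. The function name and the `eps` default are illustrative choices, not from the article:

```python
import numpy as np

def spread_energy(E, W, b, e, eps=1e-8):
    """Distribute a cell's energy E over its 3x3 neighborhood.

    E : scalar energy of the focal cell
    W : (9, 9) transfer matrix of the focal cell
    b : (9,) bias vector
    e : (9,) energies of the 9 neighborhood cells (focal cell included)
    """
    z = W @ e + b                 # affine transformation
    s = np.maximum(0.0, z) + eps  # ReLU keeps the spread non-negative;
                                  # eps avoids the all-zero case
    p = s / s.sum()               # normalize to a distribution over the 9 cells
    return E * p                  # energy sent to each neighborhood cell

# Example: a cell with unit energy and a randomly sampled mechanism.
rng = np.random.default_rng(0)
out = spread_energy(1.0, rng.normal(size=(9, 9)), rng.normal(size=9),
                    rng.uniform(size=9))
# out.sum() equals the focal cell's energy: the spread only redistributes it.
```

Because `p` is a normalized distribution, each cell's outgoing energy sums to exactly its current energy, so total energy is conserved across the grid.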

The mechanism moves with the energy: the transfer-matrix update is proportional to incoming energy flow. Let $\mathbf{e}_{j\to 1}^{(t)}$ denote the energy that arrives at the focal cell from neighbor $j$. We form a weighted average of the neighboring matrices:

$$\tilde{\mathbf{W}}_1^{(t+1)} = \frac{\sum_{j=1}^{9} \left(\mathbf{e}_{j\to 1}^{(t)}\right)^2\,\mathbf{W}_j^{(t)}}{\sum_{j=1}^{9} \left(\mathbf{e}_{j\to 1}^{(t)}\right)^2 + \epsilon}$$

We square the energy when weighting, which emphasizes directions with higher energy flow. The bias vector $\mathbf{b}$ is updated in the same way. The same $\epsilon$ keeps the denominator non-zero.
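The weighted average above can be sketched in a few lines of NumPy; the function name and the use of `tensordot` are my own choices:

```python
import numpy as np

def mechanism_update(W_neighbors, e_in, eps=1e-8):
    """Energy-weighted average of the 9 neighboring transfer matrices.

    W_neighbors : (9, 9, 9) stack of the neighbors' transfer matrices
    e_in        : (9,) energy arriving from each neighbor
    """
    w = e_in ** 2                               # squared weights emphasize
                                                # high-energy directions
    num = np.tensordot(w, W_neighbors, axes=1)  # sum_j w_j * W_j -> (9, 9)
    return num / (w.sum() + eps)                # eps keeps the denominator non-zero

rng = np.random.default_rng(1)
Ws = rng.normal(size=(9, 9, 9))
e_in = np.zeros(9)
e_in[3] = 2.0                       # all incoming energy comes from neighbor 3
W_new = mechanism_update(Ws, e_in)  # ~ a copy of neighbor 3's matrix
```

In the limiting case shown, all the weight sits on one neighbor, so the cell effectively inherits that neighbor's mechanism, which is how mechanisms propagate along energy flows.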

Finally, we inject randomness when energy is low. The idea is that empty or low-energy cells should mutate their mechanisms, while high-energy cells preserve them:

$$\mathbf{W}_1^{(t+1)} = \lambda_1^{(t+1)}\,\tilde{\mathbf{W}}_1^{(t+1)} + \beta\,\boldsymbol{\xi}\left(1-\lambda_1^{(t+1)}\right),\qquad \boldsymbol{\xi}\sim\mathcal{N}(0,1)$$

$$\lambda_1^{(t+1)} = \frac{E_1^{(t+1)}}{\beta + E_1^{(t+1)}}$$

Here $\beta$ sets the overall noise scale. When $E_1$ is small, $\lambda$ is near zero and randomness dominates; as energy grows, the incoming mechanisms take over. The same blend is applied to $\mathbf{b}$.
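The two limiting behaviors of this blend are easy to check numerically. A minimal sketch (function name and parameter values are mine):

```python
import numpy as np

def inject_noise(W_tilde, E_next, beta, rng):
    """Blend the averaged mechanism with Gaussian noise based on energy."""
    lam = E_next / (beta + E_next)          # ~0 at low energy, -> 1 at high energy
    xi = rng.standard_normal(W_tilde.shape) # fresh Gaussian noise
    return lam * W_tilde + beta * xi * (1.0 - lam)

rng = np.random.default_rng(2)
W_tilde = np.ones((9, 9))

low  = inject_noise(W_tilde, E_next=0.0,   beta=0.1, rng=rng)  # lam = 0: pure noise
high = inject_noise(W_tilde, E_next=100.0, beta=0.1, rng=rng)  # lam ~ 1: ~unchanged
```

At zero energy the incoming mechanism is ignored entirely and the cell's matrix is resampled at scale $\beta$; at high energy the noise term vanishes and the averaged mechanism passes through almost untouched.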

To initialize the system, we sample $\mathbf{W}$ and $\mathbf{b}$ at random and place a chunk of energy in the middle. In the live simulation above, randomness is injected in a controllable region, and mechanisms spread and compete as energy flows.
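Putting the pieces together, one full update step of a small grid might look like the sketch below. This is my own assembly of the article's steps, with illustrative names throughout; for simplicity it wraps the grid on a torus rather than modeling the walls and blocked cells the article accounts for:

```python
import numpy as np

rng = np.random.default_rng(0)
N, beta, eps = 16, 0.1, 1e-8

# Initialization: random mechanisms, a chunk of energy in the middle.
E = np.zeros((N, N))
E[N // 2, N // 2] = 10.0
W = rng.normal(size=(N, N, 9, 9))
b = rng.normal(size=(N, N, 9))

# Row-major 3x3 neighborhood offsets; slot k's reverse direction is slot 8 - k.
offsets = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)]

def step(E, W, b):
    E_next = np.zeros_like(E)
    flow = np.zeros((N, N, 9))  # energy each cell sends to each neighborhood slot
    for i in range(N):
        for j in range(N):
            e = np.array([E[(i + di) % N, (j + dj) % N] for di, dj in offsets])
            s = np.maximum(0.0, W[i, j] @ e + b[i, j]) + eps
            flow[i, j] = E[i, j] * s / s.sum()
            for k, (di, dj) in enumerate(offsets):
                E_next[(i + di) % N, (j + dj) % N] += flow[i, j, k]
    W_new, b_new = np.empty_like(W), np.empty_like(b)
    for i in range(N):
        for j in range(N):
            # Energy received from each neighbor: its outflow slot pointing back here.
            w2 = np.array([flow[(i + di) % N, (j + dj) % N, 8 - k] ** 2
                           for k, (di, dj) in enumerate(offsets)])
            denom = w2.sum() + eps
            Wt = sum(w2[k] * W[(i + di) % N, (j + dj) % N]
                     for k, (di, dj) in enumerate(offsets)) / denom
            bt = sum(w2[k] * b[(i + di) % N, (j + dj) % N]
                     for k, (di, dj) in enumerate(offsets)) / denom
            # Low-energy cells mutate; high-energy cells inherit.
            lam = E_next[i, j] / (beta + E_next[i, j])
            W_new[i, j] = lam * Wt + beta * (1 - lam) * rng.standard_normal((9, 9))
            b_new[i, j] = lam * bt + beta * (1 - lam) * rng.standard_normal(9)
    return E_next, W_new, b_new

E, W, b = step(E, W, b)  # total energy stays at 10.0; mechanisms follow the flow
```

The double loop is written for clarity rather than speed; a practical implementation would vectorize the neighborhood gathers across the whole grid.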