FLAT_ASTS_NOTE.md


A Note On Flat Abstract Syntax Trees


Epistemic Status: Quick Note

Date: 2026-02-09


Autark, more specifically MPERA, incorporates an AST implementation to specify, analyze, and later compile programs. Contrary to predominant advice, I decided to use flat ASTs instead of traditional Box<T>-based trees. This approach has since turned out to offer several advantages for the goal of building a post-modern analytical processing engine.

Box-based ASTs allocate one node at a time and scatter those nodes across the heap. Flat ASTs pre-allocate memory, preferably once, and store all nodes linearly in memory. This is the core thesis for using flat ASTs.
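To make the contrast concrete, here is a minimal sketch of the two representations. The `BoxedExpr` and `FlatExpr` types and their variants are invented for illustration; the key difference is that the flat variant stores child *indices* into one contiguous `Vec` instead of heap pointers.

```rust
// Boxed tree: every child is a separate heap allocation at an arbitrary address.
enum BoxedExpr {
    Num(i64),
    Add(Box<BoxedExpr>, Box<BoxedExpr>),
}

// Flat tree: children are indices into one contiguous, pre-allocated pool.
#[derive(Clone, Copy)]
enum FlatExpr {
    Num(i64),
    Add(u32, u32), // indices of the operand nodes in the pool
}

fn eval_boxed(e: &BoxedExpr) -> i64 {
    match e {
        BoxedExpr::Num(n) => *n,
        BoxedExpr::Add(a, b) => eval_boxed(a) + eval_boxed(b),
    }
}

fn eval_flat(pool: &[FlatExpr], root: u32) -> i64 {
    match pool[root as usize] {
        FlatExpr::Num(n) => n,
        FlatExpr::Add(a, b) => eval_flat(pool, a) + eval_flat(pool, b),
    }
}

fn main() {
    // (1 + 2) + 3, boxed: four separate allocations for the inner nodes.
    let boxed = BoxedExpr::Add(
        Box::new(BoxedExpr::Add(
            Box::new(BoxedExpr::Num(1)),
            Box::new(BoxedExpr::Num(2)),
        )),
        Box::new(BoxedExpr::Num(3)),
    );

    // The same tree, flat: all five nodes sit side by side in one Vec.
    let pool = vec![
        FlatExpr::Num(1),    // index 0
        FlatExpr::Num(2),    // index 1
        FlatExpr::Add(0, 1), // index 2
        FlatExpr::Num(3),    // index 3
        FlatExpr::Add(2, 3), // index 4, the root
    ];

    assert_eq!(eval_boxed(&boxed), eval_flat(&pool, 4)); // both evaluate to 6
    println!("{}", eval_flat(&pool, 4));
}
```

Note that the flat pool needs exactly one allocation (the `Vec`'s buffer), whereas the boxed tree performs one allocation per non-root node.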

NOTE: Everything specified below is correct as of 9th February 2026. The internal implementation and/or user-facing APIs might have changed since then.

In MPERA, the AST is simply a newtype around a pre-allocated Vec of Ops. Better implementations ostensibly exist, but this one has served the ever-changing codebase well.
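The shape of such a newtype can be sketched as follows. This is not the actual implementation from the linked `oppool.rs`; the `Op` variants and the method names here are assumptions chosen for illustration. The essential points are the single up-front allocation and the fact that pushing a node returns its index, which later nodes use as a child reference.

```rust
// Hypothetical op type; the real MPERA Op enum will differ.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Op {
    Const(i64),
    Add(u32, u32), // child indices
}

// The AST: a newtype over a pre-allocated Vec of Ops.
struct OpPool(Vec<Op>);

impl OpPool {
    // One up-front allocation; pushes below then stay within capacity.
    fn with_capacity(n: usize) -> Self {
        OpPool(Vec::with_capacity(n))
    }

    // Append a node and hand back its index for use as a child reference.
    fn push(&mut self, op: Op) -> u32 {
        let id = self.0.len() as u32;
        self.0.push(op);
        id
    }
}

fn main() {
    let mut pool = OpPool::with_capacity(16);
    let a = pool.push(Op::Const(1)); // index 0
    let b = pool.push(Op::Const(2)); // index 1
    let root = pool.push(Op::Add(a, b)); // index 2
    assert_eq!(root, 2);
    println!("{:?}", pool.0[root as usize]); // prints Add(0, 1)
}
```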

https://github.com/ronfriedhaber/autark/blob/main/crates/mpera/src/op/oppool.rs

Programs are constructed with a quasi-builder pattern in which a node is pushed only after its children, so a node's children always precede it in the pool. This opens up a significant opportunity: an interpreter backend can simply iterate linearly over the OpPool, and a codegen backend can likewise dispatch commands linearly into an IR or language of its choosing.
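The interpreter case can be sketched in a few lines. Since children precede parents, a single forward pass with a side table of already-computed values evaluates the whole program without recursion; the last slot holds the root's result. The `Op` variants here are again hypothetical, not MPERA's actual ops.

```rust
// Hypothetical ops; children are referenced by index and always appear earlier.
#[derive(Clone, Copy)]
enum Op {
    Const(i64),
    Add(u32, u32),
    Mul(u32, u32),
}

// One linear pass over the pool: by the time we reach a node,
// its operands' values are already in `vals`.
fn eval(pool: &[Op]) -> i64 {
    let mut vals = Vec::with_capacity(pool.len());
    for op in pool {
        let v = match *op {
            Op::Const(n) => n,
            Op::Add(a, b) => vals[a as usize] + vals[b as usize],
            Op::Mul(a, b) => vals[a as usize] * vals[b as usize],
        };
        vals.push(v);
    }
    *vals.last().expect("empty pool")
}

fn main() {
    // (2 + 3) * 4, built children-first; the root lands at the end.
    let pool = [
        Op::Const(2),  // index 0
        Op::Const(3),  // index 1
        Op::Add(0, 1), // index 2
        Op::Const(4),  // index 3
        Op::Mul(2, 3), // index 4, the root
    ];
    println!("{}", eval(&pool)); // prints 20
}
```

A codegen backend would have the same loop shape, emitting one IR instruction per op instead of computing a value.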

If I recall correctly, Zig uses a similar approach. And a core LuaJIT maintainer attributes much of its performance to linear, pointer-free IR.

The above is a quickly written note, not an argument. Perhaps I will write a more in-depth analysis in the future.