· rust
Rust is experimenting with compile-time reflection in the form of type_info (Tracking issue #146922).
It's far from finished, but the other day more than a month ago1 the pull request for Support ADT types in type info reflection landed.
With that PR the types expose enough information about structs to be useful to play around with.
This is what the type info looks like for some types:
u32
Type {
kind: Int(
Int {
bits: 32,
signed: false,
},
),
size: Some(
4,
),
}
bool
Type {
kind: Bool(
Bool,
),
size: Some(
1,
),
}
struct Foo {
a: u32,
}
Type {
kind: Struct(
Struct {
generics: [],
fields: [
Field {
name: "a",
ty: TypeId(0x1378bb1c0a0202683eb65e7c11f2e4d7),
offset: 0,
},
],
non_exhaustive: false,
},
),
size: Some(
4,
),
}
This code snippet shows the structure of several more types.
The full type information is available at compile-time, so in const fn and const { } blocks.
The const functions need to be called in a const context though.
TypeId::info panics if called at runtime.
Whereas currently anything that wants to act on the shape of types usually works as a proc-macro, the new reflection allows const code to do it without external crates nor external types to cooperate, e.g. by also annotating types using the proc-macro.
Building a multi-array list
Inspired by Zig's MultiArrayList I set out to implement a similar type in Rust.
To understand what a multi-array list even is I'll cite the Zig documentation:
A
MultiArrayListstores a list of a struct or tagged union type. Instead of storing a single list of items,MultiArrayListstores separate lists for each field of the struct or lists of tags and bare unions. This allows for memory savings if the struct or union has padding, and also improves cache usage if only some fields or just tags are needed for a computation.
To translate that into Rust code, imagine we have a struct like this:
struct Point {
x: i32,
y: i32,
}
And we create a list of points:
let mut list = Vec::new();
list.push(Point { x: 47, y: 23 });
list.push(Point { x: 0, y: 5 });
In memory this will be laid out roughly like:
┌─ list ───────────────────────┐
│ 2 │ 4 │ 0x5c2c3ef8cae0 │
└───────────────│──────────────┘
│
└─┌─ 0x5c2c3ef8cae0 ──────────┐
│ 47 │ 23 │ 0 │ 5 │
└───────────────────────────┘
The vector holds the length (2), the capacity (42) and a pointer to the actual data allocated on the heap.
On the heap all values are laid out flat one after the other.
To Rust it's known that the Point struct consists of 2 integers, of which it knows the size (32 bit = 4 bytes), and thus a Point is 8 bytes.
Thus we can expect a new Point every 8 bytes3.
For a multi-array list we turn it into this construct instead:
┌─ list ────────────────────────┐
│ 0x6000028bc010 │ 2 │ 2 │
└──│────────────────────────────┘
│
└─┌─ 0x6000028bc010 ────────────────┐
│ 0x6000028bc020 │ 0x6000028bc030 │
└──│───────────────│──────────────┘
│ └───────┐
└─┌─ 0x6000028bc020 ─┐ └─┌─ 0x6000028bc030 ─┐
│ 47 │ 0 │ │ 23 │ 5 │
└──────────────────┘ └──────────────────┘
Our list has a pointer to a list of pointers (0x6000028bc010), a length (2) and a capacity (also 24).
The pointers in that list of pointers point to more lists.
Each of these lists contain the values of just one of the fields of Point.
Or in Rust code5:
struct MultiArrayList {
elems: *mut *mut u8,
cap: usize,
len: usize,
}
This ends up being usable in Rust thanks to reflection6:
let mut points = MultiArrayList::<Point>::new();
let point = Point { x: 47, y: 23 };
points.push(point);
let point: Box<Point> = points.pop().unrwap();
It allows to efficiently iterate over just one of the fields:
for x in points.items::<"x", i32>() {
dbg!(x);
}
Or iterate over the whole list and grab the pieces as we need them:
for point in points.iter() {
let x = point.get::<"x", i32>();
let y = point.get::<"y", i32>();
dbg!(x, y);
}
Now this example doesn't actually show any of the advantages a multi-array list might have. There's no padding we could save.
But I can prove that it does save reduce memory allocation that with the right toppings types:
; cargo run -q --example comparison
Allocating 500 pizzas.
Vec allocated: 16000 bytes
List allocated: 14016 bytes (-12.40%)
The code is available online: https://git.fnordig.de/jer/multi-array-list.
You can also grab it from crates.io:
cargo add multi-array-list
Note though that it's nightly-only, uses several experimental features7.
It is tested and Miri doesn't complain (yet), but I give absolutely no guarantees about it.
I think it might be easy to misuse this to transmute data without resorting to unsafe and there be dragons.
I do not plan to actively maintain it, but maybe I use this as a playground to explore reflection a bit more.
Footnotes:
-
I wrote the code shortly after the pull request landed and started the blog post the same day, but only finished it last night.
-
Rust allocates a bit more up front to avoid many small allocations when it expects to grow later anyway. ↩
-
I'm hand-waving things here a little bit. In theory Rust could choose a different layout here and swap around
xandyin memory. ↩ -
I did not implement over-allocation in my code. ↩
-
Hand-waving even more and skipping the generic part here. ↩
-
Initially I wrote "safe Rust", but the API currently exposed is unsafe to use. I should mark this
unsafe. ↩ -
Notably adt_const_params, which gives me the wonderful possibility of using
"x"in theitemsAPI, but breaks my blog's code highlighting. ↩