-
Understanding a fast, elegant RTree implementation
Kyle Barron
January 8, 2025
Source code.
Spatial indexes are at the core of geospatial software engineering. Given a spatial query (“What items are within this bounding box?” or “What are the closest items to this point?”), they allow for weeding out the vast majority of data, making a search massively faster than naively checking all items.
An RTree is one of the most common types of spatial indexes. An RTree indexes axis-aligned bounding boxes, and so can flexibly manage a variety of geospatial vector data, like points, lines, and polygons (by recording the bounding box represented by the minimum and maximum extents of a geometry’s coordinates).
But have you ever wondered how an RTree is actually implemented?
In this post we’ll dive into the implementation of Flatbush, a blazing-fast, memory-efficient RTree written in JavaScript by Volodymyr Agafonkin. While this implementation is written in JavaScript, it’s the algorithm that’s important here. Don’t get too caught up in the JavaScript; it should be easy to follow no matter what language you’re most familiar with.
I ported Flatbush to Rust with Python bindings, and this post is the result of my efforts to better understand and document how the algorithm works.
This post is a “literate” fork of the upstream Flatbush library. I’ve added comments to the code, and docco is used to generate the HTML file you’re reading now. Documentation and code are interspersed, letting you follow along with the code. No code modifications have been made in this fork; only comments have been added. The source for this fork is here.
All credit for this code included here goes to Volodymyr Agafonkin and other contributors to the Flatbush project, forked here under the ISC license. Any errors in explanation are mine alone.
Overview
The Flatbush algorithm generates a static, packed, ABI-stable RTree. Let’s break that down:
RTree: a spatial index for storing geospatial vector data that allows for fast spatial queries.
It’s a form of a “tree”. There’s one root node that has nodeSize children. Each of those nodes has its own nodeSize children, and so on. The tree structure allows you to avoid superfluous checks and quickly find matching candidates for your query. In particular, an RTree stores a bounding box for each geometry.
static: the index is immutable. All geometries need to be added to the index before any searches can be done. Geometries can’t be added to an existing index later.
packed: all nodes are at full capacity (except for the last node at each tree level). Because the tree is static, we don’t need to reserve space in each node for future additions. This improves memory efficiency.
ABI-stable: the entire tree is stored in a single underlying memory buffer, with a well-defined, stable memory layout. This enables zero-copy sharing between threads (Web Workers in the browser) or, as in my Rust port, between two languages like Rust and Python.
Why Flatbush?
There are several nice features about Flatbush:
- Speed: This is likely the fastest static spatial index in JavaScript. Ports of the algorithm are among the fastest spatial indexes in other languages, too.
- Single, contiguous underlying buffer: The index is contained in a single ArrayBuffer, which makes it easy to share across multiple threads or persist and use later. In the process of building the index, there are only two buffer allocations: one for the main data buffer and a second intermediate one for the hilbert values.
- Memory-efficiency: because the index is fully packed, it’s highly memory efficient.
- Bounded-memory: for any given number of items and node size, you can infer the total memory that will be used by the RTree.
- Elegant and concise: Under 300 lines of JavaScript code and in my opinion it’s quite elegant how the structure of the tree implicitly maintains the insertion index.
- Used as the basis for other projects, like the FlatGeobuf geospatial file format.
What’s not to like? Keep in mind there are a few restrictions:
- Only two-dimensional data. Because the algorithm uses powers of two, only two-dimensional data is supported. It can be used with higher-dimensional input as long as you only index two of the dimensions.
- The index is immutable. After creating the index, items can no longer be added or removed.
Buffer layout
All bounding box and index data is stored in a single, contiguous buffer, with three parts:
- Header: an 8-byte header containing the coordinate array type, node size, and number of items.
- Boxes: the bounding box data for each input geometry and intermediate tree nodes.
- Indices: An ordering of boxes to allow for traversing the tree and retrieving the original insertion index.
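Because the layout is fixed, the total buffer size can be worked out before any data is added. The following is a minimal sketch of that arithmetic, using the sizing rules from the constructor discussed later in this post. The helper name flatbushByteLength is hypothetical and not part of the Flatbush API.

```javascript
// Hypothetical helper (not part of Flatbush): total bytes of the backing buffer.
// bytesPerCoord is the BYTES_PER_ELEMENT of the coordinate array type.
function flatbushByteLength(numItems, nodeSize, bytesPerCoord) {
  // Count the nodes on every level, from the leaves up to the single root.
  let n = numItems;
  let numNodes = n;
  do {
    n = Math.ceil(n / nodeSize);
    numNodes += n;
  } while (n !== 1);

  // Indices use 2 bytes each when a Uint16Array suffices, otherwise 4 bytes.
  const bytesPerIndex = numNodes < 16384 ? 2 : 4;

  const headerBytes = 8;
  const boxBytes = numNodes * 4 * bytesPerCoord; // 4 coordinates per box
  const indexBytes = numNodes * bytesPerIndex; // 1 index entry per box
  return headerBytes + boxBytes + indexBytes;
}

const totalBytes = flatbushByteLength(
  10000,
  16,
  Float64Array.BYTES_PER_ELEMENT
);
console.log(totalBytes); // 362754
```

For 10,000 items with a node size of 16 and Float64Array coordinates, that is 8 header bytes, 341,408 bytes of boxes, and 21,338 bytes of indices.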
Diving into the code
import FlatQueue from "flatqueue";
-
Flatbush supports a variety of TypedArray types to store box coordinate data. Flatbush uses Float64Array by default.
const ARRAY_TYPES = [
  Int8Array,
  Uint8Array,
  Uint8ClampedArray,
  Int16Array,
  Uint16Array,
  Int32Array,
  Uint32Array,
  Float32Array,
  Float64Array,
];
-
The Flatbush serialized format version (the VERSION constant checked in Flatbush.from) is bumped whenever the binary layout of the index changes.
-
Flatbush
The Flatbush class is the only export from the Flatbush library. It contains functions to create and query the spatial index.
export default class Flatbush {
-
Flatbush.from
One of Flatbush’s goals is to support zero-copy usage, meaning that you can take an ArrayBuffer backing a Flatbush index and transfer it between threads at virtually zero cost.
The from static method on the class reconstructs a Flatbush instance from a raw ArrayBuffer.
  static from(data) {
    if (!data || data.byteLength === undefined || data.buffer) {
      throw new Error(
        "Data must be an instance of ArrayBuffer or SharedArrayBuffer."
      );
    }
-
The first 8 bytes contain a header:
- byte 1: a “magic byte” set to 0xfb.
- byte 2: four bits for the serialized format version and four bits for the array type used for storing coordinates.
- bytes 3-4: a uint16-encoded number representing the size of each node.
- bytes 5-8: a uint32-encoded number representing the total number of items in the index.
We read each of these bytes from the provided data buffer, then pass the relevant parameters to the class constructor. Because the data argument (passed last) is not undefined, the constructor will not create a new underlying buffer, but rather reuse the existing buffer.
    const [magic, versionAndType] = new Uint8Array(data, 0, 2);
    if (magic !== 0xfb) {
      throw new Error("Data does not appear to be in a Flatbush format.");
    }
    const version = versionAndType >> 4;
    if (version !== VERSION) {
      throw new Error(`Got v${version} data when expected v${VERSION}.`);
    }
    const ArrayType = ARRAY_TYPES[versionAndType & 0x0f];
    if (!ArrayType) {
      throw new Error("Unrecognized array type.");
    }
    const [nodeSize] = new Uint16Array(data, 2, 1);
    const [numItems] = new Uint32Array(data, 4, 1);
    return new Flatbush(numItems, nodeSize, ArrayType, undefined, data);
  }
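To make the header layout concrete, here is a minimal sketch that writes the 8 header bytes and reads them back with the same kind of typed-array views. The helper names writeHeader and readHeader are hypothetical (they are not part of Flatbush), and the version value 3 in the usage is only an example value.

```javascript
// Hypothetical helpers, not part of the Flatbush API.
function writeHeader(buffer, version, arrayTypeIndex, nodeSize, numItems) {
  // Byte 1: magic byte. Byte 2: version in the high 4 bits, array type in the low 4.
  new Uint8Array(buffer, 0, 2).set([0xfb, (version << 4) + arrayTypeIndex]);
  // Bytes 3-4: node size as a uint16.
  new Uint16Array(buffer, 2, 1)[0] = nodeSize;
  // Bytes 5-8: number of items as a uint32.
  new Uint32Array(buffer, 4, 1)[0] = numItems;
}

function readHeader(buffer) {
  const [magic, versionAndType] = new Uint8Array(buffer, 0, 2);
  if (magic !== 0xfb) throw new Error("Not a Flatbush-style header.");
  return {
    version: versionAndType >> 4, // high 4 bits
    arrayTypeIndex: versionAndType & 0x0f, // low 4 bits
    nodeSize: new Uint16Array(buffer, 2, 1)[0],
    numItems: new Uint32Array(buffer, 4, 1)[0],
  };
}

const headerBuffer = new ArrayBuffer(8);
writeHeader(headerBuffer, 3, 8, 16, 10000);
const header = readHeader(headerBuffer);
console.log(header);
```

Note that typed-array views read and write multi-byte values in the platform's byte order, so this round trip assumes the writer and the reader share the same endianness.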
-
Constructor
The Flatbush constructor initializes the memory space (ArrayBuffer) for a Flatbush tree given the number of items the tree will contain and the number of elements per tree node.
  constructor(
    numItems,
    nodeSize = 16,
    ArrayType = Float64Array,
    ArrayBufferType = ArrayBuffer,
    data
  ) {
    if (numItems === undefined)
      throw new Error("Missing required argument: numItems.");
    if (isNaN(numItems) || numItems <= 0)
      throw new Error(`Unexpected numItems value: ${numItems}.`);
    this.numItems = +numItems;
    this.nodeSize = Math.min(Math.max(+nodeSize, 2), 65535);
-
This do-while loop calculates the total number of nodes at each level of the R-tree (and thus also the total number of nodes). This will be used to allocate space for each level of the tree.
The tree is laid out in memory from bottom (leaves) to top (root).
_levelBounds is an array that stores the offset within the coordinates array where each level ends. The first element of _levelBounds is n * 4, meaning that the slice of the coordinates array from 0 to n * 4 contains the bottom (leaves) of the tree.
Then the slice of the coordinates array from _levelBounds[0] to _levelBounds[1] represents the boxes of the first level of the tree, that is, the direct parent nodes of the leaves. And so on, _levelBounds[1] to _levelBounds[2] represents the nodes at level 2, the grandparent nodes of the leaf nodes.
So for example if numItems is 10,000 and nodeSize is 16, _levelBounds will be:
[40000, 42500, 42660, 42672, 42676]
That is:
- The first 40,000 elements (10,000 nodes) are coordinates of the leaf nodes (4 coordinates per node).
- 2,500 coordinates and 625 nodes one level higher
- 160 coordinates and 40 nodes two levels higher
- 12 coordinates and 3 nodes three levels higher
- 1 root node four levels higher, at the top of the tree, with a single 4-coordinate box.
Keep in mind that because this is a packed tree, every node within a single level will be completely full (contain exactly nodeSize elements) except for the last node.
numNodes ends up as the total number of nodes in the tree, including all leaves.
    let n = numItems;
    let numNodes = n;
    this._levelBounds = [n * 4];
    do {
      n = Math.ceil(n / this.nodeSize);
      numNodes += n;
      this._levelBounds.push(numNodes * 4);
    } while (n !== 1);
-
Flatbush doesn’t manage references to objects directly. Rather, it operates in terms of the insertion index. Flatbush only maintains these insertion indices.
IndexArrayType will be used to create the indices array, to store the ordering of the input boxes. If possible, a Uint16Array will be used to save space. If the values would overflow a Uint16Array, a Uint32Array is used. A Uint16Array element can represent 2^16 = 65,536 distinct values (0 through 65,535). Since the values stored for parent nodes are offsets into the coordinates array, where each node occupies four values, this gets divided by 4 and 65,536 / 4 = 16,384. This is why the check here is for 16,384.
    this.ArrayType = ArrayType;
    this.IndexArrayType = numNodes < 16384 ? Uint16Array : Uint32Array;
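A short sketch of why the threshold matters: values that do not fit in 16 bits silently wrap around when written to a Uint16Array, so the wider type has to be chosen up front. The helper name chooseIndexArrayType is hypothetical and simply repeats the constructor's check.

```javascript
// Hypothetical helper; repeats the check used in the constructor.
function chooseIndexArrayType(numNodes) {
  return numNodes < 16384 ? Uint16Array : Uint32Array;
}

const smallTree = chooseIndexArrayType(16383);
const largeTree = chooseIndexArrayType(16384);

// A Uint16Array stores values modulo 2^16, so 65536 wraps around to 0,
// while a Uint32Array stores it unchanged.
const narrow = new Uint16Array([65532, 65536]);
const wide = new Uint32Array([65532, 65536]);

console.log(smallTree.name, largeTree.name); // Uint16Array Uint32Array
console.log(narrow[0], narrow[1]); // 65532 0
console.log(wide[0], wide[1]); // 65532 65536
```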
-
In order to accurately interpret the index from raw bytes, we need to record in the header which coordinate array type we’re using. arrayTypeIndex is the position of that type within ARRAY_TYPES.
const arrayTypeIndex = ARRAY_TYPES.indexOf(this.ArrayType);
-
The number of bytes needed to store all box coordinate data for all nodes.
const nodesByteSize = numNodes * 4 * this.ArrayType.BYTES_PER_ELEMENT; if (arrayTypeIndex < 0) { throw new Error(`Unexpected typed array class: ${ArrayType}.`); }
-
This if statement switches on whether the data argument was passed in (i.e. this constructor is called by Flatbush.from). If data exists, this will create the _boxes and _indices arrays as views on the existing ArrayBuffer without allocating any new memory.
    if (data && data.byteLength !== undefined && !data.buffer) {
      this.data = data;
-
this._boxes is created as a view on this.data starting after the header (8 bytes) and with numNodes * 4 elements. this._indices is created as a view on this.data starting after the end of this._boxes and containing numNodes elements.
      this._boxes = new this.ArrayType(this.data, 8, numNodes * 4);
      this._indices = new this.IndexArrayType(
        this.data,
        8 + nodesByteSize,
        numNodes
      );
-
The coordinate data in the _boxes array is stored from the leaves up. So the last box is the single node that contains all data. That last box occupies the final four values of _boxes, ending at index numNodes * 4.
This sets the total bounds on the Flatbush instance to the extent of that box.
We also set this._pos as the total number of coordinates. this._pos is a pointer into the this._boxes array, used while adding new boxes to the instance. This also allows for inferring whether the Flatbush instance has been “finished” (sorted) or not.
If the instance has already been sorted, adding more data is not allowed. Conversely, if the instance has not yet been sorted, query methods may not be called.
      this._pos = numNodes * 4;
      this.minX = this._boxes[this._pos - 4];
      this.minY = this._boxes[this._pos - 3];
      this.maxX = this._boxes[this._pos - 2];
      this.maxY = this._boxes[this._pos - 1];
-
In the else case, a data buffer was not provided, so we need to allocate data for the backing buffer. this.data is a new ArrayBuffer with space for the header plus all box data plus all index data. Then this._boxes is created as a view on this.data starting after the header and with numNodes * 4 elements. this._indices is created as a view on this.data starting after the end of this._boxes.
    } else {
      this.data = new ArrayBufferType(
        8 + nodesByteSize + numNodes * this.IndexArrayType.BYTES_PER_ELEMENT
      );
      this._boxes = new this.ArrayType(this.data, 8, numNodes * 4);
      this._indices = new this.IndexArrayType(
        this.data,
        8 + nodesByteSize,
        numNodes
      );
-
We set this._pos to 0. This means that no boxes have yet been added to the index, and it tells any query methods to throw until finish has been called.
      this._pos = 0;
-
The RTree needs to maintain its total bounds (the global bounding box of all values) in order to set the bounds for the hilbert space.
We initialize these bounds to Infinity values that will be corrected when adding data. The minimum x/y of any box will be less than positive infinity and the maximum x/y of any box will be greater than negative infinity. The add() call will adjust these bounds if necessary.
      this.minX = Infinity;
      this.minY = Infinity;
      this.maxX = -Infinity;
      this.maxY = -Infinity;
-
Next we set the header values with metadata from the instance.
The first byte, 0xfb, is a “magic byte”, used as basic validation that this buffer is indeed a Flatbush index.
Since arrayTypeIndex is known to have only 9 values, it doesn’t need to take up a full byte. Here it shares a single byte with the Flatbush format version.
      new Uint8Array(this.data, 0, 2).set([
        0xfb,
        (VERSION << 4) + arrayTypeIndex,
      ]);
      new Uint16Array(this.data, 2, 1)[0] = nodeSize;
      new Uint32Array(this.data, 4, 1)[0] = numItems;
    }
-
We initialize a priority queue used for k-nearest-neighbors queries in the neighbors method.
    this._queue = new FlatQueue();
  }
-
Flatbush.add
Add a given rectangle to the index.
add(minX, minY, maxX, maxY) {
-
We need to know the insertion index of the box presently being added.
In the constructor, this._pos is initialized to 0 and in each call to add(), this._pos is incremented by 4. Dividing this._pos by 4 retrieves the 0-based index of the box about to be inserted.
This bit shift:
this._pos >> 2
is equivalent to
this._pos / 4
but the bit shift is faster because it informs the JS engine that we expect the output to be an integer.
Because there are 4 values for each item, using _pos is an easy way to infer the insertion index without having to maintain a separate counter.
    const index = this._pos >> 2;
    const boxes = this._boxes;
-
We set the value of this._indices at the current index’s position to the value of the current index. So this._indices stores the insertion index of each box.
Later, inside the finish method, we’ll sort the boxes by their hilbert value and jointly reorder the values in _indices, ensuring that we keep the indices and boxes in sync.
This means that for any box representing a leaf node at position i (where i points to a box, not a coordinate inside a box), this._indices[i] retrieves the original insertion-order index of that box.
    this._indices[index] = index;
-
We set the coordinates of this box into the boxes array. Note that this._pos++ is evaluated after the box index is set. So
boxes[this._pos++] = minX;
is equivalent to
boxes[this._pos] = minX; this._pos += 1;
    boxes[this._pos++] = minX;
    boxes[this._pos++] = minY;
    boxes[this._pos++] = maxX;
    boxes[this._pos++] = maxY;
-
Update the total bounds of this instance if this rectangle extends beyond the existing bounds.
    if (minX < this.minX) this.minX = minX;
    if (minY < this.minY) this.minY = minY;
    if (maxX > this.maxX) this.maxX = maxX;
    if (maxY > this.maxY) this.maxY = maxY;
    return index;
  }
-
Flatbush.finish
A spatial index needs to sort input data so that elements can be found quickly later.
The simplest way of sorting values is on a single dimension, where if a is less than b, a should be placed before b. But that presents a problem because we have two dimensions, not one.
One way to solve this is to map values from two-dimensional space into a one-dimensional range. A common way to perform this mapping is by using space-filling curves. In our case, we’ll use a hilbert curve, a specific type of space-filling curve that’s useful with geospatial data because it generally preserves locality.
First six iterations of the Hilbert curve, from Wikipedia, CC BY-SA.
Note that using a space-filling curve to map values into one dimension isn’t the only way of sorting multi-dimensional data. There are other algorithms, like sort-tile-recursive (STR) that first sort into groups on one dimension, then the other, recursively.
While this canonical Flatbush implementation chooses to sort based on hilbert value, that’s actually not necessary to maintain ABI-stability: any two-dimensional sort will work. My Rust port defines an extensible trait for sorting and provides both hilbert and STR sorting implementations.
-
Recall that in the add method, we increment this._pos by 1 for each coordinate of each box. Here we validate that we’ve added the same number of boxes as we provisioned in the constructor. Remember that >> 2 is equivalent to / 4.
    if (this._pos >> 2 !== this.numItems) {
      throw new Error(
        `Added ${this._pos >> 2} items when expected ${this.numItems}.`
      );
    }
    const boxes = this._boxes;
-
If the total number of items in the tree is less than or equal to the node size, that means we’ll only have a single non-leaf node in the tree. In that case, we don’t even need to sort by hilbert value. We can just assign the total bounds of the tree to the following box and return.
    if (this.numItems <= this.nodeSize) {
      boxes[this._pos++] = this.minX;
      boxes[this._pos++] = this.minY;
      boxes[this._pos++] = this.maxX;
      boxes[this._pos++] = this.maxY;
      return;
    }
-
Using the total bounds of the tree, we compute the height and width of the hilbert space and instantiate space for the hilbert values.
const width = this.maxX - this.minX || 1; const height = this.maxY - this.minY || 1; const hilbertValues = new Uint32Array(this.numItems); const hilbertMax = (1 << 16) - 1;
-
Map box centers into Hilbert coordinate space and calculate Hilbert values using the hilbert function defined below.
This for loop iterates over every box. At the beginning of each loop iteration, pos is equal to i * 4.
    for (let i = 0, pos = 0; i < this.numItems; i++) {
      const minX = boxes[pos++];
      const minY = boxes[pos++];
      const maxX = boxes[pos++];
      const maxY = boxes[pos++];
      const x = Math.floor(
        (hilbertMax * ((minX + maxX) / 2 - this.minX)) / width
      );
      const y = Math.floor(
        (hilbertMax * ((minY + maxY) / 2 - this.minY)) / height
      );
      hilbertValues[i] = hilbert(x, y);
    }
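The mapping from a box to Hilbert grid coordinates can be tried in isolation. In this minimal sketch the helper name toHilbertGrid and the object shapes for box and bounds are assumptions made for illustration. The arithmetic is the same as in the loop above.

```javascript
// Hypothetical helper: maps the center of a box onto the 16-bit Hilbert grid.
// box and bounds are plain objects with minX, minY, maxX, maxY.
function toHilbertGrid(box, bounds) {
  const hilbertMax = (1 << 16) - 1; // 65535, the largest grid coordinate
  // Fall back to 1 so a zero-width or zero-height extent does not divide by 0.
  const width = bounds.maxX - bounds.minX || 1;
  const height = bounds.maxY - bounds.minY || 1;
  const x = Math.floor(
    (hilbertMax * ((box.minX + box.maxX) / 2 - bounds.minX)) / width
  );
  const y = Math.floor(
    (hilbertMax * ((box.minY + box.maxY) / 2 - bounds.minY)) / height
  );
  return [x, y];
}

const totalBounds = { minX: 0, minY: 0, maxX: 100, maxY: 200 };

const lowCorner = toHilbertGrid(
  { minX: 0, minY: 0, maxX: 0, maxY: 0 },
  totalBounds
);
const middle = toHilbertGrid(
  { minX: 40, minY: 90, maxX: 60, maxY: 110 },
  totalBounds
);
const highCorner = toHilbertGrid(
  { minX: 100, minY: 200, maxX: 100, maxY: 200 },
  totalBounds
);

console.log(lowCorner); // [0, 0]
console.log(middle); // [32767, 32767]
console.log(highCorner); // [65535, 65535]
```

Only the center of each box is used, so two boxes of very different sizes that share a center receive the same Hilbert value.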
-
Up until this point, the values in boxes and in this._indices are still in insertion order. We now jointly sort the boxes and indices according to their hilbert values.
    sort(
      hilbertValues,
      boxes,
      this._indices,
      0,
      this.numItems - 1,
      this.nodeSize
    );
-
Now the leaves of the tree have been sorted, but we still need to construct the rest of the tree.
For each level of the tree, we need to generate parent nodes that contain nodeSize child nodes. We do this starting from the leaves, working from the bottom up.
Here the iteration variable, i, refers to the positional tree level, which is also an index into the this._levelBounds array.
- When i == 0, we’re iterating over the original geometry boxes.
- When i == 1, we’re iterating over the parent nodes one level up that we previously generated from the first loop iteration.
- And so on, i represents the number of parents from the original geometry boxes.
As elsewhere, pos is a local variable that points to a coordinate within a box at the given level i of the tree. Note this syntax: it’s unusual for two variables to be defined in the for loop binding: here both i and pos are only defined within the scope of this loop. But only i is incremented by the loop. pos is incremented separately within the body of the loop (four times for each box).
    for (let i = 0, pos = 0; i < this._levelBounds.length - 1; i++) {
-
Next, we want to scan through all nodes at this level of the tree, generating a parent node for each group of consecutive nodeSize boxes.
Here, end is the index of the first coordinate at the next level above the current level. So the range up to end includes all coordinates at the current tree level.
We then scan over all of these box coordinates in this while loop.
      const end = this._levelBounds[i];
      while (pos < end) {
-
We record the pos pointing to the first element of the first box in each group of consecutive nodeSize boxes, in order to later record it in the indices array.
        const nodeIndex = pos;
-
Calculate the bounding box for the new parent node.
We initialize the bounding box to the first box and then expand the box while looping over the rest of the elements that together are the children of this parent node we’re creating.
Note the j = 1 in the loop; this is a small optimization because we initialize the node* variables to the first element, rather than initializing with positive and negative infinity.
Also note that in the loop we constrain the iteration variable j to be less than the node size and also require pos < end. The former ensures we have only a maximum of nodeSize elements informing the parent node’s boundary. The latter ensures that we don’t accidentally overflow the current tree level.
        let nodeMinX = boxes[pos++];
        let nodeMinY = boxes[pos++];
        let nodeMaxX = boxes[pos++];
        let nodeMaxY = boxes[pos++];
        for (let j = 1; j < this.nodeSize && pos < end; j++) {
          nodeMinX = Math.min(nodeMinX, boxes[pos++]);
          nodeMinY = Math.min(nodeMinY, boxes[pos++]);
          nodeMaxX = Math.max(nodeMaxX, boxes[pos++]);
          nodeMaxY = Math.max(nodeMaxY, boxes[pos++]);
        }
-
Now that we know the extent of the parent node, we can add the new node’s information to the tree data.
Recall that nodeIndex, stored above, points to the first element of the first box in each group of consecutive nodeSize nodes.
The nodeIndex is always a multiple of 4 because there are 4 coordinates in each 2D box. this._pos is likewise always a multiple of 4, so this._pos >> 2 gives the slot in this._indices that corresponds to the new parent box. Again, we use >> 2 instead of / 4 as a performance optimization.
At every level of the tree, including the base (leaf) level, nodeIndex represents the offset within boxes of the first box in this group.
In contrast, for a leaf box at position i, this._indices[i] holds the original insertion index, as set in add and reordered by sort.
These two facts allow us to traverse the tree in a search query, as we’ll see below in Flatbush.search.
Note that we’re setting the parent node into this._indices and boxes according to this._pos, which is a different variable than the local pos variable that’s incremented in this loop. this._pos is a global counter that keeps track of the new nodes we’re inserting into the index. In contrast, pos is a local counter for aggregating the information for the parent node.
Impressively, these loops do all the hard work of constructing the tree! That’s it! The structure of the tree and the coordinates of all the parent nodes are now fully contained within this._indices and boxes, which are both views on this.data!
        this._indices[this._pos >> 2] = nodeIndex;
        boxes[this._pos++] = nodeMinX;
        boxes[this._pos++] = nodeMinY;
        boxes[this._pos++] = nodeMaxX;
        boxes[this._pos++] = nodeMaxY;
      }
    }
  }
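The bottom-up packing can also be illustrated without the flat buffer. The following is a simplified sketch, not the Flatbush implementation: it uses nested arrays instead of a single typed array, and the name packLevels and the returned object shape are assumptions made for illustration. It expects the leaf boxes to be already sorted and non-empty.

```javascript
// Hypothetical, simplified sketch of bottom-up packing.
// Each box is [minX, minY, maxX, maxY]. Returns the levels from leaves to root.
// For parent levels, firstChild[k] is the position (in the level below) of the
// first child of parent k, playing the role of nodeIndex in the real code.
function packLevels(leafBoxes, nodeSize) {
  if (leafBoxes.length === 0) throw new Error("At least one box is required.");
  const levels = [{ boxes: leafBoxes, firstChild: null }];
  let current = leafBoxes;
  do {
    const boxes = [];
    const firstChild = [];
    for (let start = 0; start < current.length; start += nodeSize) {
      // Start from the first child, then grow the box over the rest of the group.
      let [minX, minY, maxX, maxY] = current[start];
      const stop = Math.min(start + nodeSize, current.length);
      for (let j = start + 1; j < stop; j++) {
        minX = Math.min(minX, current[j][0]);
        minY = Math.min(minY, current[j][1]);
        maxX = Math.max(maxX, current[j][2]);
        maxY = Math.max(maxY, current[j][3]);
      }
      boxes.push([minX, minY, maxX, maxY]);
      firstChild.push(start);
    }
    levels.push({ boxes, firstChild });
    current = boxes;
  } while (current.length !== 1);
  return levels;
}

const packed = packLevels(
  [
    [0, 0, 1, 1],
    [1, 1, 2, 2],
    [2, 2, 3, 3],
    [3, 3, 4, 4],
    [4, 4, 5, 5],
  ],
  2
);

console.log(packed.map((level) => level.boxes.length)); // [5, 3, 2, 1]
console.log(packed[3].boxes[0]); // [0, 0, 5, 5]
```

With five leaves and a node size of 2, the last parent on the first level covers a single child, which matches the rule that only the last node of a level may be partially filled.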
-
Flatbush.search
The primary API for searching an index by a bounding box query.
search(minX, minY, maxX, maxY, filterFn) {
-
A simple check to ensure that this index has been finished/sorted.
if (this._pos !== this._boxes.length) { throw new Error("Data not yet indexed - call index.finish()."); }
-
nodeIndex is initialized to the root node, the parent of all other nodes. Since the tree is laid out from bottom to top, the root node is the last node in this._boxes. We subtract 4 so that nodeIndex points to the first coordinate of the box.
Note that nodeIndex will always point to the first box within a group of (usually nodeSize) boxes.
queue holds integers that represent the position within this._boxes of the first box of each intermediate node that still needs to be searched. That is, queue represents nodes whose parents intersected the search predicate.
results holds integers that represent the insertion indexes that match the search predicate.
    let nodeIndex = this._boxes.length - 4;
    const queue = [];
    const results = [];
-
Now we have our search loop.
while (nodeIndex !== undefined)
will be true as long as there are still elements remaining in queue (note that the last line of the while loop is nodeIndex = queue.pop();).
    while (nodeIndex !== undefined) {
-
Find the end index of the current node.
Most of the time, the node contains nodeSize elements. At the end of each level, the node will contain fewer elements. In the first case, the end of the node will be the current index plus 4 coordinates for each box. We check if we’re in the second case by checking the value of this._levelBounds for the current level of the tree.
      const end = Math.min(
        nodeIndex + this.nodeSize * 4,
        upperBound(nodeIndex, this._levelBounds)
      );
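To see both cases, here is a small sketch that evaluates the same expression for a full node and for the last, partially filled node of a level. The helper name nodeEnd is hypothetical. upperBound is copied from the utility functions at the end of this post, and the level bounds are the ones from the earlier example with 10,000 items and a node size of 16.

```javascript
// Copied from the utility functions: first value in arr greater than value.
function upperBound(value, arr) {
  let i = 0;
  let j = arr.length - 1;
  while (i < j) {
    const m = (i + j) >> 1;
    if (arr[m] > value) {
      j = m;
    } else {
      i = m + 1;
    }
  }
  return arr[i];
}

// Hypothetical helper wrapping the expression used in the search loop.
function nodeEnd(nodeIndex, nodeSize, levelBounds) {
  return Math.min(nodeIndex + nodeSize * 4, upperBound(nodeIndex, levelBounds));
}

const exampleBounds = [40000, 42500, 42660, 42672, 42676];

// First group of boxes on level 1: a full node of 16 boxes (64 coordinates).
const fullNodeEnd = nodeEnd(40000, 16, exampleBounds);
// Last group on level 1: boxes 624..624 of 625, so only one box remains.
const lastNodeEnd = nodeEnd(42496, 16, exampleBounds);

console.log(fullNodeEnd); // 40064
console.log(lastNodeEnd); // 42500
```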
-
Then we search through each box of the current node, checking whether each matches our predicate. The loop ranges from the first box of the node (nodeIndex) up to, but not including, end. We increment pos by 4 for each loop step because there are 4 coordinates per box.
      for (let pos = nodeIndex; pos < end; pos += 4) {
-
Check if the current box does not intersect with the query box. If the current box does not intersect, then we can continue on to the next element of this node.
If we reach past these four lines, then we know the current box does intersect with the query box.
        if (maxX < this._boxes[pos]) continue;
        if (maxY < this._boxes[pos + 1]) continue;
        if (minX > this._boxes[pos + 2]) continue;
        if (minY > this._boxes[pos + 3]) continue;
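The four checks amount to a standard axis-aligned bounding box intersection test. As a minimal standalone sketch (the function name intersects and the object shape are assumptions for illustration):

```javascript
// Hypothetical helper: true when two axis-aligned boxes overlap or touch.
// Both arguments are plain objects with minX, minY, maxX, maxY.
function intersects(query, box) {
  if (query.maxX < box.minX) return false; // query is entirely left of box
  if (query.maxY < box.minY) return false; // query is entirely below box
  if (query.minX > box.maxX) return false; // query is entirely right of box
  if (query.minY > box.maxY) return false; // query is entirely above box
  return true;
}

const queryBox = { minX: 0, minY: 0, maxX: 1, maxY: 1 };

const overlapping = intersects(queryBox, { minX: 0.5, minY: 0.5, maxX: 2, maxY: 2 });
const touching = intersects(queryBox, { minX: 1, minY: 1, maxX: 2, maxY: 2 });
const disjoint = intersects(queryBox, { minX: 1.5, minY: 0, maxX: 2, maxY: 1 });

console.log(overlapping, touching, disjoint); // true true false
```

Because the comparisons are strict, boxes that merely share an edge or a corner still count as intersecting.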
-
pos is a pointer to the first coordinate of the given box. Recall in Flatbush.finish that we set:
this._indices[this._pos >> 2] = nodeIndex;
This stored a mapping from parent to child node, where this._pos >> 2 was the parent node and nodeIndex was the child node. Now is the time when we want to use this mapping.
- If the current box is not a leaf, index is the pos of the first box of the child node. This child is a node that we should evaluate later, so we add it to the queue array.
- If the current box is a leaf, then index is the original insertion index, and we add it to the results array.
Again, pos >> 2 is a faster way of expressing pos / 4, where we can inform the JS engine that the output will be an integer.
I believe | 0 is just a JS engine optimization that doesn’t affect the output of the operation?
Then we can add the index to either the intermediate queue or results arrays as necessary.
        const index = this._indices[pos >> 2] | 0;
        if (nodeIndex >= this.numItems * 4) {
          queue.push(index);
        } else if (filterFn === undefined || filterFn(index)) {
          results.push(index);
        }
      }
-
Set the nodeIndex to the next item in the queue so that we continue the while loop.
      nodeIndex = queue.pop();
    }
    return results;
  }
-
Flatbush.neighbors
The primary API for searching an index by nearest neighbors to a point.
This has significant overlap with Flatbush.search, and so we’ll only touch on the differences.
  neighbors(x, y, maxResults = Infinity, maxDistance = Infinity, filterFn) {
    if (this._pos !== this._boxes.length) {
      throw new Error("Data not yet indexed - call index.finish().");
    }
-
Instead of using an array as a queue, here we use a priority queue. This is a data structure that maintains the queue in sorted order, and which allows us to ensure that the first element of the queue is indeed the closest to the provided point.
    let nodeIndex = this._boxes.length - 4;
    const q = this._queue;
    const results = [];
    const maxDistSquared = maxDistance * maxDistance;
    outer: while (nodeIndex !== undefined) {
      const end = Math.min(
        nodeIndex + this.nodeSize * 4,
        upperBound(nodeIndex, this._levelBounds)
      );
-
Add child nodes to the queue.
dx and dy are computed as the one-dimensional change in x and y needed to reach one of the sides of the box from the query point. Then dist is the squared distance from the query point to the nearest point of the box (zero if the query point lies inside the box).
If this distance does not exceed the provided maximum distance, we add it to the queue. Since we add both intermediate nodes and results to the same queue, we need a way to distinguish the two. When the index represents an intermediate node, we multiply by two (i.e. << 1) so that we have an even id. When the index represents a leaf item, we multiply by two and then add one (i.e. (<< 1) + 1), so that we have an odd id.
      for (let pos = nodeIndex; pos < end; pos += 4) {
        const index = this._indices[pos >> 2] | 0;
        const dx = axisDist(x, this._boxes[pos], this._boxes[pos + 2]);
        const dy = axisDist(y, this._boxes[pos + 1], this._boxes[pos + 3]);
        const dist = dx * dx + dy * dy;
        if (dist > maxDistSquared) continue;
        if (nodeIndex >= this.numItems * 4) {
          q.push(index << 1, dist);
        } else if (filterFn === undefined || filterFn(index)) {
          q.push((index << 1) + 1, dist);
        }
      }
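The distance computation can be checked by hand with a few points. In this minimal sketch, axisDist is copied from the utility functions at the end of this post, while squaredDistToBox is a hypothetical wrapper that takes a box as a [minX, minY, maxX, maxY] array.

```javascript
// Copied from the utility functions: 1D distance from a value to a range.
function axisDist(k, min, max) {
  return k < min ? min - k : k <= max ? 0 : k - max;
}

// Hypothetical wrapper: squared distance from (x, y) to the nearest point of a
// box given as [minX, minY, maxX, maxY].
function squaredDistToBox(x, y, box) {
  const dx = axisDist(x, box[0], box[2]);
  const dy = axisDist(y, box[1], box[3]);
  return dx * dx + dy * dy;
}

const exampleBox = [3, 4, 5, 6];

const fromOrigin = squaredDistToBox(0, 0, exampleBox); // nearest point is the corner (3, 4)
const fromInside = squaredDistToBox(4, 5, exampleBox); // the point lies inside the box
const fromRight = squaredDistToBox(10, 5, exampleBox); // nearest point is on the right edge

console.log(fromOrigin, fromInside, fromRight); // 25 0 25
```

Working with squared distances avoids a square root per box, which is why the maximum distance is squared once up front.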
-
Now that we’ve added all child nodes to the queue, we can move queue items to the results array and/or break out of the outer loop completely.
Since this queue is a priority queue, we can be assured that the first item of the queue is the closest to the query point. The nearest corner of the box of that item is closer than any other node or result.
While the queue is non-empty and the first (closest) item in the queue is a leaf item (odd), if that item’s distance is more than the maximum query distance, we can break out of the outer loop, since there cannot be any more nodes that are closer than that distance. If the item’s distance is less than the maximum query distance, we add it to the results array because it must be the next closest result.
If the first (closest) item of the queue is an intermediate node (not odd), then we need to evaluate the items of that node before knowing which one is the next closest. In this case, the while condition is false, and we set the nodeIndex to that intermediate node for the next iteration of the outer while loop.
      while (q.length && q.peek() & 1) {
        const dist = q.peekValue();
        if (dist > maxDistSquared) break outer;
        results.push(q.pop() >> 1);
        if (results.length === maxResults) break outer;
      }
      nodeIndex = q.length ? q.pop() >> 1 : undefined;
    }
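The even/odd tagging used for queue ids can be shown in isolation. The helper names below are hypothetical. They only wrap the shift and mask operations that appear in the neighbors code.

```javascript
// Hypothetical helpers wrapping the bit operations used for queue ids.
const encodeNode = (index) => index << 1; // even id: intermediate node
const encodeLeaf = (index) => (index << 1) + 1; // odd id: leaf item
const isLeafId = (id) => (id & 1) === 1; // lowest bit tells the two apart
const decodeId = (id) => id >> 1; // drop the tag bit to recover the index

const nodeId = encodeNode(42496);
const leafId = encodeLeaf(7);

console.log(nodeId, isLeafId(nodeId), decodeId(nodeId)); // 84992 false 42496
console.log(leafId, isLeafId(leafId), decodeId(leafId)); // 15 true 7
```

Packing the tag into the lowest bit lets a single queue of plain numbers carry both kinds of entries without allocating wrapper objects.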
-
We clear the queue because this queue is reused for all queries in this index.
q.clear(); return results; } }
-
The remaining code is “just” utility functions.
I won’t document these in detail because they tend to be self explanatory or easily found online and this post is focused more on the RTree implementation itself.
axisDist: 1D distance from a value to a range.
function axisDist(k, min, max) {
  return k < min ? min - k : k <= max ? 0 : k - max;
}
-
upperBound: Binary search for the first value in the array greater than the given value.
function upperBound(value, arr) {
  let i = 0;
  let j = arr.length - 1;
  while (i < j) {
    const m = (i + j) >> 1;
    if (arr[m] > value) {
      j = m;
    } else {
      i = m + 1;
    }
  }
  return arr[i];
}
-
sort: Custom quicksort that partially sorts bbox data alongside the hilbert values.
function sort(values, boxes, indices, left, right, nodeSize) {
  if (Math.floor(left / nodeSize) >= Math.floor(right / nodeSize)) return;
  const pivot = values[(left + right) >> 1];
  let i = left - 1;
  let j = right + 1;
  while (true) {
    do i++; while (values[i] < pivot);
    do j--; while (values[j] > pivot);
    if (i >= j) break;
    swap(values, boxes, indices, i, j);
  }
  sort(values, boxes, indices, left, j, nodeSize);
  sort(values, boxes, indices, j + 1, right, nodeSize);
}
-
swap: Swap two values and two corresponding boxes.
function swap(values, boxes, indices, i, j) {
  const temp = values[i];
  values[i] = values[j];
  values[j] = temp;
  const k = 4 * i;
  const m = 4 * j;
  const a = boxes[k];
  const b = boxes[k + 1];
  const c = boxes[k + 2];
  const d = boxes[k + 3];
  boxes[k] = boxes[m];
  boxes[k + 1] = boxes[m + 1];
  boxes[k + 2] = boxes[m + 2];
  boxes[k + 3] = boxes[m + 3];
  boxes[m] = a;
  boxes[m + 1] = b;
  boxes[m + 2] = c;
  boxes[m + 3] = d;
  const e = indices[i];
  indices[i] = indices[j];
  indices[j] = e;
}
-
hilbert: compute hilbert codes.
This is the function that takes a position in 2D space, x and y, and returns the hilbert value for that position.
Umm yeah sorry I can’t say anything else about this… it’s black magic.
Refer to the C++ source and the original blog post for any hope of understanding what’s going on here!
function hilbert(x, y) { let a = x ^ y; let b = 0xffff ^ a; let c = 0xffff ^ (x | y); let d = x & (y ^ 0xffff); let A = a | (b >> 1); let B = (a >> 1) ^ a; let C = (c >> 1) ^ (b & (d >> 1)) ^ c; let D = (a & (c >> 1)) ^ (d >> 1) ^ d; a = A; b = B; c = C; d = D; A = (a & (a >> 2)) ^ (b & (b >> 2)); B = (a & (b >> 2)) ^ (b & ((a ^ b) >> 2)); C ^= (a & (c >> 2)) ^ (b & (d >> 2)); D ^= (b & (c >> 2)) ^ ((a ^ b) & (d >> 2)); a = A; b = B; c = C; d = D; A = (a & (a >> 4)) ^ (b & (b >> 4)); B = (a & (b >> 4)) ^ (b & ((a ^ b) >> 4)); C ^= (a & (c >> 4)) ^ (b & (d >> 4)); D ^= (b & (c >> 4)) ^ ((a ^ b) & (d >> 4)); a = A; b = B; c = C; d = D; C ^= (a & (c >> 8)) ^ (b & (d >> 8)); D ^= (b & (c >> 8)) ^ ((a ^ b) & (d >> 8)); a = C ^ (C >> 1); b = D ^ (D >> 1); let i0 = x ^ y; let i1 = b | (0xffff ^ (i0 | a)); i0 = (i0 | (i0 << 8)) & 0x00ff00ff; i0 = (i0 | (i0 << 4)) & 0x0f0f0f0f; i0 = (i0 | (i0 << 2)) & 0x33333333; i0 = (i0 | (i0 << 1)) & 0x55555555; i1 = (i1 | (i1 << 8)) & 0x00ff00ff; i1 = (i1 | (i1 << 4)) & 0x0f0f0f0f; i1 = (i1 | (i1 << 2)) & 0x33333333; i1 = (i1 | (i1 << 1)) & 0x55555555; return ((i1 << 1) | i0) >>> 0; }