Today we are going to explore the world of generational indices datastructures. We will also see how to implement such datastructure in Rust.
This post will use Rust for showing code, but concepts can be applied to other programming languages. The need for generational indices based datastrcture arise mostly in languages that must handle memory management such as Rust, C, C++…
Inspired by Catherine West’s closing keynote at RustConf 2018, presented in the context of an Entity-Component-System for games programming.
The issue
Let’s say that we want to modelise a graph. A graph is a self-refential structure such as A <- knows -> B
: A
need keep something in it’s memory representation to find B
. And B
need to keep something to find A
. One way to solve this issue is using indices.
The main issue of using indices is when you need to remove graph vertices, such as A
or B
, you must remove the vertex from it’s container and you remove it’s index for every other vertices. Then you either let created holes in the vertices container or fill them which lead to a huge perfomance cost.
If you are clever you may wan to use an array not directly over vertices but over vertices or nothing: Vec<Option<Vertex>>
in Rust. When deleting this will lead to the ABA problem. The ABA problem can be found when applying the following set of operations:
A
referencesB
at index 1- Someone delete
B
setting the value at index 1 toNone
(no “value” in Rust language) - Someone else create the vertex
C
at the first available index 1 because the element there isNone
A
now incorrectly referencesC
when using a get operation, when instead it should fail
The solution and how it works
The solution to this issue is generational indices. The idea behind generational indices is to attach a generation to each indices and value in the array. This generation is a monotonically increasing counter. Let’s call the union of index and a generation a key. When we want to get the element of our collection we will supply our key. When removing elements from the collection we increase the generation attached to that element. When we get elements we check the generation of the provided index to the generation attached with the value if the generation does not match it means that the item inside has changed. This solution solve the ABA problem. Let’s dive into the details of one implementation of generational indices on an array. The following figure show the idea:
As we can see from the previous figures, once a key get created it will always point to some index is in the array avoiding errors. If the generation do not match the value will be considered removed. Else, the value is returned.
My toy GenVec
Let’s review the following toy implementation of general indices on an array in Rust.
The most important thing that we can see in this implementation not shown in the figures above is the need of free_head
variable. This structure membere of GenVec
let us know the first free index of the array. Then each Free
enum variant point to the next free entry.
Also note that we used an u32
for the generation. Meaning that it may panic if we remove and reuse -a lot- the same key. But, this should note be a concern for most user usage.
Available crates
The best Rust crate for an already implemented collection using generational indices is slotmap : which provide multiple implementations as well as cool features such as secondary map and custom keys. Caution, this crate is some unsafe code.
If you don’t need to go fancy generational-arena is a simpler but safe implementation.