After writing a trie I wanted to better understand its performance, so I wrote some benchmarks against various other Go implementations for storing UK postcodes.

At some point since the new year I entirely replaced the implementation from my last post with one that more closely matches the “pure” trie described at the start of TAOCP 6.3; i.e. a table of nodes, consisting of a list of entries, where each node entry can be either a link to another node, or a key (that is, an entire string stored in the trie).

Perhaps my chosen use case is unusual (postcodes take up less than a 64-bit pointer!) but I was pleased with the performance; admittedly the current implementation achieves this by reducing the alphabet size, but I think this could generalize fairly easy with the alphabet passed in at Trie construction time. There is room for optimization of the entry lookup, which currently just indexes into the alphabet string; a hash function would be faster, but I am unsure whether I want to think about hash collisions.

The biggest weakness of these benchmarks so far is that they do not look at prefix search - only testing whether a string exists in the trie. I did benchmark my trie against a plain Go map, and the trie manages to achieve faster access if you are looking for keys in insertion order, which I was quite impressed with - I believe this will be because the right nodes will be in the L1/L2 caches, whereas a map cannot make use of this. Random access is of course much faster with a map.

I enjoy this type of algorithm exploration, though I don’t have a serious use for a trie in the short term. Maybe something will come up!