Settings

Theme

Visualizing Ukkonen's Suffix Tree Algorithm

abahgat.com

41 points by gsky 2 months ago · 8 comments

Reader

esafak 2 months ago

https://gemini.google/overview/canvas/ is great at these visualizations. I used it recently to help me understand chord progressions in specific songs, and compare them to those of other songs.

aserafini 2 months ago

I made an “offline” version of this visualisation a long time ago: https://github.com/adamserafini/suffix-tree

C++ with output rendered with Graphviz

sillywabbit 2 months ago

In the event someone is encountering suffix trees for the first time and thinking of using one: the amount of RAM required for suffix trees is obscene.

  • abahgat 2 months ago

    I totally fell for the "obscene memory" trap myself. My first encounter with suffix trees outside of a textbook was for an ITA Software 'Instant Search' puzzle. The requirement was sub-0.1ms search on a large string database, I went straight for a generalized suffix tree. Then I realized they had asked for the solution to fit within a 1GB heap. :(

    I wrote up the full 'war story' of how I had to profile the heap and optimize the node representation (shaving bytes off the edge storage) just to get it to boot without an OOM error: https://www.abahgat.com/blog/the-programming-puzzle-that-got...

    It’s the most tangible example I've run into of where theoretical O(n) space complexity meets the reality of object pointer overhead.

  • xen0 2 months ago

    I've never grokkeed suffix trees, but isn't possible for them to be O(n) in space (n total length of all strings)? Is there just an unacceptable constant factor overhead? I can imagine the pointer overhead being painful.

  • vintermann 2 months ago

    Yes, we get the same speed from suffix arrays these days, and much, much less memory usage.

    But good luck visualizing what those algorithms do :)

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection