Fuzzify: A tiny library for fuzzy search

github.com

23 points by rumblertumbler a year ago · 12 comments

wood_spirit a year ago

iirc the go-to data structure for this kind of problem is called a “vantage point tree” (vp tree).

Another approach that I haven’t explored is to prep the array of words to search into a tree. Normal words in Latin languages have lots of common prefixes and suffixes, so you can dramatically reduce the number of nodes (see my own old blog post, which compresses a Scrabble dictionary: https://williame.github.io/post/87682811573.html; the same prefix/suffix sharing would work on non-scrabble rotations too). Now walk the tree doing Levenshtein but checking multiple words at once?

  • jbaiter a year ago

    Not for strings, no. The default algorithm would be Levenshtein edit distance and friends (i.e. what this library uses). If you want to get fancy, you could even go for something like Levenshtein automata (which are used by Lucene to implement fuzzy search across terms).
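
The "walk the tree doing Levenshtein" idea from the parent comment can be sketched roughly as below. This is only an illustration under assumptions: it uses a plain prefix trie (no suffix sharing, so it is not the compressed structure from the linked blog post), the names `TrieNode`, `build_trie`, and `fuzzy_search` are made up for this sketch, and it is not how Fuzzify or Lucene work internally. Each trie edge extends one row of the edit-distance table, so words with a common prefix share that work, and a branch is abandoned once every cell in its row exceeds the limit.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.word: Optional[str] = None  # set only on nodes that end a word


def build_trie(words: List[str]) -> TrieNode:
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.word = word
    return root


def fuzzy_search(root: TrieNode, query: str, max_dist: int) -> List[Tuple[str, int]]:
    """Return (word, distance) pairs with Levenshtein distance <= max_dist."""
    results: List[Tuple[str, int]] = []
    # Row for the empty prefix: turning "" into query[:i] costs i insertions.
    first_row = list(range(len(query) + 1))

    def walk(node: TrieNode, ch: str, prev_row: List[int]) -> None:
        # Compute the table row for the prefix that ends with `ch`.
        row = [prev_row[0] + 1]
        for i in range(1, len(query) + 1):
            cost = 0 if query[i - 1] == ch else 1
            row.append(min(row[i - 1] + 1, prev_row[i] + 1, prev_row[i - 1] + cost))
        if node.word is not None and row[-1] <= max_dist:
            results.append((node.word, row[-1]))
        # Prune: if no cell is within the limit, no descendant can match.
        if min(row) <= max_dist:
            for next_ch, child in node.children.items():
                walk(child, next_ch, row)

    for ch, child in root.children.items():
        walk(child, ch, first_row)
    return sorted(results, key=lambda r: (r[1], r[0]))


trie = build_trie(["unit", "unite", "united", "turkey"])
print(fuzzy_search(trie, "unit", 1))  # [('unit', 0), ('unite', 1)]
```

The rows for "u", "un", "uni", and "unit" are computed once here and reused for "unite" and "united", which is where the savings over checking each word separately would come from.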

rgbjoy a year ago

I’m a bit confused. “Unit” brings up Turkmenistan before United States.

  • Arch485 a year ago

    Turkmenistan is 8 edits, United States is 9. Levenshtein Distance is not actually a good search algorithm :)

  • card_zero a year ago

    That's correct, yes. Turkmenistan: 8 deletions (Trkmesan). United States: 9 deletions (ed States), including the space character.
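
The counts above can be checked with a textbook dynamic-programming Levenshtein function. This is a minimal sketch, not the library's actual code; the `levenshtein` name and the lowercasing step are assumptions (the 8 vs. 9 figures only come out if "U" and "u" are treated as equal).

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


query = "unit"
for name in ("Turkmenistan", "United States"):
    # Assumption: comparison is case-insensitive.
    print(name, levenshtein(query, name.lower()))
# Turkmenistan 8
# United States 9
```

Because "unit" appears in order inside both names, the distance is just the number of leftover characters, so the shorter name wins even though the longer one starts with the query.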

  • d0100 a year ago

    You'll have to give more weight to substring matches; fuzzy search by itself is usually not enough for intuitive search.
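
One way to weight substring matches, sketched under assumptions: rank prefix matches first, other substring matches second, and fall back to raw edit distance for everything else. The tiering scheme and the `score` and `rank` names are hypothetical, chosen only to illustrate the suggestion; the library discussed in this thread may rank differently.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def score(query: str, candidate: str) -> Tuple[int, int]:
    """Lower is better: (tier, edit distance). Tiers are an illustrative choice."""
    q, c = query.lower(), candidate.lower()
    dist = levenshtein(q, c)
    if c.startswith(q):
        return (0, dist)  # prefix match
    if q in c:
        return (1, dist)  # substring match elsewhere in the candidate
    return (2, dist)      # fuzzy match only


def rank(query: str, candidates: List[str]) -> List[str]:
    return sorted(candidates, key=lambda c: score(query, c))


print(rank("Unit", ["Turkmenistan", "United States", "United Kingdom"]))
# ['United States', 'United Kingdom', 'Turkmenistan']
```

With this ordering, "United States" outranks "Turkmenistan" for the query "Unit" despite its larger edit distance, which matches what most users would expect.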

mightyham a year ago

There are tons of js fuzzy search libraries with basically the same functionality. Is there any reason I should use this over others?

finger a year ago

Regarding the name, doesn’t it imply fuzzification of data?
