Libsearch: Simple index-free full-text search for JavaScript

96 points by thesephist 2 years ago · 20 comments

swyx 2 years ago

this is 115 lines of TS, which is pretty lightweight, but some other impt table stakes might be missing that you should be aware of.

see all the JS alternatives with their sizes and feature sets and perf: https://github.com/leeoniya/uFuzzy (scroll to bottom)

i went down this path a few yrs ago and ended up picking ufuzzy: https://swyxkit.netlify.app/ufuzzy-search

leeoniya 2 years ago

thanks, glad you like it.
you might be interested in doing a follow-up/update to your post, to use the .search() api, which does more stuff out of the box, such as outOfOrder, quoted, and negatives
Libsearch looks similar to uFuzzy (indexless regexp builder) but more simplistic. the other one i've seen that does this is sifter.js (predates uFuzzy, but i didnt discover it until researching libs to compare afterwards)
- swyx 2 years ago
  
  your readme is the gold star in readmes, i hope you keep doing more oss work!

no_wizard 2 years ago

I never realized search like this was this easy to build in a way.

It’s a very clever RegEx engine, essentially. It’s unclear to me right now how well this library handles fuzzy search (e.g. Califnia instead of California), but it’s really cool to reason about its implementation.
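
To make the "regex engine" idea concrete, here is a minimal sketch of index-free search, assuming the simplest possible design: one regex per query word, matched at word starts, scanned over every item on each query. The names `escapeRegExp`, `buildMatchers`, and `search` are hypothetical and are not the Libsearch or uFuzzy API.

```typescript
// Hypothetical sketch of index-free search; not the Libsearch or uFuzzy implementation.

/** Escape characters that have special meaning inside a regular expression. */
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/** Build one case-insensitive regex per query word, anchored at a word boundary. */
function buildMatchers(query: string): RegExp[] {
  return query
    .trim()
    .split(/\s+/)
    .filter((word) => word.length > 0)
    .map((word) => new RegExp("\\b" + escapeRegExp(word), "i"));
}

/** Scan all items on every call and keep those where every query word matches. */
function search<T>(items: T[], query: string, getText: (item: T) => string): T[] {
  const matchers = buildMatchers(query);
  if (matchers.length === 0) return [];
  return items.filter((item) => {
    const text = getText(item);
    return matchers.every((re) => re.test(text));
  });
}

const places = ["California dreaming", "New York", "North Carolina"];
const hits = search(places, "cali", (s) => s); // matches only "California dreaming"
const typo = search(places, "Califnia", (s) => s); // no matches in this sketch
```

Nothing is built ahead of time, so startup is instant, but a typo like "Califnia" finds nothing here. Typo tolerance would need looser patterns or a different matching strategy.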

maelito 2 years ago

I often use Fuse in my JS apps. So many in-app searches cannot handle typing errors; it's frustrating.

efilife 2 years ago

Fuse is terrible performance-wise. There are many better options
https://github.com/nextapps-de/flexsearch?tab=readme-ov-file...
I tried Fuse, lunr and MiniSearch, and I stayed with MiniSearch. Great flexibility and very good performance even when I am indexing almost a million documents + many useful features.
I would also recommend giving FlexSearch a try. It has some fancy algorithm that supposedly makes searches extra fast: https://github.com/nextapps-de/flexsearch?tab=readme-ov-file...
- leeoniya 2 years ago
  
  yes, Fuse is extremely slow (and FlexSearch is very fast after indexing is done), but i have no idea how that table presents FlexSearch as having the lowest memory footprint, maybe in its default config?
  in my testing of fuzzy/substring matching it takes quite a while to create its huge index: https://github.com/leeoniya/uFuzzy?tab=readme-ov-file#perfor...

jitl 2 years ago

With FlexSearch or lunr or similar, building an index is so fast for “thousands of items” that it’s fine to do it when the user opens a search interface and forget it once they’re done.
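
A rough sketch of that "build on open, forget on close" pattern, using a tiny hand-rolled inverted index rather than the FlexSearch or lunr APIs. `SearchSession`, `buildIndex`, and `lookup` are hypothetical names, and matching here is exact-token only (no stemming, prefixes, or typo tolerance).

```typescript
// Hypothetical sketch; a toy inverted index, not FlexSearch or lunr.
type InvertedIndex = Map<string, Set<number>>;

/** Lowercase and split on anything that is not a letter or digit. */
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

/** Map each token to the set of document positions that contain it. */
function buildIndex(docs: string[]): InvertedIndex {
  const index: InvertedIndex = new Map();
  docs.forEach((doc, id) => {
    for (const token of tokenize(doc)) {
      let ids = index.get(token);
      if (!ids) {
        ids = new Set<number>();
        index.set(token, ids);
      }
      ids.add(id);
    }
  });
  return index;
}

/** Return ids of documents containing every query token (exact token match). */
function lookup(index: InvertedIndex, query: string): number[] {
  let result: number[] | undefined;
  for (const token of tokenize(query)) {
    const ids = index.get(token) ?? new Set<number>();
    result = result === undefined ? Array.from(ids) : result.filter((id) => ids.has(id));
  }
  return result ?? [];
}

/** Holds the index only while the search UI is open. */
class SearchSession {
  private docs: string[];
  private index: InvertedIndex | null = null;

  constructor(docs: string[]) {
    this.docs = docs;
  }

  /** Call when the user opens the search interface. */
  open(): void {
    this.index = buildIndex(this.docs);
  }

  search(query: string): number[] {
    if (this.index === null) throw new Error("search UI is not open");
    return lookup(this.index, query);
  }

  /** Call when the user closes it; the index becomes garbage. */
  close(): void {
    this.index = null;
  }
}
```

The trade-off is a small rebuild cost on each open in exchange for holding no index memory while the search UI is closed.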

bhl 2 years ago

Any non in-memory search indices? Was hoping to find one that used indexeddb to implement fuzzy search. Maybe that’s just not as performant which is why I haven’t found a popular library yet.
- jitl 2 years ago
  
  You do not want to use IndexedDB; avoid it if you can make life work without it.
  IndexedDB rows need to be very coarse grained, because each operation with the IndexedDB API has very high overhead compared to something like walking an LSM tree in RocksDB. If the index does fit in memory, then the best move would be to store and load the entire index from a single or a few IndexedDB rows.
  If you need to index a lot of documents in a way that doesn't fit in memory, I recommend you consider sqlite3, stored in either OPFS or IndexedDB, and use sqlite's FTS5 full-text search for the index. There are several sqlite backends available (see https://github.com/rhashimoto/wa-sqlite/tree/master/src/exam...). Something like their IDB VFS will store each block in the sqlite file as a row in IndexedDB, essentially batching storage of the index rows into IDB rows for you (for more details read https://github.com/rhashimoto/wa-sqlite/blob/3c202615ed6f54e...).
- olivernn 2 years ago
  
  Many years ago, when I was building Lunr, it was initially based on IndexedDB, but it was _much_ slower and the datasets I had in mind easily fit in memory so I didn’t pursue it. No idea if that has changed since though.

andai 2 years ago

For thousands of items, do you even need one?
- jitl 2 years ago
  
  For typo resistance and stemming it’s useful. Mostly it depends on whether your search matching logic needs to allocate per document. For example, if you need to concatenate 10 strings together to get the complete search text and then split that into some kind of stem array, doing that O(thousands) of times per keystroke can be laggy in JS, especially on memory-constrained devices, because of GC pressure. Better to do it once up front. Then voila, you’ve got an index.
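
  A minimal sketch of the "do it once up front" point, assuming a hypothetical `Doc` shape with a few text fields. `prepare` does the concatenation and splitting a single time, so each keystroke only compares against precomputed tokens. Plain whitespace tokens stand in for real stemming here.

```typescript
// Hypothetical document shape and helper names, for illustration only.
interface Doc {
  title: string;
  tags: string[];
  body: string;
}

interface PreparedDoc {
  doc: Doc;
  tokens: string[]; // lowercased words, computed once
}

/** Concatenate and split each document's text a single time, up front. */
function prepare(docs: Doc[]): PreparedDoc[] {
  return docs.map((doc) => {
    const text = [doc.title, ...doc.tags, doc.body].join(" ").toLowerCase();
    return { doc, tokens: text.split(/\s+/).filter((t) => t.length > 0) };
  });
}

/** Per-keystroke work: prefix-compare the query against the stored tokens. */
function searchPrepared(prepared: PreparedDoc[], query: string): Doc[] {
  const q = query.trim().toLowerCase();
  if (q === "") return [];
  return prepared
    .filter((p) => p.tokens.some((t) => t.startsWith(q)))
    .map((p) => p.doc);
}
```

  Calling `prepare` once and reusing its result for every `searchPrepared` call avoids rebuilding the joined string and token array per document on each keystroke, which is the allocation the comment above warns about.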

leeoniya 2 years ago

depending on the options you need (like typo tolerance), building the index can be quite slow and use a lot of memory

gabrieledarrigo 2 years ago

Small and simple! I love these types of programs.

Alifatisk 2 years ago

What's the benefit of index-free?

leeoniya 2 years ago

much lower memory use, and instant startup time
- Alifatisk 2 years ago
  
  Oh

revskill 2 years ago

Looks like grep.
