A look at search engines with their own indexes (2021)

seirdy.one

77 points by mnem 2 years ago · 27 comments

mrweasel 2 years ago

I have been somewhat impressed by Mojeek, but it does have two obvious flaws:

1) It's not really good for localized search, though it might be if you're local to the US or UK.

2) No !bangs. Coming from Ecosia, I frequently just do !w, !maps, or !yt because I know where I want the answer to come from.

For English-language searches, it's completely usable, but not quite as good as Bing or Google. I really wanted to try to use Mojeek as my default for an extended period of time, but the lack of good local search makes it a bit annoying.

  • marginalia_nu 2 years ago

    Local search and location-aware search is probably Google's biggest moat against smaller search engines. Bing does it passably, but it's arguably still pretty bad.

    What's worse is that it's probably hard to ever get working well without the internet-scale profiling Google has access to.

    • reddalo 2 years ago

      > Local search and location-aware search is probably Google's biggest moat

      The European Union, at least, has limited that a bit by preventing Google from linking Google Maps from their SERP.

      So now, if you're in the EU, local results will display a map but you can't click on it.

    • mrweasel 2 years ago

      They might be doing it differently, but Ecosia uses Bing and has really good localized search, at least for Denmark. There is very little difference between Google and Bing these days; if anything, I'd say Bing is the better search engine.

  • ldng 2 years ago

    Perfect, localized search is the most annoying thing there is.

dang 2 years ago

Related:

A look at search engines with their own indexes (2021) - https://news.ycombinator.com/item?id=31820149 - June 2022 (114 comments)

instagib 2 years ago

Looking at all top 3 is helpful. I have done a lot of part sourcing for engineering work and have begun using searx because of the aggregation. There are also other tools to use when searching for obscure parts that are out of stock because of supply chain woes.

Waterluvian 2 years ago

Is there some 80/20 rule for web indexing?

I’m not saying having deep per-page indexing of Reddit, for example, isn’t useful. But is there any value in a breadth-focused index that is far cheaper to maintain?

  • marginalia_nu 2 years ago

    Almost certainly. Internet search is above all a problem of improving the signal to noise ratio.

    There's an inordinate number of documents that will never be a good search result for any query. That includes trivial pages that have barely anything to index in them, but also sign-up forms, cookie policies, and redundant information (e.g. any given man page exists in dozens if not hundreds of identical copies on the web).
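
    The kind of pruning described above can be sketched in a few lines. This is only an illustration under stated assumptions, not how any particular search engine does it: the `filter_documents` helper, the `min_words` threshold, and the example URLs are all hypothetical, and it only catches exact duplicates after whitespace and case normalization (real crawlers typically need near-duplicate detection as well).

```python
import hashlib


def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies compare equal."""
    return " ".join(text.lower().split())


def filter_documents(docs, min_words=5):
    """Return URLs worth indexing from (url, text) pairs.

    Hypothetical heuristics, for illustration only:
    - skip pages with fewer than `min_words` words (barely anything to index)
    - skip pages whose normalized text was already seen (identical copies)
    """
    seen = set()
    kept = []
    for url, text in docs:
        norm = normalize(text)
        if len(norm.split()) < min_words:
            continue  # too little content to be a useful result
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate of a page we already kept
        seen.add(digest)
        kept.append(url)
    return kept


# Hypothetical crawl results: two copies of one man page and a near-empty page.
CRAWLED = [
    ("https://example.com/man/ls", "ls - list directory contents in sorted order"),
    ("https://example.org/mirror/ls", "LS - list   directory contents in sorted order"),
    ("https://example.net/signup", "Sign up"),
]
```

    Calling `filter_documents(CRAWLED)` keeps only the first copy of the man page; the mirror and the near-empty sign-up page are dropped before they ever reach the index.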

    • reddalo 2 years ago

      > cookie policies

      Unless you're specifically searching for other websites' cookie policies (e.g. to understand how they work, or to do research on them, or just to plainly copy them...)

serafettin 2 years ago

https://index.network for composable, user-owned semantic indexes. Disclaimer: I work there.

wakawaka28 2 years ago

Can we get a list for 2024?

jeffreyw128 2 years ago

Missed exa.ai! Embeddings-based search engine with its own index

  • HeatrayEnjoyer 2 years ago

    How does an embeddings-based search work without hallucinating bad links?

    • janalsncm 2 years ago

      Not sure what they are doing but embeddings and hallucination are completely separable imo (you can have hallucination even without embedding-based retrieval). Likely you have an embedding for the query which is close to the embedding of the doc for some measure of similarity. That could be semantic similarity or even user behavior.

    • cyanydeez 2 years ago

      Embeddings aren't generative AI.

      They're just vectors of arbitrary dimension, and similarity is calculated by an n-dimensional function.
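
      To make that concrete, here is a minimal sketch of embedding-based retrieval using cosine similarity. It is not a description of what exa.ai actually does; the three-dimensional vectors, the `INDEX` contents, and the example URLs are made up for illustration, and a real system would get its vectors from a trained embedding model and use an approximate nearest-neighbor index rather than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Score every indexed document against the query and sort best-first.

    Results can only be URLs that are already keys of `doc_vecs`:
    retrieval selects from the index, it does not generate links.
    """
    scored = [(url, cosine_similarity(query_vec, vec)) for url, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical index: URL -> embedding (toy 3-dimensional vectors).
INDEX = {
    "https://example.com/a": [0.9, 0.1, 0.0],
    "https://example.com/b": [0.0, 1.0, 0.0],
    "https://example.com/c": [0.7, 0.7, 0.1],
}
```

      With a query vector of `[1.0, 0.0, 0.0]`, `rank` puts `https://example.com/a` first and `https://example.com/b` last. The ranking may be good or bad, but every returned link is one that was crawled into the index.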

raytopia 2 years ago

A little tangential but does anyone know if there are any modern web directories?

I'm wondering because it seems like, due to the amount of spam on the web, there needs to be more human curation as opposed to algorithms deciding what websites are valuable or not.

danielcampos93 2 years ago

It needs updating to include you.com, perplexity, etc. Most of those are Google reskins/emulators, but they are there nonetheless.
