Ask HN: Could we just re-invent original Google?

24 points by __jambo 2 years ago · 39 comments · 1 min read


I feel like google enshittified itself with SEO and bloat. Maybe if we just re-made the original and changed the business model it would be good again.

28304283409234 2 years ago

You mean, like https://www.kagi.com? A search engine you pay for. Every link is there to serve _you_. There are no ads.

  • 2OEH8eoCRo0 2 years ago

    I was fine with the simple ads in the sidebar of the Google results of yore. It felt like a fair deal.

    • __jambo (OP) 2 years ago

      I think the problem is then you eventually get SEOification. I suppose you can just reset every now and again. Maybe like give a training program to get SEO people real jobs.

  • wildrhythms 2 years ago

    I haven't tried Kagi yet, but I'm curious: how does Kagi combat SEO spammers?

    • drcongo 2 years ago

      Extremely well! No idea how it's doing it in the general results, but they're orders of magnitude better than Google / Bing etc. Plus you get to fine-tune results yourself, e.g. downrank w3schools etc. so the garbage sites are less likely to appear in your results. You can even exclude them completely.

      • palavrov 2 years ago

        A good strategy could be to rank on the number of ads ... more ads in the page, lower the rank.

        • danbulant 2 years ago

          They sort of do that. Next to each link, you have information about how many ads were detected on the site, and sites with more ads are ranked lower. They also have an index of their own which only allows sites with few ad detections (they use the number of things blocked by uBlock, so it also counts tracking links etc.)
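
A minimal sketch of the downranking idea described in this sub-thread, not Kagi's actual method. The `Result` fields, the `penalty` weight, and the scoring formula are all assumptions made up for illustration: each result's relevance is divided by a factor that grows with the number of requests a content blocker would have blocked.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    relevance: float    # hypothetical base relevance score in [0, 1]
    blocked_count: int  # hypothetical number of requests a content blocker would block


def ad_penalized_score(result: Result, penalty: float = 0.1) -> float:
    """Shrink relevance as the blocked-request count grows (assumed formula)."""
    return result.relevance / (1.0 + penalty * result.blocked_count)


def rank(results: list[Result], penalty: float = 0.1) -> list[Result]:
    """Return results ordered from highest to lowest adjusted score."""
    return sorted(results, key=lambda r: ad_penalized_score(r, penalty), reverse=True)


results = [
    Result("https://example.com/ad-heavy", relevance=0.9, blocked_count=30),
    Result("https://example.org/plain", relevance=0.7, blocked_count=0),
    Result("https://example.net/some-ads", relevance=0.8, blocked_count=5),
]
ranked = rank(results)
for r in ranked:
    print(f"{ad_penalized_score(r):.3f}  {r.url}")
```

With these made-up numbers the ad-heavy page drops to last place even though it has the highest raw relevance. A real system would have to choose the penalty carefully so that relevant pages with a few ads are not buried.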

    • beej71 2 years ago

      Go try it. You get 100 free searches per month, or something like that.

      I'm a happy customer.

    • dinkleberg 2 years ago

      Pretty well to be honest.

jdietrich 2 years ago

The problem isn't Google, it's the internet.

Google might be deliberately making their search results "worse" (for whatever value of "worse" you prefer), but nobody else is doing all that much better, despite the obvious motivation to do so. I'm a DuckDuckGo user, but I don't think that the results I get are particularly better than Google results.

When people complain about Google, the root cause is usually that the thing they're looking for doesn't exist anywhere on the internet. Spam publishing sometimes drowns out the thing you're looking for, but often that's illusory - if you clicked through every single results page, you still wouldn't find what you're looking for, because it isn't there. A search engine can boost the signal-to-noise ratio in the results that it gives users, but it can't generate signal where none exists. Fixing that problem is altogether more difficult.

  • DamnInteresting 2 years ago

    > The problem isn't Google, it's the internet.

    I partially agree, but Google itself has also been going downhill in my experience. For example, in recent months the topmost Google results often omit one or more of my search terms, and I have to search again with quotes to force words to be included. I understand the reasoning behind including results that omit a term, but putting those at the top is just silly. Yesterday I did a search that included "discontinued", and the top result along with most of the first page of results ignored "discontinued"[1], so the results were mostly the opposite of what I was searching for.

    1: With the text below the result "Missing: discontinued ‎| Show results with: discontinued"

tlb 2 years ago

The real prize is to deshittify the content on the web. Ignoring copyrights for a moment, it should be possible to rewrite pages in 1000s of sites down to just the information. StackOverflow could just have the Q&As, news sites could just have the news, Pinterest would just have the pictures. Remove all the signup nags and popups.

With good tools, one person could probably maintain the deshittifier for a few sites, at least until the sites started getting adversarial about it.

  • pavlov 2 years ago

    ChatGPT is basically the deshittifier that rewrites thousands of sites to just the information…?

    And it has a much better interface than original Google. Every query starts with the equivalent of “I’m feeling lucky”, but then you can ask further questions.

  • is_true 2 years ago

    SO is one of the examples where there isn't that much bloat; usually the longest answers are more detailed and let you learn, instead of just copy-pasting (potentially) working code.

    On the web, Google is partly responsible for the extra content: they always suggested longer articles because it helped the algo get better context. But for us humans it mostly doesn't make sense.

    The same happens with books but I think we are at fault too, I would think twice about a book that is 80 pages, but the truth is that 80 pages could be a lot for most topics. I believe that the summarization capabilities of LLMs are gonna make a generation feel different about short content.

  • M95D 2 years ago

    Nobody will create that. Too complicated to create. Too complicated to maintain.

    I wish there was a browser add-on that could regex-replace the HTML source of the page. Then I could write my own deshittifier list. Since I couldn't find any, I'm guessing it's because the plugin APIs won't allow it.

    There was a MITM app that could do this, AdMuncher, but then the web switched to HTTPS and it didn't work anymore.
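
A rough sketch of what such a regex-replace list might look like when applied to raw HTML source. The rule patterns, the `signup-nag` class name, and the `apply_rules` helper are hypothetical and exist only for this example. Regexes over HTML are fragile (nested tags break the first rule, for instance), so treat this as an illustration of the idea rather than a robust cleaner.

```python
import re

# Hypothetical per-site rule list: (compiled pattern, replacement text).
RULES = [
    # Remove a made-up signup nag container (breaks on nested <div>s).
    (re.compile(r'<div class="signup-nag">.*?</div>', re.DOTALL), ""),
    # Remove inline and external scripts.
    (re.compile(r"<script\b.*?</script>", re.DOTALL | re.IGNORECASE), ""),
]


def apply_rules(html: str, rules=RULES) -> str:
    """Apply each regex substitution to the page source, in order."""
    for pattern, replacement in rules:
        html = pattern.sub(replacement, html)
    return html


page = (
    "<p>Answer: use a list.</p>"
    '<div class="signup-nag">Sign up!</div>'
    "<script>track()</script>"
)
cleaned = apply_rules(page)
print(cleaned)
```

How to hook something like this into a browser is left open here, since that depends on what the extension APIs permit.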

  • kesor 2 years ago

    Reminds me of the word of the year https://en.wikipedia.org/wiki/Enshittification

O1111OOO 2 years ago

There was a search engine listed here yesterday (https://news.ycombinator.com/item?id=39203538) that displayed Google searches with the oldest first.

I played around with it, mostly typing in a bunch of DOS search terms. If this search engine is working the way it's supposed to, I should have thousands of very old results. Instead, after a page or three, I was quickly looking at results from 2006/2007.

This might have been a problem with the search engine or a (more serious) problem with Google throwing away much of the past. We already know this is a problem. We just don't know how serious it is.

A recreation of Google would involve reindexing to include what Google (DDG/Bing and others) have abandoned (the recent past).

PS: I use https://wiby.me and https://yandex.com with much greater success in finding older material.

dave4420 2 years ago

SEO is something that happened to Google; any search engine will have to deal with it.

What changes to the business model are you envisaging?

  • beej71 2 years ago

    In terms of enshittification, a good model would be downranking by number of ads and tracking. That applies some counterpressure.

fsflover 2 years ago

Here you go: distributed, peer-to-peer, FLOSS search engine: https://yacy.net

  • viraptor 2 years ago

    I love the idea itself, but the project is dying. Almost no maintenance, the P2P part is spammed / unusable, CloudFlare bot checks wreck your IP reputation, the deployment/setup is... idiosyncratic. I wish it was still practical to use YaCy.

jqpabc123 2 years ago

> Maybe if we just re-made the original and changed the business model it would be good again.

This has already been done --- see DuckDuckGo.

The problem is not ads --- it's privacy invasion (aka "personalized ads") and advertisers who respond to the concept.

DuckDuckGo shows ads --- "context sensitive" ads --- you know, ads that are related to what you search for and might actually be helpful. Not something you did last week or last month that may no longer apply.

"Personalized ads" are one of the dumbest ideas ever --- virtually guaranteed to waste a lot of people's time and squander mental and electronic bandwidth. And yet Google makes billions --- because advertisers are stupid and lazy enough to literally turn over most of their ad budget to them.

We need to upend the idea that "No one ever got fired for using Google" --- and the only way to do that on a personal level is to stop using Google.

We need more ordinary people to grasp and respond to the idea that "Google" is just another word for "privacy invasion".

yodsanklai 2 years ago

Do we still need Google Search as much as in the past though? Most of my search queries are targeted towards specific web sites: Wikipedia, Stack Overflow, YouTube.

And overall, I don't think I suffer too much from SEO on Google Search. On the other hand, I'm very upset with the way YouTube has gone. It's harder and harder to find quality content even though it's there. I mostly don't want to see the professional YouTubers.

  • Brian_K_White 2 years ago

    Absolutely we do, of course. I do not at all want to be limited to wikipedia and reddit and stackoverflow. I want all unknown tiny random geocities and tripod.

pushcx 2 years ago

https://www.google.com/about/honestresults/

Google thought it had a business model that incentivized them to stay good. The opportunity to sell results didn't go away and eventually they took it.

kozak 2 years ago

DuckDuckGo?

Mountain_Skies 2 years ago

Google loves to put its thumb on the scale and also shape results to what its handlers feel is acceptable. Another search engine could do away with both of those. SEO, however, will exist regardless of whether the search engine favors particular content formats or not.

BenoitP 2 years ago

TL;DR: Goodhart's law rots everything it touches

SEO is in the structure of the internet now. Original Google was great because there was no incentive yet to buy a domain and blogspam it. Google getting shitty is just a natural instance of Goodhart's law, applied to domains and content.

Now, Google was originally based on PageRank, which treated every domain as a unit of authority. Domains have been compromised and drowned by SEO, but the concept remains valid and we could choose people as units of authority. For example, PageRank on scientific papers accurately reproduces Nobel Prize attributions. A person publishing papers is a solid enough foundation for this unit of authority.

It remains to be organized though. And if we take people as units of authority, it means they'd have to 'cite' or vote for each other. This has social consequences and might not be doable. Are you ready to refuse to cite your boss when he/she asks you to do so? Maybe if the vote is secret and delayed by 5 years?
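
To make the "people as units of authority" idea concrete, here is a small sketch of PageRank by power iteration over a who-cites-whom graph. The names, the citation data, and the `pagerank` helper are invented for illustration. The damping factor of 0.85 is the conventional textbook value, not something taken from this thread.

```python
def pagerank(
    links: dict[str, list[str]], damping: float = 0.85, iterations: int = 100
) -> dict[str, float]:
    """Power-iteration PageRank over a directed 'cites' graph."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among the people it cites.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node that cites nobody spreads its rank over everyone.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: key cites each person in the value list.
citations = {
    "alice": ["carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": ["carol", "alice"],
}
scores = pagerank(citations)
for person, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{person}: {score:.3f}")
```

In this toy graph carol comes out on top because three people cite her, and alice is a close second because carol's single citation passes along most of carol's weight. That second effect is also why the boss-citation worry matters: one vote from a highly ranked person moves the result a lot.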

Apreche 2 years ago

Most people using machine learning to make search engines are replacing the search paradigm with a prompt + answer format.

I think there’s an easier way. Train an ML model to tell legit web sites apart from garbage ones. It’s just a binary classification. A site should either be blocked, or not.

Legit web sites being ones created by actual humans with actual content. Few to no ads. No malware, phishing, or other security threats. No content farms or SEO sites. No sites generated by other ML models. No paywalls, no pop-ups or other annoyances. Just real web sites.

You’re going to need a bunch of smart and trustworthy humans to spend hours and hours to help do this classification. But a model can help multiply the effectiveness of their efforts.

If the model works, then yes. You can make a very simple search engine. You just tell all the web crawlers to check the model, and only add sites to the index if the model says they are good web sites and not garbage sites.
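
A bare-bones sketch of the gating step described above, where the crawler consults a classifier before adding a page to the index. Everything here is hypothetical: `build_index` is a made-up helper, and `toy_classifier` is a hand-written keyword heuristic standing in for the trained model, which is the hard part and is not shown.

```python
from typing import Callable, Iterable


def build_index(
    pages: Iterable[tuple[str, str]],
    is_legit: Callable[[str, str], bool],
) -> dict[str, str]:
    """Add a crawled (url, text) pair to the index only if the classifier accepts it."""
    index: dict[str, str] = {}
    for url, text in pages:
        if is_legit(url, text):
            index[url] = text
    return index


def toy_classifier(url: str, text: str) -> bool:
    """Stand-in for a trained binary classifier (assumption, not a real model)."""
    return text.lower().count("buy now") < 3


crawled = [
    ("https://example.org/notes", "Notes on DOS batch files."),
    ("https://example.com/spam", "Buy now! Buy now! Buy now! Best deals."),
]
index = build_index(crawled, toy_classifier)
print(list(index))
```

Passing the classifier in as a function keeps the crawler code unchanged when the model is retrained or swapped out.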

  • krapp 2 years ago

    Websites that practice SEO, use paywalls, popups or "other annoyances" often also have actual content which should be relevant to a search engine. A search engine that refused to show me Wikipedia or IMDB or any mainstream news site wouldn't be useful to most people.

    Also, tools that attempt to detect ML generated content tend not to work, and will only become less effective over time, as LLMs improve.

tutfbhuf 2 years ago

I think the question boils down to how to stop enshittification. Because if we were to recreate all the enshittified platforms, what would stop them from becoming enshittified anew? We must understand the mechanism of enshittification and fundamentally break this mechanism, making it nearly impossible for a new platform to enter the enshittification cycle again.

  • __jambo (OP) 2 years ago

    I think this calls for a government funded national un-enshittification Laboratory to study the social dynamics of enshittification and develop hygiene practices. Work must be done quickly before enshittification research becomes enshittified!

jinushaun 2 years ago

But SEO will enshittify anything new that replaces it. Sure, Google has done some damage to search, but search quality is bad because of SEO poisoning the well. All we can hope for is old school Google for the first few years.
