Anonymous Source Shared Leaked Google Search API Documents

sparktoro.com

333 points by andrewfong 2 years ago · 311 comments

precompute 2 years ago

This just proves all the "suspicions" privacy-conscious users have had about large corporations fingerprinting users, often in very obvious ways. There's often no better place to find ideas for surveillance than the people conscious about being surveilled.

  • p3rls 2 years ago

    Many of the SEO suspicions were confirmed too.

    I found it VERY amusing that, if you went to r/SEO just yesterday, there were moderators and flaired users (you know, the elites of the SEO community, lol) insisting much of this was "debunked" years ago.

    They of course deleted their posts, but the threads are still up. What a den of scammers over there.

    https://www.reddit.com/r/SEO/comments/1d1eqjj/comment/l5tvfw...

    https://www.reddit.com/user/WebLinkr/

    I love how reddit is turning into the new SEO scam overnight because of this stuff. Great work as always Danny Sullivan!

    • p3rls 2 years ago

      It's just endlessly fascinating to me the grift on rSEO

      How these types first gain moderator status on a few subs and then the spam begins (picture of spam https://pixeldrain.com/u/a6qUPjTq )

      I haven't been able to find a single legitimate expert in the entire sub, and I've checked about every flaired user and moderator.

      You have lots of people like the above, or https://www.reddit.com/user/jesustellezllc/ that claim to run an agency in Fresno California called Ozelot Media, but when you look him up there's nothing. When you google "SEO" + "Fresno California", Ozelot Media isn't even in the top 100 results. Lol, I thought that was the job of a SEO-type? Why let that stop the grift though?

    • phone8675309 2 years ago

      SEO is vandalism and I one day hope the majority of Internet users see that

      • harry8 2 years ago

        SEO is just another form of advertising, with all the costs, benefits and externalities of any other form.

        • sanroot99 2 years ago

          I have experience working with an SEO specialist. Most people in SEO are there because startups pay good money to bring their sites organic traffic by whatever black hat SEO method works; SEO is a good marketing tool for these startups.

        • phone8675309 2 years ago

          Advertising is also vandalism

      • bobthepanda 2 years ago

        Most people are aware but are powerless to do anything about it.

      • tyingq 2 years ago

        Perhaps, though a world without SEO doesn't necessarily surface the best content either. Not everything about Google's algorithm that's subpar is because of spam or SEO.

theolivenbaum 2 years ago

Seems like a lot of it came from them inadvertently posting some internal API to GitHub: https://github.com/googleapis/elixir-google-api/commit/078b4...

  • renegade-otter 2 years ago

    I guess too many people got laid off to do the whole "three reviewers per PR" thing!

    • eru 2 years ago

      When I was at Google (about a decade ago by now), we had two reviews per PR; not three. Could you tell me more about the third review?

  • dontdoxxme 2 years ago

    And it's Apache licensed, which grants a patent license. Some of the comments refer to specific aspects of how page rank is calculated. Pagerank itself is past patent protection but I wonder if this also accidentally might grant licenses to other patents.

    • yencabulator 2 years ago

      There's still an angle where the copyright owner claims that the person who caused this to happen did not have the authority to apply the license to it.

  • ec109685 2 years ago

    Oops, someone’s script was too greedy when uploading those elixir api documents.

xnx 2 years ago

I believe these are the leaked docs: https://hexdocs.pm/google_api_content_warehouse/0.4.0/api-re...

precompute 2 years ago

> My anonymous source claimed that way back in 2005, Google wanted the full clickstream of billions of Internet users, and with Chrome, they’ve now got it. The API documents suggest Google calculates several types of metrics that can be called using Chrome views related to both individual pages and entire domains.

What answer do the engineers at google working on this have for this violation of privacy?

  • GuB-42 2 years ago

    I am not an engineer at Google but this is what I would say if I were.

    We don't know who you are, you are just a number in a database, and we don't even know what number, we just get the total number of visits for each website, not who visited it. It is like counting cars on a highway, not following your car. Plus, it serves the useful purpose of providing you with better search results, the terms and conditions allow it, and it can be disabled.

    • voltaireodactyl 2 years ago

      The obvious response being that counting cars on the highway is a necessary first step on the road to identifying and then tracking their movements.

      Similar to how insurance companies have offered voluntary, “anonymized” data dongles for discounts that are now being used (or at least revealed to be used) to collect data most often used to reject claims.

      • Ferret7446 2 years ago

        Agriculture is a necessary first step toward a dystopian society, so clearly we should ban agriculture.

        The logic does not follow. "A is required for B, B is bad, so A is bad" is not logically valid.

        • voltaireodactyl 2 years ago

          I absolutely agree — my statement was merely one of fact; to get to point B, one must first achieve point A.

          Much like inventing the wheel (positive) is a necessary precursor to both ambulances/fire fighting (also positive), as well as DUIs (less positive).

          The larger point being simply that I find it somewhat disingenuous for those of us who consider it our job to think through problems from many angles to pretend we’re somehow unable to imagine the potential unfortunate consequences arising from our work, given the existing Powers That Be (meaning both entities and trends).

    • lolinder 2 years ago

      > we don't even know what number, we just get the total number of visits for each website, not who visited it

      This is not what a clickstream is. A clickstream requires that the sequence of clicks be preserved, and preserving that sequence undermines anonymity.
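The distinction drawn above can be made concrete with a minimal sketch. The log, the visitor ids, and the `*.example` domains below are hypothetical and only illustrate the two data shapes; nothing here reflects how Google actually stores anything.

```python
from collections import Counter

# Hypothetical visit log in time order: (pseudonymous visitor id, site).
visits = [
    ("u1", "news.example"),
    ("u2", "shop.example"),
    ("u1", "clinic.example"),
    ("u1", "shop.example"),
    ("u2", "news.example"),
]


def aggregate_counts(log):
    """Per-site totals only: who visited, and in what order, is discarded."""
    return Counter(site for _, site in log)


def clickstreams(log):
    """Per-visitor ordered sequences: the order of clicks is preserved."""
    streams = {}
    for visitor, site in log:
        streams.setdefault(visitor, []).append(site)
    return streams


counts = aggregate_counts(visits)
streams = clickstreams(visits)
```

`counts` is the "cars on a highway" view, while `streams` keeps one ordered trail per visitor, which is the part that can narrow down who the visitor is even without a name attached.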

      • tommiegannert 2 years ago

        It can be pseudonymous. It doesn't have to undermine anonymity.

        Google researchers spend time ensuring k-anonymity (for reasonably large k) when using data.
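For readers unfamiliar with the term, k-anonymity means every combination of quasi-identifying attributes in a released dataset is shared by at least k records. A minimal sketch of that check follows; the field names and records are hypothetical, and this is the textbook definition, not a description of any internal Google process.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifiers."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    return all(n >= k for n in group_sizes(records, quasi_identifiers).values())


def suppress_small_groups(records, quasi_identifiers, k):
    """Drop records whose quasi-identifier combination is rarer than k."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] >= k
    ]


# Hypothetical records; "region" and "browser" are the assumed quasi-identifiers.
records = [
    {"region": "NO", "browser": "Chrome", "site": "a.example"},
    {"region": "NO", "browser": "Chrome", "site": "b.example"},
    {"region": "NO", "browser": "Firefox", "site": "a.example"},
]
quasi = ["region", "browser"]
released = suppress_small_groups(records, quasi, 2)
```

Suppression is only one way to reach the threshold; generalizing values (for example, coarser regions) is another, and neither addresses the objections raised in the replies below.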

        • lesostep 2 years ago

          The data doesn't have to be tied to me directly to affect me directly. If my clicks suggest that I'm a wealthy woman, isn't that reason enough to try and connect me with advertisers that try to sell overpriced shoes? Let's downrank all the good deals, surely she can't be interested in 10$ sneakers. If my clicks suggest that I'm from Russia|Ukraine, isn't that reason enough to show me one side of the news more than the other?

          In some way our interests define us more than the social security number assigned to us.

        • BrenBarn 2 years ago

          It doesn't matter if it's anonymous. Some users simply don't want you to keep any record of their activities.

        • mrmetanoia 2 years ago

          Forgive my language but I'd expect people here to understand that's horseshit, they absolutely have enough data and patterning to de-anonymize the data. They spent time making it look anonymous.

          • tommiegannert 2 years ago

            GP explained what a click stream is and said that preserving click sequences undermines anonymity. De-anonymization is not a conclusion you can draw from only mentioning "click stream."

            De-anonymization would require linking that click stream with something that identifies the user, rather than the click stream. Perhaps that exists, perhaps not. GP didn't provide enough material to go that far.

            Exactly because Google has these powers is why they have internal processes to avoid it. Of course there are products that use non-anonymous data, but this idea that everything at Google flows around with user-IDs for everyone to use and abuse is a weird stance. Google has a lot of internal auditing and validation systems when e.g. reading logs and doing feature extraction.

            But I also got way more respect for Google's internal systems after I worked there than before, so I understand your scepticism.

  • raxxorraxor 2 years ago

    That would be money. If someone has another excuse, they are naive or lying to themselves.

    It certainly is not "to improve the net or advertising" - that would be the lying part.

    Google has done some good for the net, but the scales of their contributions slowly but steadily move to the negative side.

    • azemetre 2 years ago

      Reminds me of the studies they’ve done on cognitive dissonance/lying.

      Basically if you believe lies you tell yourself, they tend to turn into truths in your mind over time. Even if you were doing it “ironically.”

  • danpalmer 2 years ago

    Personal (not work-related) opinion: This basically can’t happen with things like DMA and GDPR. DMA in particular means you can’t share data across “products” without explicit consent. So you could for example collect websites that don’t work for the purposes of improving Chrome, but not then share that with the Ads/Search orgs for personalisation or targeting, as far as I understand the legislation.

    Personal opinion about work at Google (still not Google’s opinion): I’m consistently impressed with how seriously this stuff is taken and the amount of work that goes into making sure that things like this sharing can’t happen accidentally, and that user choice is respected. The engineers on the ground are absolutely making sure this all works, and most of us care deeply about user privacy. I have personally worked both on implementing new features that significantly push forward privacy, and on implementing privacy controls for regulatory purposes.

    • BrenBarn 2 years ago

      The thing is that preventing "sharing" isn't sufficient. People who are concerned about privacy don't want any such data collected or stored in the first place, ever. The implicit "sharing" of my data with Google (or whatever company) is a problem in itself. Regardless of how "seriously" Google (or whatever company) takes it, for a lot of the data I don't want them to ever have it in the first place.

      • specialist 2 years ago

        Yes and:

        Require opt-in by default. In all cases.

        All PII data at rest must be encrypted at the field level. Like how passwords should be stored. aka Translucent Database techniques. Not just in transit. Not just encrypting the whole database. But encrypt the actual fields within a database.

        Constitutional privacy means personal sovereignty over oneself. (A superset of the folk definition of keeping secrets.) Meaning any and all data about me is owned by me. Any one using my data for any purpose has to pay me. (See opt-in by default above.)
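The "like how passwords should be stored" idea above can be sketched with a one-way keyed hash applied per field. This is only an illustration under assumptions: the row layout, the `blind_field` and `matches` helpers, and the key handling are hypothetical, and it covers the one-way (lookup-only) variant. Reversible field-level encryption would need a cipher library, which this sketch deliberately does not use.

```python
import hashlib
import hmac
import os


def blind_field(value: str, key: bytes) -> str:
    """One-way keyed hash of a single field (HMAC-SHA256, hex encoded)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def matches(candidate: str, stored: str, key: bytes) -> bool:
    """Check a candidate value against a blinded field in constant time."""
    return hmac.compare_digest(blind_field(candidate, key), stored)


# Hypothetical key; in practice it would live outside the database.
key = os.urandom(32)

# The stored row never contains the plaintext email, only its blinded form.
row = {"email": blind_field("alice@example.com", key), "plan": "free"}
```

Someone who holds the database but not the key sees only opaque digests, while the application can still answer "is this the row for this email?" via `matches`.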

      • troyvit 2 years ago

        > The thing is that preventing "sharing" isn't sufficient.

        Exactly this. It doesn't matter that google doesn't "share" what they gather if they own so many conversion funnels from top to bottom anyway.

      • danpalmer 2 years ago

        This is a fair position to take, but assuming good faith all round, one that I think will typically be a minority. If you ask a user if they're willing to share crash reports only to improve the reliability of the software, I'd bet most people would be ok with this. In fact it's sufficiently reasonable that I believe GDPR allows this to be opt-out, something I broadly agree with. I do think opt-outs should be available, I do think there should be configuration available for those who do not wish to share anything, but if the laws are being met, in the right spirit, then I would hope it would provide little actual benefit.

        • BrenBarn 2 years ago

          > assuming good faith all round

          But why would anyone assume that? I think the position of many privacy advocates is that we're long past the point where it's reasonable to assume Google is acting in good faith in the best interests of its users. (Again, to be fair, this is true of more companies than just Google.)

          • komali2 2 years ago

            Assuming good faith from a corporation is absurd.

            Corporations measure success by one metric and one metric alone: shareholder value. Under our current system, a corporation that doesn't increase shareholder value is considered a bad company. Such a company is punished.

            If Google can increase shareholder value by violating user privacy, and the consequences of getting caught won't reduce shareholder value too much, it's a bad company if it doesn't violate user privacy.

            Of course there are mechanisms that slow this down, like laws and employees trying to follow laws, employee ethics, old guard culture, etc, but all will be defeated one by one for shareholder value.

        • throwaway743 2 years ago

          > If you ask a user if they're willing to share crash reports only to improve the reliability of the software, I'd bet most people would be ok with this.

          You do realize the majority of people are completely oblivious as to why privacy matters as it relates to their data collection.

          It's not that they're willing to do anything. It's that they're passive/apathetic when faced with vague prompts telling them about a matter they don't have insight on, after being bombarded by terms of service agreements, cookie pop ups, etc for years and years.

          > This is a fair position to take, but assuming good faith all round, one that I think will typically be a minority.

          If they were aware of privacy implications / exactly what's being collected on them and how that data is being used, then it's safe to say that they'd be the majority. Can't blame them for not taking the time to read into the matter either, as most outside of tech are wrapped up with a million other hostilities in their daily lives.

          Defend it all you want, but it's just one more unethical thing screwing people over.

        • pseudalopex 2 years ago

          You know most of the data Google collect are not crash reports. Most and all are not the same. And you would get most of the data you wanted if you asked according to you.

          In fact your opinion is not a fact.

          The right spirit is informed consent.

        • thaumasiotes 2 years ago

          That analogy doesn't seem like a match. If you ask a user if they're willing to share every action they ever take "only to improve the reliability of the software", a lot of them are going to say "wait, why would you even need that?"

          If I'm letting you scrape crash dumps and my browser happens to crash in the request where I send my credit card information to xhamster, that's one thing. Odds are that's never happened to anyone. It's another thing for you to guarantee that you're planning to record that information.

    • verteu 2 years ago

      > I’m consistently impressed with how seriously this stuff is taken and the amount of work that goes into making sure that things like this sharing can’t happen accidentally

      I believe the law is violated when it's sufficiently profitable -- it just requires VP permission.

      No public sources for this except Jedi Blue, the old anti-poaching case, etc.

    • noprocrasted 2 years ago

      > This basically can’t happen with things like DMA and GDPR

      I'm sorry but this is just wishful thinking. It might be what the spirit of the DMA & GDPR want but definitely not the reality thanks to inadequate or outright non-existent enforcement.

      There are businesses out there whose entire business model and revenue stream are based on violating the GDPR. Not some kind of internal conspiracy or rogue employee, but the entire company is doing it in the open and the result of its doings (targeted ads or spam) are visible out there in the open for all to see.

      Facebook, credit bureaus, data brokers, "consent management platforms", etc. All these companies' business models are big, obvious breaches of the GDPR. Yet, they are... still alive and kicking?

      There is no chance that a concealed GDPR breach (whether intentional or accidental) will get addressed when the biggest intentional breaches are still allowed to continue out there in the open.

      I suspect something very similar is going to happen with the DMA - Apple is already acting in bad faith but have yet to see any consequences.

  • marcinzm 2 years ago

    > What answer do the engineers at google working on this have for this violation of privacy?

    The same answer you probably have for the millions of questions about the things you do that some other people find offensive to their personal views and beliefs.

  • bdlowery 2 years ago

    How is it a violation of privacy? Did you read the terms of service?

    • precompute 2 years ago

      It's a privacy violation regardless of the ToS.

    • y42 2 years ago

      A tos announcement is not an explicit consent. I doubt that this will help in court, even pre-GDPR.

      • HelloNurse 2 years ago

        Further, a TOS announcement can be easily construed as an admission of intent to fuck users.

    • 9dev 2 years ago

      See, that’s the nice thing about the GDPR: You cannot hide unexpected hostile stuff in the ToS anymore. If you don’t tell me what you do with my data in a way that is obvious, easy to understand, and most importantly easy to disable, it’s illegal.

vouaobrasil 2 years ago

Sometimes I wonder how much better the internet would be if hits on Google weren't directly tied to revenue from Google itself through its ad program. I am certain Google has made the internet and the world a worse place to live.

  • eitland 2 years ago

    As a user of Kagi and search.marginalia.nu I can tell you:

    Quite a bit.

    So much that now that I have what "everyone" asked Google for for years - that is blacklists - I hardly use them.

    Why? Because with Kagi I get much better results out of the box.

    I am fairly sure Googlers will tell me there are multiple safeguards to prevent the inclusion of Google ads from affecting ranking, to which I just have to say that the results speak for themselves.

    Please note: I have only used Kagi for two years. I am only one user. But I am a user with 20 years of experience with Google and that has got to count for something.

    • Nevolihs 2 years ago

      I actually use pinning, blocking and raising/lowering the value of individual sites every day. I wish this were the direction search engines had gone in the first place and it's the direction I hope Kagi continues in. I want a personalized search engine that's personalized by me, not by a company trying to profile me and make money off of my clicks.
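A minimal sketch of what user-controlled pinning, blocking, and raising/lowering could look like as a re-ranking step. The `personalize` function, the weights, and the `*.example` domains are assumptions made for illustration; this is not Kagi's actual implementation or API.

```python
# Assumed multipliers for "raise" and "lower"; real engines may weigh differently.
WEIGHTS = {"raise": 2.0, "lower": 0.5}


def personalize(results, prefs):
    """Re-rank (domain, score) pairs using the user's own per-domain rules.

    prefs maps a domain to one of "pin", "block", "raise", or "lower".
    Blocked domains are dropped, pinned domains come first, and the rest
    are ordered by their adjusted score, highest first.
    """
    ranked = []
    for domain, score in results:
        action = prefs.get(domain)
        if action == "block":
            continue
        pinned = action == "pin"
        adjusted = score * WEIGHTS.get(action, 1.0)
        ranked.append((pinned, adjusted, domain))
    # Pinned entries sort ahead of unpinned ones, then by descending score.
    ranked.sort(key=lambda item: (not item[0], -item[1]))
    return [domain for _, _, domain in ranked]


# Hypothetical engine output and user preferences.
results = [
    ("spam.example", 0.9),
    ("docs.example", 0.6),
    ("forum.example", 0.5),
    ("wiki.example", 0.4),
]
prefs = {
    "spam.example": "block",
    "docs.example": "lower",
    "forum.example": "raise",
    "wiki.example": "pin",
}
ordered = personalize(results, prefs)
```

Because `prefs` lives with the user rather than the engine, two users issuing the same query can end up with different orderings from the same underlying scores.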

      • the_snooze 2 years ago

        When each user can personalize the results themselves, you make SEO completely impractical because they can no longer target a single monolithic algorithm controlled by one entity. Websites would actually have to have organic appeal to users, who get the final say to hide away bad sites from the results page (looking at you, Quora, Pinterest, and Fandom).

      • eitland 2 years ago

        I am all for Kagi keeping that feature. If for nothing else then to rub it in the face of every googler who has argued that it was impossible.

        And if you use it I am happy, that gives Kagi an incentive to keep it around.

        I'm just saying that for me the results are so good out of the box that with a couple of exceptions I never had to block anything.

    • scutrell 2 years ago

      I was excited to try Kagi, but I couldn't justify the cost. I find DDG with the occasional Google search to function almost as well. I'll try Kagi again at some point, but it wasn't the panacea people here made it out to be

    • p3rls 2 years ago

      Kagi is the same garbage as google in my niche. Even worse, maybe. It looks like it's weighing backlinks and SEO garbage even higher. Well done.

      I don't know how people keep talking about it. The results, as you say, speak for themselves.

      • eitland 2 years ago

        Well, don't use it then.

        I am happy for alternatives, otherwise I guess Kagi wouldn't improve so fast in areas I care about.

        • p3rls 2 years ago

          I won't use it. I am pushing back on your belief that it has figured something out about anything, because to me it looks like Kagi takes Google's results and then makes them worse. I have total shit ranking #1 on Kagi and Google. On other pages I am struggling to break into the top 10 where I should be right under wikipedia in a just world.

          I think many people whose lives don't revolve around these things (as in your finances/mortgage are not dependent on google search) get weird magical views about search engines where it's sorta like a modern delphic oracle where the magic works as long as you believe in it. Of course I have the opposite problem where Google is an angry chthonic god to be supplicated/scorned.

          Oh well, I do know I will not be worshipping a new god that cannot even destroy the abominations sitting at the top of the rankings.

          • oooyay 2 years ago

            > I won't use it. I am pushing back on your belief that it has figured something out about anything, because to me it looks like Kagi takes Google's results and then makes them worse.

            Kagi is pretty clear about how they do what they do: https://help.kagi.com/kagi/search-details/search-sources.htm.... You're half right, except that Google is not a vertical that they leverage for their index. Their solution works for me, but I'm also a techy. Anecdotally, my partner uses it, and she is not a techy.

            > Oh well, I do know I will not be worshipping a new god that cannot even destroy the abominations sitting at the top of the rankings.

            The only way this statement applies (to me) is if you can optimize for Kagi. The only way I can think of to do that is organically, but admittedly I haven't put much effort into it because the magic of Kagi's flavor of search is that it puts me in near proximity to information density and sources that I already trust. The real proving grounds for Kagi are a decade out.

            • p3rls 2 years ago

              Lol, it's just complete bullshit marketing on that page. But I do find it fascinating how they obfuscate what is actually happening behind the scenes and cannot come out and clearly say how their technology works, or that they are using google's index.

              >Kagi is known for delivering a unique flavor of high-quality search results, sourced from our own web index (internally named "Teclis") and news index (internally named "TinyGem").

              The marketing at least is pretty good -- I haven't seen this many committed devotees since the iPhone was announced.

              But here on the ground, I can type a few queries and reveal that Kagi is the same shit as Google. Perhaps you are seeing some manually adjusted queries?

              P.S. there are lots of other concerns about Kagi too, I'm just talking about the search https://hackers.town/@lori/112255132348604770

              • eitland 2 years ago

                > But here on the ground, I can type a few queries and reveal that Kagi is the same shit as Google.

                Here on the ground in Norway I just say "lucky you" if Google serves the same results as Kagi.

                My experience is Google expands my search terms beyond recognition and I am absolutely and utterly fed up with starting to read an article about a problem I searched for, getting a result from a trustworthy site, starting to read only to realize Google secretly behind my back expanded my search from "unpopular-ancient-javascript-framework-from-2021" to "currently-popular-js-framework-with-kinda-similar-name".

                My time is valuable. Not having to babysit my search engine is something I happily pay for.

                Could probably get my employer to pay for it too.

                If Google by some miracle has you in an experimental group where they don't mess with your queries, or if you have time to babysit it - more power to you.

                • p3rls 2 years ago

                  Why do Kagi supporters keep insisting I am supporting Google? I save all my best vituperative comments for how Google has destroyed the internet.

                  I am only saying Kagi is just as shitty as Google for my queries and certainly is using their index based off the 30 or so searches I made. I am not in a technical niche however, so maybe they're using something different for that or you saw a manually-adjusted query?

                  But for the regular, everyday people queries, yeah 100% it's just google bro.

                  • eitland 2 years ago

                    Ok, thanks for the clarification.

                    Let me also clarify: I am absolutely not denying that Kagi uses Google's index.

                    I am just pointing out that for some reason when I search with Google I get lots and lots of results for things I didn't search for. For a while I kept them away by sprinkling doublequotes everywhere but the last few years, no amount of double quotes or even the verbatim operator works anymore.

                    With Kagi however not only do double quotes work (for now at least), I usually don't even need to apply them; my queries work just like they used to do in "old" Google back in 2009.

                    That is what I pay Kagi for.

                    I can be totally open that for all I know the reason might be as simple as the fact that Kagi's API access gets results directly from Google's search API without passing through what I think of as a "search broadening" function and without applying A/B testing experiments to the results that come back.

                    I don't care. I get my results.

                    And if someone comes along and presents a better alternative that uses only their own independent index, I'll be inclined to pay even more just because I'd love to see it. (I was a marginalia supporter for months and might become one again in the future.)

          • bitcharmer 2 years ago

            > Kagi takes Google's results and then makes them worse

            And this is how I know you haven't used Kagi at all.

            • p3rls 2 years ago

              > And this is how I know you haven't used Kagi at all.

              I guess the identical search results in my niche are an example of convergent enshittification of both Kagi and Google then

              • bitcharmer 2 years ago

                Ok, if that's the case then I give up. Totally possible for different people to have wildly different experience.

    • abhijat 2 years ago

      I switched to Kagi in June last year. I just realized I tried it initially because I wanted to try out blocking sites in search results, and I have only ever needed to block three domains.

      • eitland 2 years ago

        That's exactly what I am talking about.

        Kagi is kind of like Google in 2009, seriously good coverage, good ranking

        ... but also:

        - more modern

        - more features (summarizer, bangs like in DDG, FastGPT and probably a few I forgot)

        - blocklists for websites (and also options to pin, raise and lower)

        - with actual support: report a bug and you get an answer from a real engineer, a follow up when it is fixed and a shout out in the relevant release notes

        - no tracking

        • elorant 2 years ago

          I use FastGPT quite often although I’m not a subscriber to Kagi itself. For me it’s everything an AI search engine should be. Here is your answer, and here are a bunch of links to research further. Something that works without making the web obsolete. Not like the walled-off garden of OpenAI which often hallucinates links, or Google’s “I threw everything at the wall to find what sticks” effort.

        • dustincoates 2 years ago

          I like Kagi a lot (just look at my comment history), but I'm letting my subscription lapse when it comes time to renew. I've found myself going to Google a lot more often, and I'm finding more and more transparently spammy sites in the Kagi index. Some, for example, are clearly Gen AI created.

          If I were a rich man, I would probably keep my subscription just to support a Google competitor. Alas, I'm not, and so I'll be going back to Google.

          • eitland 2 years ago

            I see your point. I was not always this well paid.

            Did you try blocking the problematic sites and if it didn't work, did you file support requests?

            • dustincoates 2 years ago

              I didn't block them, because they were rarely the same site repeatedly, so I'm not sure it would help.

              I didn't file support requests, either; perhaps I should. I have before, and the team did a great job at addressing what wasn't working.

          • freediver 2 years ago

            Have you reported any of this to Kagi? (things that are reported usually get fixed)

        • nolist_policy 2 years ago

          Is it only me or do these constant Kagi ads on HN sound fishy?

          • ysavir 2 years ago

            I wouldn't necessarily call them fishy, but I am very tired of them. They have a very evangelizing tone. But I think they're ultimately just people excited about the tool they're using and wanting to share it with others.

          • drpossum 2 years ago

            Maybe they're not ads but people who genuinely like the service?

          • tomrod 2 years ago

            I took a screenshot years ago where 10/14 of the viewable top headlines on my screen were positive Google discussion. From an advertising perspective it was all earned marketing (satisfied customers speaking highly).

            While these situations could be a pg-style astroturf submarine, or they could be satisfied customers (the best kind of advertising), I wouldn't necessarily say fishy (you can look at the satisfied users' previous contributions to make that judgment yourself! :)).

            Personally, I've not used Kagi, but I hear positive things from people I trust that use it. So I'll likely try it in the future.

          • epr 2 years ago

            Did we not all evangelize Google in its early days?

            Also, none of these accounts saying nice things appear to be bots or kagi-focused in any way, so I think it's safe to assume they do actually just like it.

            • orangevelcro 2 years ago

              I don't know...my spidey sense has been going off a bit.

              Kagi has a free trial, but you have to pay, which is the difference between it and early Google.

              Of course, now we have Google ads instead, so who knows, maybe not bots.

              • eitland 2 years ago

                Go check my history. Send me a mail and I'll send a photo of the fields and the wind breaks here that you can geolocate.

                I am definitely not a bot.

                I am however extremely fed up with Google. And equally thankful that I have found something that works as well as old Google (or better).

          • JTyQZSnP3cQGa8B 2 years ago

            I never say it but here it is: for the price of 2 packs of cookies, I went from being a 1x programmer to a 1.5x programmer without doing anything. If the results are good, it’s good for me and my job which brings me way more money and satisfaction than $10.

            The alternative is Searx and I may try it sometimes, but so far Kagi is cheap and very efficient for me (C++ coding and other languages).

          • danielheath 2 years ago

            It’s not surprising that folks who pay for a service when there’s a free alternative are pretty serious fans.

          • breakfastduck 2 years ago

            Google Search being a bit rubbish has been in the zeitgeist for a while, it's not surprising that people then talk about an alternative they've found that is much better in their experience

          • mannycalavera42 2 years ago

            kagi is the new crossfit

    • beeboobaa3 2 years ago

      Is kagi good for finding things like old forum posts (not reddit)? I know some of those sites are still up but google seems to ignore them.

      • nalinidash 2 years ago

        Try search.marginalia.nu

        From the website about: "This is an independent DIY search engine that focuses on non-commercial content, and attempts to show you sites you perhaps weren't aware of in favor of the sort of sites you probably already knew existed."

      • eitland 2 years ago

        There is a separate "lens" (think like "images", "videos" and "news" in Google, only there are more and you can create your own) for "Small Web" which only includes what they describe as "Results that favor noncommercial domains and topics".

        (Other standard lenses include

        - Forums

        - Fediverse forums

        - Usenet/archive

        and I think 7 others.)

      • stuffoverflow 2 years ago

        In my experience kagi is decent, definitely gives more forum results than google. I've found yandex to be the best at finding all kinds of forum/discussion sites.

    • packetlost 2 years ago

      I dunno, the first thing I did was blacklist G**ksf*rG**ks from my search results (and others, of course) and I couldn't be happier.

    • esperent 2 years ago

      Kagi is worth the money, but it isn't magic. It's about as good as Google was ~five years ago, before they made all the search operators stop working. There's also a whole bunch of things it's worse at than Google - especially local search and shopping. Plus I still get plenty of blogspam and AI generated crap from Kagi.

      • mozman 2 years ago

        The search operators make a big difference in result quality. I also don’t like how Google now returns zero results for something obscure. In the past I could find something peripheral and eventually get to what I was looking for.

        • orangevelcro 2 years ago

          Exactly! It's so frustrating because the results were usually useful enough to at least point me in the right direction or help me figure out what I was doing wrong to find what I wanted.

    • karma_pharmer 2 years ago

      Kagi is simply reselling google search results.

    • amelius 2 years ago

      How do you know that Kagi won't become as bad as Google at some point?

      • duckmysick 2 years ago

        What's the argument you're trying to make?

        That because there's a chance Kagi will become bad, there's no point in using it now and thus we should stick to Google, which is already bad? That doesn't make sense.

        The same line of thought can be applied to anything. We don't know what will happen in the future, therefore we can't be sure that things won't go bad. Is there even a way to have such a guarantee?

        Assuming I don't want to use Google (because it's bad) and I won't use Kagi or Perplexity or others (because they might become bad), what's the realistic solution? Roll my own search engine? I don't have resources and I don't trust the future me to maintain it.

      • breakfastduck 2 years ago

        We don't, but a model where a user pays for a service rather than being free and ad supported is significantly less likely to enact user unfriendly changes.

        If the way you make money is by convincing people to pay, you are highly incentivised to make the product good, especially where there are many other free competitors who are ad supported.

        • glitchc 2 years ago

          Tell that to cable companies, sports broadcasts and now streaming services. The one incontrovertible fact of modern society is that ads eat everything else. No other revenue source compares.

          • karma_pharmer 2 years ago

            > ads eat everything else

            Well duh. Because it's the only value you're allowed to collect from retail customers without having to deal with either chargebacks or kyc.

            We've deliberately made every other form of micropayment infeasible. It shouldn't be surprising that the only one left is insanely popular.

            • sseagull 2 years ago

              Partly. But also because it's just a way to make more money. If you make X from payments, and you can make Y from ads, then X+Y > X.

              You might lose some people, but probably not many - we've been conditioned to really accept ads everywhere.

      • eitland 2 years ago

        Kagi is worth the price every month. Unlike a number of other things I have supported, it is not an investment made in the hope that it will one day be worth it, but rather a service that I pay a small sum for, which in turn removes lots of frustration from my life every month.

        If they do become bad then at least I have had a fantastic search engine for another few years of my life like I had from 2002 until 2009 ish.

        And also, already at this point, they and marginalia have proven that it isn't impossible to enter the search engine market even now. This was long considered impossible, at least here on HN.

      • spacebanana7 2 years ago

        Also it’s unlikely Kagi will ever become big enough for SEO people to specifically target with manipulative content.

        Even if they got 100 million active paying users it’d still be a tiny fraction of overall search traffic.

        • eitland 2 years ago

          If anything, Kagi has proven that Google's "SEO ruined it" argument isn't valid.

          In the beginning, Kagi was mostly just API calls to Google and Bing AFAIK. Their results were still better for some reason, probably because they didn't have to consider how valuable the ads on garbage sites were before throwing the site out of my results.

          • spacebanana7 2 years ago

            I agree with you to some extent. For example, there's no reason I can think of why Google couldn't allow user selected domain blacklisting or custom bangs.

            But I fear that if a large search engine copied some of Kagi's filters/lenses for non-commercial blogs & forums then SEO agencies would dedicate immense resources to polluting them.

            • freediver 2 years ago

              Small Web is non commercial sites only, by definition. And the list of included sites is manually curated. There is nothing SEO can do to infiltrate it.

        • Kye 2 years ago

          It doesn't have to be big if it's rich.

          People will target Kagi because it's packed to the brim with people who have $5-10/month to spend on a search engine, and likely have $5-10/month to spend on other things too.

        • p3rls 2 years ago

          Why would they? The same manipulative techniques that work in my niche on Google seem to work even better on Kagi

  • Workaccount2 2 years ago

    The fundamental problem with the Internet is that people don't want to pay for things on it.

    No matter what, whatever we ended up with was going to be shitty and exploitive.

    • eitland 2 years ago

      Now you have a chance. Kagi is there.

      I made my decision two years ago and I would probably do it even if it was just on par with Google, to support competition and to avoid supporting Google.

      But in hindsight it is just exceptionally much better. There is no going back unless Kagi does something monumentally stupid.

      • jacob019 2 years ago

        I'm a Kagi user too. I like your enthusiasm, but I can't say it's been all that life changing for me. DuckDuckGo is ok too, I still use it on some machines when I don't feel like logging in. GPT has been more life changing.

        • eitland 2 years ago

          Don't know what I am doing wrong but except the first few times I really didn't get good results with Kagi.

          It might be that I rely a lot on precise queries and doublequoted words and also cannot get myself to use the conversational style that at least Google seems to prefer now.

    • tjpnz 2 years ago

      How much of that is due to ad-tech companies like Google conditioning people into thinking that way? What if online payments weren't so god awful and allowed people to throw in a few dollars as easily as they might at a toll booth? That's still an unsolved problem too. Credit card companies have solidified their involvement in every facet of the process and the alternatives are non-starters for frictionless commerce.

      I'm still happy to put my money where my mouth is and do pay for services which are genuinely useful to me. But this is not the kind of internet I imagined when growing up.

    • L-four 2 years ago

      It's not that people don't want to pay, it's that it's difficult to pay small sums. The web browsers could solve this problem but they make money from ads so it's not in their best interest.

  • wslh 2 years ago

    Google was really great and revolutionary, they helped zillions of small companies to thrive. It was another cycle.

    Then, now, it is like media before the 90s: you need to pay a lot of money to be in the center page of the newspaper.

    But, hopefully we are talking about LLMs now, seems like one of the answers to search engines in general. Beyond AI, I see LLMs as a good evolution from PageRank.

    A bit general, but lately I use the expression "Complexity as Scam". Google always pointed to their "algorithms" and played with this term as if algorithms couldn't be adjusted to whatever you want them to be. Initially the term was sound because it was based on a scientific paper and its subsequent evolution, but it seems the original PageRank idea has detoured from being a "pure" graph algorithm.

    Another context where I use "Complexity as Scam" is Web3. It is like Matryoshka dolls where there is always one more step of complexity to prove a point, but it never ends.

  • benterix 2 years ago

    It's not black and white. There was a lot of junk that was forced on us and that was removed thanks to Google. But I agree the direct relationship is inherently corrupting.

    • GTP 2 years ago

      Larry Page and Sergey Brin even stated very clearly in their original paper that using ads as a revenue source can impact the quality of results returned from the search engine.

  • DarkNova6 2 years ago

    You mean the way Google worked originally? The founders were very careful in creating a barrier between ads and search.

    A barrier whose erosion has been well documented over the last 10 years.

    • vouaobrasil 2 years ago

      A barrier whose only purpose was to establish trust so that it could be later taken advantage of.

      • DarkNova6 2 years ago

        As much of a cynic as I typically am, there is a well established record of events which shows that this is not true.

        Google search was taken over by an ambitious clique of failed yahoo managers that successfully destroyed their former company for their own financial advantage then did the same at google.

        Acting as parasites on society at large.

  • heresie-dabord 2 years ago

    Instead of a semantic Web of knowledge, we got "grep the HTML... with ads".

    • josefx 2 years ago

      You dropped the -v . Modern day Google seems fine tuned to return results that contain everything except for the words I searched for.

  • greg_V 2 years ago

    I mean... maybe, but not really. The first problem of the internet was that there wasn't that much content specifically. The first internet companies were the broadband providers who were developing content themselves, like AOL.

    Google and the ad ecosystem they acquired was basically the flywheel that spurred content creation at scale. Anyone could jump in, follow a few guidelines and earn a living by producing content on the internet. The Youtube acquisition and monetization followed the same pattern.

    Over time the market consolidated and got less and less competitive: fewer platforms with complete control of traffic and one-sided revenue sharing agreements. The guidelines, so to speak, on how content should look and feel were algorithmically made stricter and stricter until everything looks, feels, sounds and reads the same.

    The problem right now is that the platforms are still tightening their grip, and it's all tied to the approach of using AI to replace the content creators on the platforms from Google to Spotify to Meta, and carving the spared money to shareholders. And while the web has been shitty for a few years now, we're now seeing a sudden drop in quality because the average user has no recourse or alternative, and neither does the average creator have the means of distribution and monetization (not just publishing, that's been solved) to even find, let alone meet the new kinds of demand.

    I'm certain that in a few years this will even out: new search engines, new aggregators and new feeds will emerge, but the content - money - network problem triangle remains as a fundamental problem of the internet.

  • linsomniac 2 years ago

    Did you experience the Internet before google? The idea of a world where Alta Vista won is truly chilling.

    • thsksbd 2 years ago

      You mean a world where people still knew how to use a library catalog, still relied on more than one source of information and curious crazy tid bits are still out there?

      The internet is boring. And the trash is still there. It's just become reputable instead.

      • linsomniac 2 years ago

        There's a lot to unpack here...

        Can you expand on how a card catalog improved the world? As a kid I used the card catalog a lot, both the physical version and the later electronic versions. Full text search definitely leads to pulling in information from a wider selection of sources.

        I remember a lot of stratification of news sources pre-Google (which news channel you watched, which papers/magazines you read). Did Google cause reliance on one source of information, or does Google simply exist in a world where people tend towards echo chambers? How would Alta Vista have improved that?

        • jareklupinski 2 years ago

          > how a card catalog improved the world

          i was lamenting the library recently and remembered how books were organized on shelves using the dewey decimal system

          this meant that if you knew of one book that has something you're looking for, you can find all the _other_ books about the same topic right next to the first book, even if you knew nothing about the contents of the related books

          • staticman2 2 years ago

            Libraries still exist and sort books like that, don't they?

            • yencabulator 2 years ago

              I believe the point is that the commenter wishes more things operated like that, where adjacent items were related and likely of similar quality, not just boosted with money and SEO trickery.

            • thsksbd 2 years ago

              They're being ripped down and replaced by coffee shops with plugs and wifi

      • badpun 2 years ago

        > still relied on more than one source of information and curious crazy tid bits are still out there?

        I think the curious crazy tid bits are still there.

    • washadjeffmad 2 years ago

      I'd be okay with a world in which everyone else in search didn't lose, too.

    • msk-lywenn 2 years ago

      In some way, didn't Google become Alta Vista?

      • linsomniac 2 years ago

        How so? My memory of Alta Vista was so-so search results with a top page littered with garbage.

        • eitland 2 years ago

          Exactly like Google since 2011 +/- 2 years (the so-so search results part) and Google the last few years (the littered with garbage part)?

          I have used Google since probably 2001-2002 sometime and a number of other search engines before. It is rather obvious to me and I have writing and screenshots from around the time quality took a dive that supports it.

          • marginalia_nu 2 years ago

            I think over time the story has become that AV and Yahoo were overtaken by Google because the latter had a cleaner design, which if you think about it for more than a moment doesn't really constitute much of a moat.

            • eitland 2 years ago

              Google was also technically better when it released and importantly there was no way to buy ones way to the top.

              Back then I think people mostly agreed that the results that ranked on top in Google generally were there because of criteria that benefited the end user.

              I mean, even as crazy as Google results have been the last few years, they have still managed to be better than Bing and DDG, which both manage to have worse ranking and also, like Google, completely ignore my doublequotes.

        • solardev 2 years ago

          Exactly

        • mastercheph 2 years ago

          Google

    • vouaobrasil 2 years ago

      Yes, I did! I used to use Yahoo search where the results were more hand-curated and people did not create websites for intensive commercial purposes with useless SEO fluff like it is today.

      • linsomniac 2 years ago

        I have been thinking a lot about Yahoo (pre yahoo-search, largely) lately. I don't fully understand how we lost the curated catalog, especially considering the success of Wikipedia. The latter demonstrates users' willingness to curate knowledge bases... We have "awesome" lists, but I rarely seem to use them.

        • xnx 2 years ago

          Yahoo directory was largely pay-to-play and very affected by proto-SEOs trying to game the system for direct traffic or PageRank value. Collaborative directories like dmoz.org suffered the same fate before shutting down.

          • A_D_E_P_T 2 years ago

            Pay-to-play is actually a very decent business model for a directory, though. Pay a one-time fee, establish that you're a legitimate business or website, undergo a quality assessment, and you're in. There's no paying for extra promotion, e.g. with ads, and website owners don't need to warp their sites in anticipation of what they think the search engine wants. (Which often results in a decrease in usability, e.g. all of those recipe sites with low-quality 2000-word essays before they get to the actual recipe.)

            It's a more level playing field, and it's intrinsically more human-friendly.

  • blowski 2 years ago

    I imagine it would be a different flavour to what we have today, but the same intensity. Anything that so deeply penetrates daily life across the globe is going to bring enormous problems with it.

  • 1vuio0pswjnm7 2 years ago

    There is something truly strange about the idea that people "trust" a website operator and can rely on it to provide them with useful information when that same operator is well-known to be secretive, deceptive and dishonest in order to protect its own interests. It's like imagining that a fact witness who tells the truth on some occasions and lies on others is credible.

    https://ipullrank.com/google-algo-leak

nsmog767 2 years ago

I work in search and didn't find anything surprising in here. But that's mostly because I've just assumed Google has been lying for years about many things, such as not using click data or Chrome data.

I've directly seen people who have successfully manipulated search rankings by having logged-in chrome users search for a term, and then click on a given page. Works like a charm (though may not stick once the manipulation is done, unless organic users also prefer it).

ec109685 2 years ago

If anyone is surprised about chrome sending urls to Google, you can turn the “feature” off by unchecking “Make searches and browsing better” in the sync section of Google chrome settings.

Creepy.

  • HenryBemis 2 years ago

    Or, and hear me out, you never use Chrome again, in any platform.. like ever ever again.

    • smegger001 2 years ago

      I only have Chrome installed for a couple of work-related sites that don't display correctly on Firefox. I don't get to choose not to use the work-related sites, and MS Edge likely isn't any safer and also isn't available on my choice of operating system.

  • Terr_ 2 years ago

    "But what if I don't want my own computer to build and share a detailed profile of everyone I know, everywhere I go, all my preferences, and how to manipulate me?"

    "Well obviously it's your fault for not picking the 'Don't Be Cool' option on subpage 27b-6, duh!"

    • ralfn 2 years ago

      Yeah. It's victim blaming. Reminds me of "they should have shouted louder".

      The confusing thing is the crime itself is small on an individual level. The question is: does it add up cumulatively if a small crime is committed against many?

      • juleiie 2 years ago

        A small crime can result in massive power. Knowledge is power.

        Barring the ethics you can single handedly use such data to manipulate stock market, countries etc.

        It’s just too much power

      • kulshan 2 years ago

        I don't know if it's "Victim Blaming"...I teach Digital Literacy courses for seniors new to technology. While I do set them up with Firefox and Ublock, we generally have them use Gmail as they are all Android Devices. Google sends a confirmation email to walk each one of them through their security settings. Of course most users just ignore this email (like I used to have students do) but now we go through it and uncheck this setting in all my courses, and unpersonalize ads as well. Feel like the most basic user who has even the tiniest concern of data privacy should know how to look at their Google Account settings. These are 80 year olds who don't even know what a "click" is but they know to be skeptical of using Google.

        • out-of-ideas 2 years ago

          please also explicitly teach folks to re-visit settings frequently; apps/webui's love to change settings, and often opt-in to new "features". one thing i feel is underrated is the frequency at which those settings change on users for the company's benefit

  • andrybak 2 years ago

    > unchecking “Make searches and browsing better”

    Before that, you can make it audible: <https://github.com/berthubert/googerteller>

  • precompute 2 years ago

    Is that part of Chrome not open-source?

  • noman-land 2 years ago

    Imagine thinking you can escape your abuser by living in their house and asking them politely to stop.

thih9 2 years ago

> Thousands of documents, which appear to come from Google’s internal Content API Warehouse, were released March 13 on Github by an automated bot called yoshi-code-bot

Does anyone know more about yoshi-code-bot and how these documents were suddenly published?

Was it a script misconfiguration? A manual push? Something else?

ilrwbwrkhv 2 years ago

And that's why if a developer doesn't use Firefox and uses Chrome, they are just helping a monopoly take over everything and make a mess.

  • dgellow 2 years ago

    Any user, not just developers

    • olliej 2 years ago

      Developers just replaced IE as the only thing they develop for with chrome, users then _have_ to use chrome because of web developers who only develop for chrome and consider any behaviour other than "it works in chrome" as a bug in other browsers, just as they did with IE.

      Then there's the relentless parade of "alternative browsers" that are just chrome skins - a period IE also went through - that intentionally try to trick people into believing they're not just using chrome but with less security engineering, and more scams.

      • barbariangrunge 2 years ago

        It became trendy recently to break compatibility with Firefox. Blogs almost bragging about how they boldly made the choice. Very embarrassing stuff

        • pseudalopex 2 years ago

          Do you have examples? I would like to see how they talked about it.

          • barbariangrunge 2 years ago

            It’s on hn here or there, or in random blog posts shared in here. Just keep your eyes open for a bit and you’ll see something pop up

      • dgellow 2 years ago

        You’re conflating lots of unrelated things. IE was a horrible browser to support because Microsoft deliberately implemented their own incompatible version of web standards, or refused to implement modern standards. The push to deprecate IE was because it was creating a massive burden. I personally dealt with IE6 support in the corporate world and can attest its deprecation was necessary.

        What you call chrome skins isn’t a thing; people are building software on top of Blink, the rendering engine used by Chrome. The issue here is the risk of ending up with a single rendering engine for the majority of the browser market. A diversity of engines ensures good respect for web standards; that has nothing to do with privacy or security.

        When you say “they just replaced IE”, that was >10 years ago…

        • jiggawatts 2 years ago

          You’re responding to someone complaining about an overly authoritative government by saying that you don’t see the problem, it’s just that the local police force tortures people with downright medieval techniques.

          • dgellow 2 years ago

            What

            • olliej 2 years ago

              While I obviously disagree with your prior comment, I feel "What" is a pretty much perfect comment here. +1.

              What.

              • jiggawatts 2 years ago

                It seems like my analogy was too difficult to follow.

                The point is that the fundamental problem is a company in a monopoly position throwing their weight around and/or keeping their product stagnant “because they can”, which is also a function of power. This is the “authoritarian government”.

                The police part is referring to people thinking of specific IE6 technical issues “as the problem”, when it’s just a symptom of a larger problem.

                Microsoft treated web developers badly because they could. Google will abuse the whole world in the same way now that Chromium has achieved near total dominance.

  • metadigm 2 years ago

    As soon as they add the ability to configure shortcuts, I'd be more than happy to. After several years of requests, we're finally seeing some movement on their end.

precompute 2 years ago

From the article:

Boosting "organic traffic":

- Brand matters more than anything else

- Experience, expertise, authoritativeness, and trustworthiness (“E-E-A-T”) might not matter as directly as some SEOs think.

- Content and links are secondary when user intention around navigation (and the patterns that intent creates) are present.

- Classic ranking factors: PageRank, anchors (topical PageRank based on the anchor text of the link), and text-matching have been waning in importance for years. But Page Titles are still quite important.

- For most small and medium businesses and newer creators/publishers, SEO is likely to show poor returns until you’ve established credibility, navigational demand, and a strong reputation among a sizable audience.

TL;DR: Clickbait + bot farms are the way to go. No wonder the internet is going to shit.

BillFranklin 2 years ago

FYI, it's much easier to read the linked GitHub code via the published docs at https://hexdocs.pm/google_api_content_warehouse/0.4.0/api-re...

  • BillFranklin 2 years ago

    In particular, https://hexdocs.pm/google_api_content_warehouse/0.4.0/Google...

    Notably, for people on HN, it looks like there is indeed an internal initiative to promote small personal blogs :-)

    > smallPersonalSite (type: number(), default: nil) - Score of small personal site promotion go/promoting-personal-blogs-v1

    • SquareWheel 2 years ago

      Well, maybe. It's a factor that a twiddler can influence, but we don't know if that's done positively or negatively. It might also be more conditional, like for specific types of queries.

      For example, a small, personal blog might be great for solving a specific technical problem ("my dishwasher of model XXX has YYY problem"), but might be terrible for something like giving public health advice.

    • iamacyborg 2 years ago

      We don’t know whether that particular module was used to promote or downgrade small sites in the SERPs.

llmblockchain 2 years ago

> GoogleApi.ContentWarehouse.V1.Model.AppsPeopleOzExternalMergedpeopleapiAboutMeExtendedDataPhotosCompareDataDiffData

Java, is that you?!

isaacfrond 2 years ago

Most of the factors in ranking a page are no surprise. But I was surprised that having product reviews on your site is apparently a demotion? Surely, many people are searching to find just that?

  • unnamed76ri 2 years ago

    Years ago I had a site for deep fryer reviews. The whole thing existed to make money from Amazon’s affiliate program. I hadn’t personally used ANY of the deep fryers. Was just writing reviews based on features and other people’s reviews. In short, I ranked high in Google and added nothing of value to the world with that site.

    There was a brief period of time where I made decent money with it until Google deranked all the product review websites.

  • b112 2 years ago

    This is likely more about reviews with affiliate links. 99.99% of those are people reviewing absolutely nothing, just copying reviews and putting their own affiliate link.

  • zeroCalories 2 years ago

    Sites spam low quality product reviews with affiliate links to Amazon. This is done by "reputable" sites as well. I don't blame Google for down ranking this meta.

  • nottorp 2 years ago

    We are, but I’m not sure there are any real product reviews left on the internet.

    • sidewndr46 2 years ago

      Other than reviews of Google search itself obviously

      • nottorp 2 years ago

        Are there? I can't [1] write an objective review, I can just subjectively say that it's been more and more useless to me in the past ... 7-8 years now?

        [1] Or maybe can't be bothered because I stopped caring ages ago.

  • cqqxo4zV46cp 2 years ago

    “xx,xxx five star reviews” I’ve found is a modern day over-marketed product trope. It feels well within the realm of reason that this ends up serving as a useful heuristic.

  • yieldcrv 2 years ago

    I don’t trust conflicts of interest. If that’s about a site selling its own product and having reviews, I’m glad to find that results in a demotion

    While bigger marketplaces have other ways of driving ranking

  • ren_engineer 2 years ago

    most of these have been outright publicly denied by Google employees, despite people showing with A/B tests that things like CTR and backlinks impacted rankings

skilled 2 years ago

I would usually call this a dupe but this article and the other one from SparkToro are completely different even if they are on the same topic.

Haven’t had a chance to look at the API myself but the first impressions are that a lot of this was suspected by SEOs, but Google kept rejecting the ideas. Looks like clicks increase ranking for sure, which means click farms definitely have a legitimate business solution to offer.

JSDevOps 2 years ago

Seriously considering switching back to Firefox after all these years.

  • jasonsb 2 years ago

    What's stopping you? I use both browsers and I see no reason why someone would pick Chrome over Firefox at this point in time.

    • 4gotunameagain 2 years ago

      While the reasons someone would pick Firefox:

        - Privacy
        - Tree style tabs

    • blitzar 2 years ago

      (Some) sites don't work on Firefox.

      Sure it isn't frequent, but it is frequent enough that once a day or so I have to open chrome to do something.

      • elaus 2 years ago

        Seriously curious what sites those are, especially if it's not the same page every day. It literally never happens to me (I've been using Firefox again for 3-4 years) but I mostly browse dev-related websites.

        • TheSalarian 2 years ago

          https://business.apple.com simply throws an Unsupported Browser error when trying to visit from Firefox. Unfortunate, but it's another one of those "gotta do it for work" things I deal with.

        • redblacktree 2 years ago

          One example I often run into: Plaid (the bank-linking company) doesn't work. Just hangs. Though I'll admit it's possible they fixed it. I've been trained to use Chrome when I have to interact with it.

        • komali2 2 years ago

          Some government and hospital websites in Taiwan.

          Some startup websites I applied to.

          Payment portals are a big one. Non-stripe or PayPal. But even there, if it's a new window payment flow, there can be issues on Firefox.

        • nonameiguess 2 years ago

          I have the same problem. I certainly don't use Chrome daily, but do have to keep it around, typically for shopping checkout and a fair number of US government websites don't work on any browser but Chrome.

        • hughesjj 2 years ago

          I've been daily driving Firefox since quantum and up until 2012, but 2fa registration is still needlessly locked out on some sites (need to open up chromium to register the key)

        • dmitrygr 2 years ago

          jlcpcb's site is often broken in firefox, sadly. i keep chrome around just for it.

          • brokenmachine 2 years ago

            Worked for me when I've used it in Firefox. What isn't working?

            • dmitrygr 2 years ago

              Try ordering a PCB or seeing the list of house parts for PCBA

              • brokenmachine 2 years ago

                Not sure where the list of parts is, but I've ordered PCBs using Firefox.

                • dmitrygr 2 years ago

                  I used to as well, sometimes it works and sometimes nothing on the page is clickable. New FF profile does not help, chrome works every time and their support has told me many times "we only chrome support sir"

      • ilikehurdles 2 years ago

        Once a day? That’s huge. What sites? (I've used Firefox daily for about the last year and haven’t had this kind of issue)

      • sangeeth96 2 years ago

        ICYDK, do consider reporting on https://webcompat.com if you see them.

      • Nuzzerino 2 years ago

        Have people never heard of Brave?

    • thisisit 2 years ago

      for now the seamless extension switching using Extensity. I am yet to find an extension on Firefox which can deliver this functionality.

    • metadigm 2 years ago

      No shortcut configuration.

  • GuB-42 2 years ago

    I have used both for many years, and now, I see little difference in practice. I am leaning more towards Firefox these days. Main change is that I now use Firefox as my main mobile browser for ad blocking reasons. A few websites don't work on Firefox, I use Chrome for these few.

    I don't consider it a problem to use two browsers at the same time. I usually don't do the same thing with them, so having separate profiles can be an advantage.

    Note that privacy is not the reason why I am using Firefox. It is just that I think that knowing both is a good thing, and they are both good browsers, so why not? In some cases Firefox is better, in others Chrome is better, and most of the time they are interchangeable.

  • mind-blight 2 years ago

    I've been using Firefox since Chrome forced users to sign in to the browser with their Google account, and I'm quite happy.

    The only time it's a problem is when a site detects Firefox and won't display unless you're using Chrome or IE. I've only seen that a couple of times in the years since I switched back

    • Frank2312 2 years ago

      Even in that case, there are Firefox extensions to change your user agent. Suddenly the app requesting Chrome/Edge works perfectly, even though we are running in Firefox.
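
      A minimal sketch of why that trick works, assuming the site does a naive string check on the User-Agent header. `looksLikeChrome` is a made-up name, and real sites may sniff in other ways:

      ```typescript
      // Hypothetical naive browser check of the kind a site might use.
      function looksLikeChrome(userAgent: string): boolean {
        // Chrome's UA string carries a "Chrome/<version>" token; Firefox's does not.
        return /Chrome\/\d+/.test(userAgent);
      }

      // Typical-looking UA strings (version numbers are only examples).
      const firefoxUA =
        "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0";
      const spoofedUA =
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
        "(KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36";

      console.log(looksLikeChrome(firefoxUA)); // the real Firefox string fails the check
      console.log(looksLikeChrome(spoofedUA)); // the spoofed string passes it
      ```

      The check only ever sees the string, so Firefox sending the second one gets waved through.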

    • kernal 2 years ago

      How did Chrome force you to log in? I've been using it signed out for the longest time.

  • WhyNotHugo 2 years ago

    Firefox is better than Chrome [in the privacy aspect]... but still pretty terrible.

    It sends a lot of "analytics" and "tracking" to some of Mozilla's servers, but if you inspect the requests, those servers are actually behind Google's CDN, and Google does the TLS termination.

    So... Google has access to all the data that Mozilla sends when it phones home. Some of it even has a unique identifying id.

  • Ringz 2 years ago

    I've been using Firefox since the days when it had the other name. Meanwhile, I use Floorp [1], which is based on Firefox, but offers much more possibilities for customization. I am very satisfied, except for the stupid name...

    [1]: https://floorp.app/en/

  • rpgbr 2 years ago

    Go for Firefox and keep ungoogled-chromium[0] for those sites that refuse to work properly on non-Chromium browsers.

    [0] https://github.com/ungoogled-software/ungoogled-chromium

  • garbagewoman 2 years ago

    … just considering?!? What is it gonna take

9dev 2 years ago

I found it interesting that the docs mention "site2vec" scores. This implies, I think, a variant of word2vec or document2vec, but for the full site; so probably a vector sum of the doc2vec scores of all individual pages?
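
A minimal sketch of that guess, assuming every page already has a fixed-length embedding. `PageVector`, `siteVector` and `cosine` are hypothetical names, and nothing in the leaked docs confirms site2vec is computed this way. It takes the vector sum and divides by the page count, so a site doesn't get a bigger vector just for having more pages:

```typescript
// Hypothetical sketch: collapse per-page embeddings into one site-level vector.
// Not taken from the leaked docs; the mean is an assumed aggregation.
type PageVector = number[];

function siteVector(pages: PageVector[]): number[] {
  if (pages.length === 0) {
    throw new Error("need at least one page vector");
  }
  const dim = pages[0].length;
  const sum: number[] = new Array(dim).fill(0);
  for (const page of pages) {
    if (page.length !== dim) {
      throw new Error("all page vectors must share one dimension");
    }
    for (let i = 0; i < dim; i++) {
      sum[i] += page[i];
    }
  }
  // Divide the vector sum by the page count so site size does not scale the result.
  return sum.map((x) => x / pages.length);
}

// Cosine similarity, e.g. to compare a single page against its site vector.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

A page whose cosine similarity to its own site vector is low would be the off-topic one, which is presumably the kind of signal such a score could feed.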

HankB99 2 years ago

> Successful clicks matter.

I wonder about this. If I click a link and read it and I find that it's garbage (e.g. got ranked based on SEO rather than useful content) does it count as a successful click? Worse yet, some of these sites have blatant errors that are only discovered after examination.

This is relative to technical subject matter. Other searches, such as shopping may not suffer this kind of problem (or I have not noticed it.)

I also wonder how Google knows a click is successful. If I open a link in another tab, does the browser tell Google how long I lingered on the site? Perhaps Chrome does but I use Firefox.

  • EcommerceFlow 2 years ago

    Once you get to the top 1-3 results, CTR (click through rate) is a much bigger ranking factor. Google knows how long people stay on pages and whether they click and back out immediately. This is important for E-Commerce, because Google doesn't want Site #1 to be mostly out of stock even though they have better links.

    • HankB99 2 years ago

      > Google knows how long people stay on pages and whether they click and back out immediately.

      What if I <ctrl><click> to keep the search page open and open the "found" page in another tab?

      • yencabulator 2 years ago

        Can the on-page javascript detect the difference between click and control-click? If so, you can count just the former, and wait for the back button press, to get a sense of visit duration.

        I think control-click is a power user feature that they just don't care to track. Average consumer is the target audience of the advertising...
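
        For what it's worth, a click handler can see the modifier keys: the DOM `MouseEvent` exposes `ctrlKey`, `metaKey`, `shiftKey` and `button`. A minimal sketch, where `ClickInfo`, `opensElsewhere` and `dwellMs` are made-up names and nothing here is claimed to be what Google actually runs:

        ```typescript
        // Subset of the DOM MouseEvent fields that matter here (hypothetical helper type).
        interface ClickInfo {
          ctrlKey: boolean;
          metaKey: boolean; // Cmd on macOS
          shiftKey: boolean;
          button: number; // 0 = primary, 1 = middle (delivered via "auxclick")
        }

        // True when the click would normally open the link in a new tab or window,
        // leaving the results page open.
        function opensElsewhere(ev: ClickInfo): boolean {
          return ev.ctrlKey || ev.metaKey || ev.shiftKey || ev.button === 1;
        }

        // Rough dwell time for a same-tab click: the gap between leaving the results
        // page and coming back to it. Returns null when no return was observed.
        function dwellMs(leftAt: number, returnedAt: number | null): number | null {
          if (returnedAt === null || returnedAt < leftAt) {
            return null;
          }
          return returnedAt - leftAt;
        }
        ```

        In a browser, a real `MouseEvent` from a `click` or `auxclick` listener could be passed straight to `opensElsewhere`, since it has all four fields.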

badgersnake 2 years ago

Something like this I guess:

var words = query.split(" ")

var results = executeQuery("SELECT al.url, al.desc FROM AdWords aw INNER JOIN adlinks al ON aw.id = al.id WHERE aw.word IN (?)", words)

if (results.size < 30) { /* todo: call search engine */ }

return results

ilyazub 2 years ago

It doesn't look like a leak but a misdeployment.

Same service wrappers from two years ago: https://github.com/googleapis/google-api-php-client-services...

usui 2 years ago

> Prior to the email and call, I had neither met nor heard of the person who emailed me about this leak. They asked that their identity remain veiled

And yet the journalist included a screenshot with one of the weakest blurs I've ever seen... Why would you not excise the person's video portion completely? What good does it serve to have it included in the story? Even if that portion is faked, why would you offer potential signals like skin complexion, hair color, background picture, etc.? Why...

  • mtlynch 2 years ago

    The author is Rand Fishkin, who's not a journalist. He's the founder of SparkToro and Moz, both companies that provide tooling and analytics for SEO.

    I haven't looked deeply into Fishkin's companies, but I wouldn't expect either to be on the user's side when it comes to privacy. Both companies seem to monetize clickstream data and personal information from users who probably didn't give informed consent.

    If the source was trying to get this information to a responsible journalist who cares about privacy, I have no idea why they'd approach a company (not even a news organization) who seems to fund the erosion of user privacy.

    • phs 2 years ago

      > Both companies seem to monetize clickstream data and personal information from users who probably didn't give informed consent.

      I don't think you know what you're talking about. During Rand's tenure Moz was a subscription business selling access to marketing analytics tools. Those tools focused on the structure of the clients' sites themselves rather than any analytics they might have consumed.

      Source: I worked at Moz for several of those years, and helped maintain those tools.

    • yencabulator 2 years ago

      And since then, the person on the call has revealed their identity. This was an SEO bro talking to an SEO bro about something they found on Github, not an insider leak.

  • krackers 2 years ago

    >weakest blurs I've ever seen

    Isn't this the same type of "swirl" blur that Interpol was able to reverse even 10 years back? With advancements since then you're basically handing evidence on a silver platter.

  • txomon 2 years ago

    To make it worse, he made clear when the call had happened, and you have: 1) Who was in the call 2) When the call happened 3) A blur instead of a complete black out

    I'm not sure I would feel safe reporting stuff to journalists nowadays.

  • roastedpeacock 2 years ago

    That also struck me as odd. And seemingly a violation of journalistic best-practices of protecting sources. I sure hope this was done with consent of the anonymous source.

  • Control8894 2 years ago

    It's a fake background.

    It's also clearly from Google Meet so... yeah. If he was worried about retribution (from Google, anyway) then they probably wouldn't have been using a Google service.

adrianvincent 2 years ago

The algorithm is probably so complex and bloated at this point I doubt even Google knows how it really works

adamgordonbell 2 years ago

Where is the link to the document?

zarathustreal 2 years ago

Hopefully this doesn’t surprise anyone..if Google actually told us correct information about how the search algorithm works it would be abused immediately

pembrook 2 years ago

What I find most interesting about this is that a lot of supposed "smart" algorithms of Big Tech are in fact a patchwork of "dumb" rules and human-picked winners. This would explain why the quality of search results is failing to keep up with developments in LLMs.

This also explains why it's impossible for incumbents to unseat the winners in many search categories -- because they've literally been picked as the winners by humans at Google.

Looking at my Twitter/X feed, I also see an oddly similar dynamic. Certain accounts appear to have been manually boosted, showing up all the time -- whereas others posting even the same exact content will never appear.

Silicon valley will loudly tell you all about how wonderful they are at "democratizing," however, if you look under the surface it appears they're just hand picking the winners.

alun 2 years ago

Maybe this is an unpopular opinion, but if a search algorithm is truly designed to showcase the best content, then making it transparent shouldn't lead to manipulation

8note 2 years ago

For those out of the know, what's a "crap" in this? A "crap crap"?

throwaway743 2 years ago

... why the hell would an anonymous source use google meet to share info on google? ... so much for remaining anonymous :/

jgalt212 2 years ago

> A sample of statements from Google representatives (Matt Cutts, Gary Ilyes, and John Mueller) denying the use of click-based user signals in rankings over the years.

renegade-otter 2 years ago

There are so many Kagi fans on HN that it's a matter of time before the Big G buys it and shuts it down, like hundreds of its products before.

SadCordDrone 2 years ago

Didn't read article fully, but - since it's protocol buffer definitions, what if these fields are there for backward compatibility?

Havoc 2 years ago

Does it also recommend eating at least two stones a day?

StevenNunez 2 years ago

Wait... There's Elixir to be done at Google?!

dentemple 2 years ago

TL;DR Google lies about how its search algorithm works.

  • eitland 2 years ago

    Would be interesting to see if any relevant authorities could be interested now that this is out?

    I understand some of this is a direct contradiction of things they have said in court previously?

Aldipower 2 years ago

If there are really 14,000 attributes, most of them will have a weight near 0 and thus be irrelevant. If they were all heavily weighted, the ranking would be rendered meaningless by the sheer number of attributes.
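
A toy sketch of that argument, assuming a plain linear score (sum of weight times attribute value). The leak contains no weights and does not say ranking is linear, so `score` and `influential` are purely hypothetical:

```typescript
// Hypothetical linear scorer: score = sum over i of weights[i] * features[i].
function score(weights: number[], features: number[]): number {
  if (weights.length !== features.length) {
    throw new Error("weights and features must have the same length");
  }
  let total = 0;
  for (let i = 0; i < weights.length; i++) {
    total += weights[i] * features[i];
  }
  return total;
}

// Indices of attributes whose absolute weight is at least `threshold`;
// everything else contributes next to nothing to the score.
function influential(weights: number[], threshold: number): number[] {
  const keep: number[] = [];
  weights.forEach((w, i) => {
    if (Math.abs(w) >= threshold) {
      keep.push(i);
    }
  });
  return keep;
}
```

With weights like `[0.9, 0.001, -0.5, 0]` and a threshold of `0.1`, only the first and third attributes survive, which is the "most of them are irrelevant" case.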

  • beejiu 2 years ago

    Isn't that where deep learning comes into play?

  • ozehlaw 2 years ago

    Yes, this makes sense. I think the only good thing from the leak for Google is that the scoring values are not present
