I'm shadow banned by DuckDuckGo and Bing

daverupert.com

269 points by stilldyl 3 years ago · 204 comments

beej71 3 years ago

I have exactly this problem. Beej's Guide to Network Programming is indexed just fine. Beej's Guide to C won't index.

The automated tool says it's in violation of some unnamed rule, but I can't figure out which. There's zero SEO, tracking, or ads, and the content is educational and G-rated.

All the other guides index just fine.

I asked for a review and they came back with the same ambiguous message. Eventually I just gave up.

Recently I split the C guide in two. I'll have to check to see if that made any difference.

But it left a bad taste, and now I don't trust Bing or DDG to provide complete results. Google's overrun with spam, but at least my stuff actually shows up on Startpage.

  • KRAKRISMOTT 3 years ago

    It is interesting since Beej's guide is probably the most famous C/Unix programming tutorial after K&R.

  • et-al 3 years ago

    Could it be due to some overly zealous prude filter and the unfortunate coincidence that "beej" is American slang for blowjob (oral sex)?

    • beej71 3 years ago

      I've definitely considered that. (I'm well-aware of my nickname's connotations, and I just don't care. :) )

      But they index everything else on my site and don't prude out over that...

    • michaelmrose 3 years ago

      Actually, outright porn is indexed, you know, or so I hear.

      • SonOfLilit 3 years ago

        Indexed, but only returned if the search query was explicitly and unambiguously about porn.

        • frickinLasers 3 years ago

          Not true, I've had pages of x-rated results with SafeSearch on, with DDG and Bing, a couple of times. The search only needs to be sufficiently weird.

          see: https://news.ycombinator.com/item?id=31334670

          • SonOfLilit 3 years ago

            Where "weird" involves words like "licking" and "party" (not saying it's not a bug, just that it's a statistics vs actual language understanding bug in a feature and not the absence of that feature). I bet there's no way to compose all of the words "spatula", "serotonin", "pion" and "deconstruction" along with words like is/a/an/of/how/what/when that would turn safe search off, even though any query of this format would be pretty weird.

      • irrational 3 years ago

        Porn is one thing, the word Beej is quite another. I wish this was a joke.

    • Johnny555 3 years ago

      That's pretty obscure slang, I'm American and have never heard it.

      • beej71 3 years ago

        I used to say, "'Beej', like 'B.J. Hunnicutt' from MASH."

        But MASH is getting a little too far removed these days. :)

      • TeMPOraL 3 years ago

        Took me a while to figure it out, but yeah - seems "Beej" is pronounced as "(Bee)(j)", which matches the pronunciation of "BJ". But I don't think it's relevant to site indexing, unless search engines started to take homophones into account.

        EDIT: but maybe they did, ever since voice assistants became a thing?

      • alar44 3 years ago

        It's very common.

        • Johnny555 3 years ago

          Is it very common?

          If I search "Beej" on Google without Safesearch enabled, I get 14.3M results, if I turn on the Safesearch filter, it still returns 14.3M results.

          If I repeat the same experiment with "blowjob", it's 1.5B results vs 23M.

          If I search for "Beej" on Bing with SafeSearch Off, I get 2,840,000 results, while with Safesearch on Strict, I get 2,800,000 results. I couldn't search for "blowjob" at all with Safesearch on Strict.
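The comparison Johnny555 makes above comes down to what share of results survives the SafeSearch filter. A minimal sketch of that arithmetic, using only the counts quoted in the comment (the function and variable names are illustrative, not from any search API):

```python
# Result counts quoted in the comment above: (SafeSearch off, SafeSearch on).
counts = {
    ("Google", "Beej"): (14_300_000, 14_300_000),
    ("Google", "blowjob"): (1_500_000_000, 23_000_000),
    ("Bing", "Beej"): (2_840_000, 2_800_000),
}


def surviving_share(off: int, on: int) -> float:
    """Fraction of results that remain once the filter is switched on."""
    return on / off


shares = {key: surviving_share(*pair) for key, pair in counts.items()}

for (engine, term), share in shares.items():
    print(f"{engine:6} {term:8} {share:.1%} of results survive SafeSearch")
```

On these numbers the filter leaves essentially all "Beej" results in place (100% on Google, about 98.6% on Bing) but only about 1.5% of the results for the explicit term, which is the point of the comparison. Reported result counts are rough estimates, so the shares are indicative at best.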

  • cyberpunk 3 years ago

    Wow it's beej. I owe you rather a lot of beers, Guide to Network Programming is directly responsible for my entire career. Sorry for the low value post! :}

    • beej71 3 years ago

      Entire career? That's pretty high praise... Well, if you're ever in Bend, OR drop me a DM and I'll start to collect. :)

  • mprime1 3 years ago

    My hero! Thank you for your work.

    (Yes, not adding much insightful conversation. I don’t care if I get downvoted.)

  • richardjam73 3 years ago

    It seems to show up for me. Perhaps it is fixed now.

    https://duckduckgo.com/?q=c+guide+stdalign&t=ffab&ia=web

    brings up your guide as the 6th result.

    • beej71 3 years ago

      I have to add "beej" to the search, and then it hits my C Library Reference Guide. But the C Tutorial Guide is nowhere to be found. :(

  • daflip 3 years ago

    i assume you have >0 backlinks to the site?

    • beej71 3 years ago

      Some rando backlink checker says it's about 1300.

    • lelanthran 3 years ago

      > i assume you have >0 backlinks to the site?

      I always wondered about this. What exactly is a backlink, and why should I need one?

      • vgel 3 years ago

        Backlinks are links to your website used by algorithms like PageRank^1 to weight how important your site is. Roughly speaking, more links, especially from sites with their own high pagerank = higher rank.

        [1]: https://en.m.wikipedia.org/wiki/PageRank

      • sethhochberg 3 years ago

        Some other website linking to your website. The general premise is that if other sites with some known reputation are linking to you, you have a bit more credibility than a completely isolated site that nobody else references.
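The backlink weighting vgel and sethhochberg describe can be sketched with the textbook power-iteration form of PageRank. This is a simplified illustration only: the graph, the site names, the damping factor of 0.85 and the iteration count are all assumptions, and real search engines combine many more signals.

```python
def pagerank(
    links: dict[str, list[str]], damping: float = 0.85, iterations: int = 50
) -> dict[str, float]:
    """Power-iteration PageRank; links maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline, then collects shares from its backlinks.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: several sites link to "guide"; nobody links to "isolated".
graph = {
    "blog": ["guide"],
    "forum": ["guide", "blog"],
    "wiki": ["guide"],
    "guide": ["wiki"],
    "isolated": ["guide"],
}
ranks = pagerank(graph)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:9} {score:.3f}")
```

In this toy graph "guide" ends up with the highest score because it has the most backlinks, and "wiki" comes second because its single backlink is from the highly ranked "guide". "isolated" links out but receives nothing, so it stays at the baseline.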

crazygringo 3 years ago

I may have found the answer, and I've seen this before (it happened to me once). It's when a different spam site copies your content wholesale, and a search engine decides they're the "original" site, and you're the spammy copycat.

Because if you put the headlines (in quotes) from two of his recent articles into Bing, e.g. either "Megan Smith explaining the General Magic prototyping process" or "Denialists, Alarmists, and Doomists", both point as their first result to a URL starting with "https://www.scien.cx" which seems to be the spam site with a copy of each article. (The URL isn't loading right now, however, when I try to visit.)

How to fix it really depends on what techniques they're using to mirror your site, of which there are many.

Example search and resulting URL:

https://www.bing.com/search?q=%22Megan+Smith+explaining+the+...

https://www.scien.cx/2022/12/25/megan-smith-explaining-the-g...

Compare with Google getting it right:

https://www.google.com/search?q=%22Megan+Smith+explaining+th...

https://daverupert.com/2022/12/megan-smith-general-magic-pro...
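The check described above (searching for a headline as a quoted exact phrase and seeing which domain comes first) is easy to repeat for any article. A small sketch that only builds the search URLs; the two endpoints are the ones visible in the links above, while the function name is made up for illustration and nothing here fetches or parses results:

```python
from urllib.parse import quote_plus

# Search endpoints as they appear in the links above.
ENGINES = {
    "bing": "https://www.bing.com/search?q=",
    "google": "https://www.google.com/search?q=",
}


def exact_phrase_url(engine: str, headline: str) -> str:
    """Build a search URL that queries the headline as a quoted exact phrase."""
    return ENGINES[engine] + quote_plus(f'"{headline}"')


headlines = [
    "Megan Smith explaining the General Magic prototyping process",
    "Denialists, Alarmists, and Doomists",
]

urls = [exact_phrase_url(engine, h) for h in headlines for engine in ENGINES]
for url in urls:
    print(url)
```

Opening each URL by hand and comparing the first result's domain against your own is enough to spot a copycat outranking the original.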

  • gary_0 3 years ago

    Recently on HN there was "Someone is proxy-mirroring my website, can I do anything?": https://news.ycombinator.com/item?id=33952114

    It feels like the Internet is a more hostile place than ever for small-time websites. You get squeezed from below by wily criminals, and crushed from above by careless megacorps who want to filter out anything that doesn't make them money.

    • irrational 3 years ago

      Nowadays, when I search for things, the results are often clearly pages that have come from a program scraping sites and then merging them into one page. You can tell because the pages are not really coherent and quickly start to repeat themselves. I assume they are getting money through ads on the pages, though I never actually see the ads because of my blockers. I wish there was a button in the browser that I could click to report the page as spam to all search engines.

      • 411111111111111 3 years ago

        > I wish there was a button in the browser that I could click to report the page as spam to all search engines.

        The spammers would use that offensively to destroy the original content creators with that system

        • CamperBob2 3 years ago

          The way it should work is that the spam reports are used only to filter the results you see.

          • TeMPOraL 3 years ago

            That's precisely why I started to use and pay for Kagi - it fetches search results from Google and Bing, but lets you prioritize, deprioritize, pin or block specific domains in your search results.

            I'm still surprised no one else seems to be offering this feature.

          • 8n4vidtmkvmk 3 years ago

            Seems reasonable. If enough human-like users (with gmail accounts, yt activity, and other indicators) ban a particular site enough times that should offer some evidence that it's low quality too.

      • LAC-Tech 3 years ago

        uBlacklist is really good for that. Just click "block this site" in the search results.

        https://chrome.google.com/webstore/detail/ublacklist/pncfbmi...

      • MandieD 3 years ago

        I don't think it reports to any other search engines, and am not sure it affects any but your own subsequent results, but Kagi has a feedback scale on each result: "block", "lower", "normal", "raise", and "pin".
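CamperBob2's idea of using reports only to filter your own results, combined with the five levels MandieD lists, can be sketched as a purely local re-ranking step. This is an assumption-laden illustration, not how Kagi or uBlacklist actually work; the domains are placeholders and the function name is invented.

```python
from urllib.parse import urlparse

# Sort order for the per-domain levels; "block" is handled by filtering.
ORDER = {"pin": 0, "raise": 1, "normal": 2, "lower": 3}


def personalize(results: list[str], prefs: dict[str, str]) -> list[str]:
    """Drop blocked domains and reorder the rest by the user's own preferences."""

    def level(url: str) -> str:
        return prefs.get(urlparse(url).hostname or "", "normal")

    kept = [url for url in results if level(url) != "block"]
    # sorted() is stable, so results at the same level keep the engine's order.
    return sorted(kept, key=lambda url: ORDER[level(url)])


# Hypothetical result list and preferences (placeholder domains).
results = [
    "https://spam.example/copied-article",
    "https://other.example/page",
    "https://original.example/article",
]
prefs = {"spam.example": "block", "original.example": "pin"}
personal = personalize(results, prefs)
print(personal)
```

Because the preferences never leave the user's own result list, this design avoids the abuse that 411111111111111 raises: a spammer blocking the original site only affects the spammer's own searches.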

    • m-i-l 3 years ago

      > It feels like the Internet is a more hostile place than ever for small-time websites. You get squeezed from below by wily criminals, and crushed from above by careless megacorps who want to filter out anything that doesn't make them money.

      The problem is that the two work hand-in-hand, thanks to the advertising driven search model, and the search engines owning the main advertising platforms.

      It should be easy for search engines to identify an original site from the SEO spammer rip-offs - the original site is going to have no adverts (or certainly fewer) while the SEO spammer copies are going to be covered in adverts. The problem is that the search engines have no incentive to do so, in fact if anything they have the incentive to send people to the sites with more adverts.

      And of course the whole problem has been created by the search engines in the first place - there would be no point in SEO spammers making advert-laden ripoff sites if it wasn't to rake in advertising revenue.

    • kshacker 3 years ago

      No more hostile than real world, we are just finding out it is a reflection of our world, of course the difference being the global interconnectedness which magnifies the celebrities but also the crooks.

      • psychoslave 3 years ago

        Well, it seems to me that non-electronic outcomes tend to be slower and harder to copy at scale than digital material, don't they?

  • lapcat 3 years ago

    It's not clear that this is happening with all of the (many) sites that are mysteriously deindexed by Bing. See my comment: https://news.ycombinator.com/item?id=34389279

  • TrueGeek 3 years ago

    What should you do when another site copies your content like this then?

    • O1111OOO 3 years ago

      > What should you do when another site copies your content like this then?

      Have we gotten to the point where websites (and their content) need to be verified like Twitter, Instagram, Facebook, and TikTok do for personal accounts?

      If so, will search engines be the ones verifying - using this as a new revenue scheme (with the dangers inherent in this... ie; pay to be listed or ranked higher)?

      • lapcat 3 years ago

        I actually did add a Bing authentication XML file to my site, but Bing deindexed me again anyway.

        Bing is the problem. It's broken.

        • linmob 3 years ago

          Can confirm, authentication files don't help. Bing is seriously broken, and their support is anything but helpful. This is an excerpt from one of their replies:

          > "Thank you for your patience during our investigation. After further review, it appears that your site did not meet the standards set by Bing to remain indexed the last time it was crawled. To ensure that this was not a false flag, I also escalated the issue to our Product Team and they manually reviewed your site and confirmed that it is in violation of our Webmaster Guidelines detailed here:

          https://www.bing.com/webmaster/help/webmaster-guidelines-30f....

          We are not able to provide specifics for these types of issues but we recommend that you review our Webmaster Guidelines, especially the section Things to Avoid, and thoroughly check your site for any deliberately or accidentally employed SEO techniques that may have adversely affected your standing in Bing and Bing-powered search results."

          Before snarking, please check that link and the long lists of things - I did not find my website https://linmob.net to be violating their "things to avoid" list.

          That was a reply to my first ticket requesting re-indexation, later tickets only got what I would call "non-replies".

      • edgyquant 3 years ago

        No, we’ve gotten to the point where Bing and DDG need to be disrupted. The answer is that these companies are ruining things, not that DNS and simple search are wrong.

        • thewebcount 3 years ago

          I’ve moved from DDG to Kagi.com. It’s paid, but no ads and no incentives to remove legit sites that I’ve seen.

      • posix86 3 years ago

        No, the problem here is Bing & co.; their algorithm is clearly not good enough.

    • crazygringo 3 years ago

      Various things from addressing the problem directly if possible (block the IP address range they use to scrape your content with, insert JavaScript to strip the content client-side depending on the domain it's being served from), to changing their search engine behavior (canonical meta tags, contact the search engine to let them know, build up links on the web to make your site higher ranked).

      The more sophisticated and popular the copycat site is (scraping from a distributed network, stripping most HTML tags, etc.), the harder it becomes, and the only option is to contact the search engine and hope they can manually mark your domain as the authoritative one. Your success may vary according to your popularity/importance.
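The first technique crazygringo mentions, blocking the IP range a scraper uses, reduces to a membership test against a list of networks. A minimal sketch using Python's standard `ipaddress` module; the ranges are reserved documentation networks standing in for a real scraper's addresses, and where the check is wired in (web server, firewall, application) is left open:

```python
import ipaddress

# Hypothetical scraper ranges: reserved documentation networks,
# not addresses taken from this thread.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if the client address falls inside a blocked range."""
    address = ipaddress.ip_address(client_ip)
    # An IPv4 address is never "in" an IPv6 network and vice versa.
    return any(address in network for network in BLOCKED_NETWORKS)


for ip in ("203.0.113.7", "198.51.100.7", "2001:db8::1"):
    print(ip, "blocked" if is_blocked(ip) else "allowed")
```

As the comment notes, this only helps against simple mirroring; a copycat scraping from a distributed network will not sit in one range.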

      • EarlKing 3 years ago

        > strip the content client-side depending on the domain it's being served from

        That would almost certainly be regarded as a "doorway page"[1], resulting in you getting manually ranked downward or even deindexed entirely.

        [1] https://developers.google.com/search/docs/essentials/spam-po...

        • crazygringo 3 years ago

          You may have misunderstood, it doesn't have anything to do with doorway pages which are about presenting extra highly redundant content.

          I'm talking about, if content on legitimatesite.com includes JavaScript that detects if it's being loaded on any other domain, then erase the entire article's HTML from the DOM.

          Obviously this is easily defeated by stripping out JavaScript, so it's useful only for very primitive mirroring.

          • EarlKing 3 years ago

            Hmmm, yeah "doorway page" probably wasn't the term I was looking for. However, that would almost definitely be regarded as some sort of spam or SEO tactic by crawlers and lead to further penalizations, which was my point.

    • gkbrk 3 years ago

      You send a DMCA.

      • pungentcomment 3 years ago

        Chances are that the site will be hosted outside the US and the DMCA is a US only copyright law afaik. I don't think it's applicable outside the US.

        • colejohnson66 3 years ago

          In those cases, you can send one to DDG and Bing, which are in the US. It won’t affect the actual website or other search engines, but it’s better than nothing.

          • emurlin 3 years ago

            Indeed you can. I've had similar issues before (a site scraping content, though apparently from dev.to instead of my own domain and a YouTube channel making a 'video' with text and TTS from a post).

            In the first case, I sent a DMCA to Google & Bing as well as to Cloudflare. Cloudflare responds by giving the name of the actual host, and I sent another DMCA to that host (they were US based, otherwise YMMV). The content was delisted (not the site, even though it was made up entirely of verbatim scraped content) from search engines and from the site.

            Bottom line is you can send a DMCA notice to search engines and it appears to be effective. Actually, in case search engines demote sites like this in some way, I would send the DMCA notice to search engines _first_, because if the content gets removed from the original site they may not be able to verify the duplicate content.

          • bkirkby 3 years ago

            Ianal, but I think ddg could come back and claim that their service doesn't copy the content or that their abstract is fair use, so a dmca wouldn't apply to them.

            If that happens I'd use language about the exclusive "public display" right that you have over your work.

            I'm not aware of this claim being used for a dmca, but I'd like to see how such a claim turned out.

            • colejohnson66 3 years ago

              IANAL either, but the DMCA has been used (by the RIAA, MPAA, and friends) to take down pirate content on Google. I’d assume the same arguments would apply here. It can’t hurt to try.

              • yrro 3 years ago

                Linking to a site that infringes copyrights may make the linker liable on the grounds that they are committing contributory copyright infringement. It's cheaper for search engines to delist sites than fight that battle--particularly if their corporate masters also rely on licensing media for distribution themselves.

    • StreamBright 3 years ago

      This is exactly a perfect use case for a blockchain. In fact if people are interested we should create a POC.

  • 0cf8612b2e1e 3 years ago

    How do all of those Stack Overflow mirrors stay up if there is a mechanism to pull the copy-cat?

supermatt 3 years ago

This is not a shadow ban. A shadow ban is when it appears to you that you are not banned, but from others' perspective you are.

  • tasuki 3 years ago

    Yes. Unfortunately people increasingly use "shadow ban" to just mean "ban", perhaps it sounds cool?

    • dack 3 years ago

      When you first hear a term used, I think it's natural for people to just try to figure out what it means in context without looking up the official definition (I've caught myself doing this subconsciously before).

      Imo a logical interpretation of "shadow ban" would be when you are banned but they didn't tell you they banned you, and regular "ban" is when they tell you you were banned. It makes enough sense that people don't think they need to look it up to confirm.

      edit: funny enough, I did double-check the wikipedia page to make sure my understanding was correct, but upon reading further it does acknowledge the expanding of the definition: https://en.wikipedia.org/wiki/Shadow_banning

      • rspoerri 3 years ago

        Shadow banning is when they are doing extra steps to prevent you from realizing that you are banned. Usually giving you the impression of a working service, while everybody else will not be able to see your contributions.

        Banning is just stopping the service for you; whether they tell you actively or not depends on the service. No search service actively informs you about the usage of your data, nor do they tell you that they stopped servicing you.

      • edgyquant 3 years ago

        This… this is not a logical interpretation of shadow ban, at best it’s a random guess. Interpretation implies some understanding of the data

        • garfij 3 years ago

          What makes "invisible to everyone but you" a more "logical" interpretation of "shadow ban" than "banned but didn't tell you"? Are shadows invisible to everyone but you?

          • gardenhedge 3 years ago

            When a user is (normal) banned, they are normally told and their access/ability is restricted.

            When a user is shadow banned they are normally not told they are banned and are still able to access and perform functions. I think this secrecy is where the word "shadow" comes in. The user is in the dark about the ban...

          • edgyquant 3 years ago

            I didn’t say they were (though I think so.) I said it wasn’t the more obvious “logical” interpretation.

    • cactusplant7374 3 years ago

      Elon wants to redefine the term as well but for the purpose of a coordinated witch hunt. People like using shadow ban because it sounds more malicious vs. content that is no longer actively promoted by a company. It's hard to claim you're a victim if the reason is you're just not that interesting or popular.

      • ilyt 3 years ago

        There is a difference between "content that is no longer actively promoted by a company" and "the company explicitly put your content on a no-show list". Sure, it's not shadow banning but "just" stealth banning (as you're still not informed that you got banned, or of the reason for it), but banning nonetheless.

        • cactusplant7374 3 years ago

          Is it even a ban? Your followers can still see it. People who don't follow you but go to your profile can still see it. Elon says no one has the right to "freedom of reach" but never describes this concept as a ban.

          • Jensson 3 years ago

            If you get delisted from search and don't show up on exact match searches then I'd say that they banned you in some way. That is what Twitter does and what Bing did here. We call what Twitter does "shadow ban" since they still show you to yourself, to people subscribing to you and people with a direct link, but nobody else can find you, it isn't a total shadow ban but they are doing something very similar to a shadow ban.

            What shall we call that instead, "bubble ban", since it is like a shadow ban for specific bubbles and for everyone else it is as if that bubble never existed on the site?

            • 3np 3 years ago

              > What shall we call that instead

              Deindexed/delisted. Same as we called it before "shadow ban" was widely known as a term.

              • Jensson 3 years ago

                Delisted would imply that you wouldn't see your listing yourself, or that when you try to post a new listing they would stop you. Shadow banning is when you post to a listing, it says that your listing was successfully posted, but it wasn't really posted.

                Twitter does that: it doesn't tell you that what you post won't be reached by the people you are posting to. Most people don't have many followers, they just reply to tweets and those replies will show up for the original tweeters. Twitter shadow banning you means that the items you post no longer show up as responses. Sure the small subset that follows you can still see them, but 99.99999% of twitter won't see it, so it is 99.99999% of a shadow ban.

                If they told you that any of this happened anywhere it wouldn't be a shadow ban.

                • 3np 3 years ago

                  > Delisted would imply that you wouldn't see your listing yourself, or that when you try to post a new listing they would stop you.

                  It does not. Delisting = removed from list. Deindexed: removed from index.

                  Banning implies denying access. There's an important distinction there with Twitter (user authenticates and publishes through Twitter) vs Google Search (indexes public sites). If Google silently stops showing your Google Ads and doesn't inform you, that would be more appropriately described as "shadow banning".

                  There's no "user" or "account" in a web search engine so the term doesn't really fit.

            • michaelmrose 3 years ago

              There is no such thing as a total or less than total shadow ban. If you are visible to anyone with a direct link, you aren't shadow banned.

              Bubble ban is also poor verbiage because not appearing in anyone's feed or search results is the default condition not inherently a punishment or redaction.

              A search is inherently a selection process and it's perfectly valid to say some content isn't fit to appear anywhere in a listing.

              I like reddits choice of "quarantined"

              • Jensson 3 years ago

                > I like reddits choice of "quarantined"

                It isn't the same at all, since Reddit tells you about it and it still shows up in search etc. Shadow quarantined, yeah that works. The word "shadow" comes from not telling the user that anything is different, and Twitter doesn't tell you when this happens, it just delists your posts from everywhere except your followers.

                Anyway, it is extremely disingenuous to say that Twitter doesn't shadow ban.

                Consider this scenario: Person A is shadow moderated by Twitter and has a friend B. A replies to B's tweet, but his friend doesn't follow him, they just talk. And now B will never see A's tweet, and A has no idea that B can't see it, and neither does B; both think that they can see each other. What is this if not "shadow banned"? Twitter makes these people post things thinking they will be seen, but they won't, wasting their time and potentially hurting their mental health since nobody responds.

                For all intents and purposes this is "shadow banning", when you hurt people like this but say that you absolutely don't shadow ban you are so dishonest that I'd still call it a lie.

              • cactusplant7374 3 years ago

                Reddit also shadow bans new accounts. You won't realize it until you open your user profile link in incognito.

        • bogwog 3 years ago

          That's not a ban at all. Hobby Lobby refusing to sell dildos is a ban, but deciding not to display unpopular merchandise in advertising isn't a ban.

      • extheat 3 years ago

        Well if you are silently placed on a hide list (by manual human intervention without you being able to know without a third party), then by all means that is a shadow ban.

        Edit for clarity.

        • colanderman 3 years ago

          No, a shadow ban specifically is designed to mislead you into thinking you are not banned, to delay you attempting to work around the ban.

          Regardless of whether you were notified, if you can see that you are banned, it is by definition not a shadow ban.

          • rvnx 3 years ago

            In this case, a shadow ban would mean that if the owner of this website searched for his site on Bing/DDG, it would appear normally to him, but would be hidden from everyone else.

        • ilyt 3 years ago

          That would be a silent ban (if you're not informed that you are banned, or of the reason for it), or just a plain ban.

          A shadow ban is explicitly "pretend to the user that they are not banned, but don't show their content to anyone else".

          Like, say, getting shadow banned on reddit or HN: you will see your stuff when you're logged in, but anonymous or other people won't.

          The search equivalent would be you seeing your own site when you search while nobody else does.
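The distinction argued over in this sub-thread can be written down as a visibility rule. This is a sketch of the definitions given by colanderman, rvnx and ilyt, not of any platform's real moderation code; the names are invented, and the plain ban is simplified to "hidden from everyone".

```python
from enum import Enum


class Moderation(Enum):
    NONE = "none"
    BAN = "ban"                # the author can see that the content is gone
    SHADOW_BAN = "shadow ban"  # the author still sees it, nobody else does


def is_visible(author: str, viewer: str, state: Moderation) -> bool:
    """Whether `viewer` sees content posted by `author` under `state`."""
    if state is Moderation.NONE:
        return True
    if state is Moderation.BAN:
        return False  # hidden from everyone, the author included
    # Shadow ban: everything looks normal, but only to the author.
    return viewer == author


for state in Moderation:
    print(
        f"{state.value:10} author sees it: {is_visible('dave', 'dave', state)}, "
        f"others see it: {is_visible('dave', 'someone', state)}"
    )
```

Under this rule the Bing/DDG case is a plain ban: the site owner searching for his own site gets the same empty result as everyone else, whether or not anyone told him.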

        • causality0 3 years ago

          "Shadow ban" would be a good term to use for that if it didn't already have a meaning that was different. But it does.

          • extheat 3 years ago

            Hence why I said silently. If I’m posting something and it appears to me as usual on my feed but not others, by all means that’s a shadow ban.

            • InspiredIdiot 3 years ago

              Nope. Still wrong. I understand there is some way it feels like that from your perspective but DDG and Bing don't own your feed. So they are hiding nothing from you. They are 100% ~up-front~ (edit) consistent about not choosing to show your site (ban it) and the fact that they don't control your feed doesn't make it any more accurate to apply the word "shadow" to their ban.

              • InspiredIdiot 3 years ago

                I have become what I hate most. Someone who didn't read the article. I've heard too many instances of the term being misapplied and jumped to a conclusion based on the discussion.

                If you use Bing webmaster tools (a logged-in account for the use of the domain owner/content creator) and you can see indications that Bing indexed the content and no indications of errors preventing it from being eligible to show, then it is certainly at least closer to a shadow ban than I originally thought.

                Still, any reference to a "feed" entirely misses the point unless Bing is the one also serving that feed. I can't see any evidence that Bing displays a feed to the poster.

      • paulcole 3 years ago

        He’s kinda Trump-esque in his willingness to just redefine words (and basically pretend the original use never existed) to his own benefit and have his followers jump on board. Another great Elon example of this is Tesla’s “Autopilot.”

        • jimmygrapes 3 years ago

          I really don't think the willingness to redefine words willy-nilly is a behavior I would associate first with Trump or Elon. Using the wrong word and insisting it means the same thing, maybe, but certainly not redefining words.

        • philippejara 3 years ago

          Trump-esque? what words has trump even done that for? Changing the meaning of words to fit what you want has been happening long before the printing press even was invented by people in power, vying for power, reporting on things they have a biased interest in or just wanting a quick win on an argument. Not everything needs to be a trump analogy.

          • paulcole 3 years ago

            Yeah I guess you’re right here. It’s more like they both go beyond ignoring the real world and just say whatever nonsense they feel like. Then they have a bunch of simps who won’t even question it.

          • ceejayoz 3 years ago

            Trump popularized the term “fake news”, generally meaning “real news I don’t like”.

            • andrewflnr 3 years ago

              I only ever see it used either (a) sarcastically or (b) to describe actual falsehood. I don't think he was successful in redefining it.

            • lokar 3 years ago

              And whatever covfefe is

            • ryandrake 3 years ago

              People forget: for a very, very brief moment during the 2016 US presidential campaign, the problem of actual "fake news" was in the spotlight. Literal fake news, where a pseudo-news site would put up stories that were entirely fictional, to promote a political POV, and present them as news. Trump swiftly neutralized the term by co-opting it and using it, as you say, to mean "real news I don't like". But for just a moment in time, it was originally used to describe actual fiction posing as news.

  • rapnie 3 years ago

    Is there also a term for being algorithmically suppressed on a social media platform, I wonder? I.e. a much more subtle, harder to detect mechanism whereby the algorithms ensure you get some exposure, but never the same as other unsuppressed people would get based on similar activity. Or only exposure to a certain limited subset of the graph based on some metrics (e.g. just your 'friends' so no one points out you are effectively shadow banned).

    • Marazan 3 years ago

      Yes, it's called Algorithmic Suppression.

      That doesn't sound as cool and victimey though.

      • rapnie 3 years ago

        Could call it a RoboBlock in popular language, with reference to a roadblock. An obstruction to engage enforced by our machine overlords. Or better maybe, a RoboGag.

      • spoiler 3 years ago

        How about Algorithmic Oppression

    • colanderman 3 years ago

      "Automatically downranked/downweighted/penalized" are terms I've heard.

    • hutzlibu 3 years ago

      I think unlike in the case here, (soft) shadow banning would be appropriate to describe it, even though not 100% technical correct.

    • cma 3 years ago

      I believe Musk calls it shadow banning when condemning old Twitter for doing it, and deboosting when he promises new Twitter will do it.

  • badrabbit 3 years ago

    That is what is happening: you think you are not showing up because of bad SEO or better results. You have to find out through experimentation that you are restricted. The moderator didn't let you know that they have taken punitive action against you.

  • seanhunter 3 years ago

    Indeed. Some people seem to be trying to make it apply any time the top result of an algorithmic ranking isn't what they think it should be (eg if they have been deboosted rather than shadow banned).

  • JadoJodo 3 years ago

    I suspect OP used it in the way that “I believe all my posts are showing up in the places that they should, but unbeknownst to me, they are being suppressed.” In this instance, I could see a “search engine shadowban” being an appropriate moniker.

    • InspiredIdiot 3 years ago

      I think a good test of whether this application of the term makes any sense is: Could any search engine ban ever not be a shadow ban? We already have a term: ban. Let's just use that one and stop conflating things and being unnecessarily imprecise and incendiary. It helps certain parties' (edit plural possessive) agenda but does not help us clearly communicate.

  • ffhhj 3 years ago

    The author is extending the concept to include an inadvertent ban. Why would Bing warn him anyways, since there is no user account? Welcome to Cancelbannia.

  • Dylan16807 3 years ago

    I think being banned from search could be part of a shadow ban, but when the entire service is search that's just a ban.

    • mkl 3 years ago

      If it was a shadow ban he would see his site when he searched but we wouldn't. This is definitely not that.

      • Dylan16807 3 years ago

        Are you talking about normal sites? Not search engines? If so I don't really agree. I don't think you have to go through that much effort as part of implementing a shadow ban. If posts show up to the user in most places, then good enough it's a shadow ban.

        Only because this is a search engine do I say this solidly isn't a shadow ban.

        • mkl 3 years ago

          Well, I don't understand the distinction you're making, as search doesn't seem special here to me. The key feature that makes something a shadow ban is that everything looks normal to the shadow-banned person, but the rest of us don't see their stuff or see it in some restricted way. That seems like it would apply equally to search, forum comments, etc.

  • travisgriggs 3 years ago

    Clearly we need a new term. I nominate

    Ghost Banned

    or

    Ghostdexed

donatj 3 years ago

This triggered me to DuckDuckGo my own site and immediately I notice the top result is someone rehosting my OSS on a page loaded with pages of crap SEO content.

Scrolling further, I don’t seem to find my own site either… https://donatstudios.com

I’ve added my site into Bing webmaster tools, we’ll see if it helps I guess.

  • HomeDeLaPot 3 years ago

    Wow. I think I've used your circle generator on that other site without realizing it. That sucks. I've bookmarked yours now!

    • donatj 3 years ago

      A bunch of sites popped up hosting it loaded with SEO garbage and ads. I’d licensed it MIT so while they certainly can, it sure doesn’t feel nice.

      • esperent 3 years ago

        I've written and freely shared a lot of OSS code and I plan to continue doing that. All MIT licensed.

        But I keep thinking that this is a limitation of MIT. It's written with the purest of good intentions but without any way to prevent bad actors from exploiting that.

        I wonder if there's a better kind of license that provides a close level of freedom but would prevent some of the most obvious exploitations (e.g. packaging and selling the code, or reposting on a site with advertising)?

        There's copyleft licensing like LGPL but perhaps that goes too far in the other direction and besides, it seems to be very unpopular amongst web developers so I'm afraid if I release any LGPL code it won't get used much or attract contributors.

        Is there a happy medium between these?

        • PeterisP 3 years ago

          Every open source license will by definition allow packaging and selling the code or reposting it on a site with advertising. Permitting that is one of the key freedoms; if you're forbidden to use the code for commercial purposes, then that's not an open source license. See https://opensource.org/osd

          MIT will allow that, Apache/BSD/Mozilla licenses will allow that, GPL and LGPL will allow that, Creative commons CC-BY and CC-BY-SA will allow that - the only difference is extra conditions e.g. GPL will require the seller/redistributor of the code to keep the same license, CC-BY will require leaving attribution to the author, etc, but all of them will allow someone else to redistribute the code for commercial purposes.

        • bcrosby95 3 years ago

          I'm not a lawyer, but my understanding of LGPL is that it wouldn't prevent people from doing what they're doing in this scenario.

        • bo1024 3 years ago

          One of the Creative Commons licenses like cc-by?

      • tinus_hn 3 years ago

        Same thing happened with StackExchange and Wikipedia, but they appear to have fixed it.

mtlynch 3 years ago

>One “out there” reason I can think is that I use Amazon Affiliate links on my Bookshelf and my /Uses page and that triggers a shadow ban?

It's probably not the reason, but it's worth noting that the author is using Amazon affiliate links in violation of Amazon and FTC rules because they're not disclosing the fact that they profit from purchases through their links.

Per Amazon:

>Anytime you share an affiliate link, it's important to disclose that to your audience... you must (1) include a legally compliant disclosure with your links and (2) identify yourself on your Site as an Amazon Associate with the language required by the Operating Agreement.

https://affiliate-program.amazon.com/help/node/topic/GHQNZAU...

Per FTC:

>As for where to place a disclosure, the guiding principle is that it has to be clear and conspicuous... Consumers should be able to notice the disclosure easily. They shouldn’t have to hunt for it.

https://www.ftc.gov/business-guidance/resources/ftcs-endorse...

mananaysiempre 3 years ago

For some reason, Beej’s Guide to C Programming is also banned from Bing (and consequently DDG) [1], with the standard robotic non-explanations given when the author asked, even though the rest of the site is not.

[1] https://beej.us/guide/bgc/whynoddg.html

  • Pelam 3 years ago

    I can find Beej.us with DDG, but not daverupert.com. Maybe Beej got the problem resolved somehow.

    • mananaysiempre 3 years ago

      He says in the link it’s specifically the C guide, the rest of the website is fine. Though... yeah, DDG queries like “beej c guide strlen” give reasonable results for me, if with an unjustifiably high-ranked position for the mirror at http://docs.hfbk.net/beej.us. Bing ones only include the mirror and the other guides (and a Scribd-hosted PDF copy, of all things, as the first result below a huge navigation card referring to https://beej.us/guides but without the C guide among the links).

      • projektfu 3 years ago

        Incidentally, what is the "legal status" of Scribd hosting a partial preview of works that they tell you are BY-NC (attribution/non-commercial use only) and telling you to become a member to be able to view the whole thing? Is that not a commercial use?

      • eps 3 years ago

        https://beej.us/guide/bgclr/ is the 2nd result for "beej's c", with first being site's homepage.

        • beej71 3 years ago

          Ah, I was just wondering if splitting the book in two volumes (900 pages is more than Amazon can print!) would impact this.

          But it still doesn't index the first volume...?

          Thanks for the info.

pseudolus 3 years ago

You've indicated that you've used Bing's tools to see if your website has been indexed but are silent as to whether you've actually manually submitted your site to be indexed by Bing using their url submission tool [0]. If you do submit the URL and then, after a decent interval, your site still doesn't show up then there might be something to your claim.

[0] https://www.bing.com/webmasters/help/url-submission-62f2860b

  • Liquix 3 years ago

    To be fair, a decade-old SFW blog with 2.2k crawler hits ought to be automatically indexed by any major search engine.

  • beej71 3 years ago

    I'm not the OP, but I have a site with the same issue, and I did manually submit. No impact.

lapcat 3 years ago

See "Bing and DuckDuckGo removed my business web site" https://lapcatsoftware.com/articles/bing.html and "My website disappeared from Bing and DuckDuckGo, Part 2" https://www.jessesquires.com/blog/2022/07/25/my-website-disa...

[EDIT] I just published a new blog post "Bing and DuckDuckGo removed my business web site AGAIN" https://lapcatsoftware.com/articles/bing2.html

Sigh.

mg 3 years ago

Looking around his page, I see some interesting aspects on his atom page:

https://daverupert.com/atom.xml

First, he sends it with a "content-type: application/xml" header. In contrast to most sites that send it with "content-type: application/atom+xml". Which seems to have the nice effect that it renders in Firefox instead of opening the usual "What should Firefox do with this file?" popup.

Secondly, he provides this nice header text "Yahaha, you found me! This is my RSS feed.". It seems to be fetched via this part of the code:

<?xml-stylesheet href="/pretty-feed-v3.xsl" type="text/xsl"?>

Pretty nice. Are those best practices? Or will "content-type: application/xml" mess with users who have a native feed reader installed and expect the reader to kick in when they click on a feed url?
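
For what it's worth, the stylesheet hookup is just an XML processing instruction at the top of the feed, so it's easy to check for. A minimal sketch in Python (standard library only); the sample feed string and the helper name `find_stylesheet` are made up for illustration, and only the `xml-stylesheet` line is copied from the feed:

```python
import re
from xml.dom import minidom, Node

# Hypothetical, trimmed-down feed. Only the xml-stylesheet line comes from the real one.
SAMPLE_FEED = """<?xml version="1.0"?>
<?xml-stylesheet href="/pretty-feed-v3.xsl" type="text/xsl"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
</feed>
"""


def find_stylesheet(xml_text):
    """Return the pseudo-attributes of the first xml-stylesheet instruction, or None."""
    doc = minidom.parseString(xml_text)
    # Processing instructions before the root element are children of the document node.
    for node in doc.childNodes:
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # node.data holds the raw text: href="..." type="..."
            return dict(re.findall(r'([\w-]+)="([^"]*)"', node.data))
    return None


print(find_stylesheet(SAMPLE_FEED))
```

This only shows that the instruction is present and where it points. It says nothing about the content-type question, which depends on how the browser or feed reader handles the response.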

  • rzzzt 3 years ago

    XSLT is a template language where you match snippets of XML using XPath expressions and output... anything else, in this case HTML using templates that can make use of the attributes, inner content, etc. of the captured XML portion.

    "An elegant weapon for a more... civilized age."

    • Semaphor 3 years ago

      > "An elegant weapon for a more... civilized age."

      One of the ideas people had back then was having a product catalog in XML that your tool would give you, and that you could upload to your website and display as a nice website via XSLT.

      It was such a fascinating technology when I learned about it, but I’m not sure if it ever saw any serious use? Maybe in enterprise?

      • baald 3 years ago

        Well, I'm having the ... pleasure ... of learning it now, as much of CheetahMail's templating system uses it (at least as initially configured by their professional services team. The company I work for just switched over and had them do the initial implementation). Just a data point.

        • Semaphor 3 years ago

          We are using a mail system that uses a special version of VbScript. Not sure you’d want to trade ;)

      • antod 3 years ago

        It had some use. I'm pretty sure IE used to support client side transformations when rendering XML that linked an XSLT file. I know one SaaS used to do that before they rearchitected their front end.

        But XSLT was not fun to work with.

      • jbotdev 3 years ago

        It’s still used by the SEC to render company filings on their EDGAR site. I can’t find the exact files/docs, but I’ve rendered filings like this using the XML and XSLT they provide: https://www.sec.gov/ix?doc=/Archives/edgar/data/320193/00003...

      • mileza 3 years ago

        I don't know if that's related to your question, but I've used XSLT templates with Apache FOP before to render PDF documents.

    • leni536 3 years ago

      I would never call XSLT elegant, it is horrible IMO.

      • masklinn 3 years ago

        XSLT may well be the worst possible implementation of an elegant underlying concept.

marginalia_nu 3 years ago

I got you, fam.

https://search.marginalia.nu/site/daverupert.com

sct202 3 years ago

You can submit a ticket to Bing via their webmaster tools website. I've done it in the past and a real human did respond at the time. In my experience Bing will straight deindex full websites for unknown reasons, while Google will penalize your ranking but leave you searchable if their algo feels you deserve it.

  • linmob 3 years ago

    Did you manage to get re-indexed? I did not, and after the first attempt, replies to my tickets became quite rare.

henriquez 3 years ago

Did you perhaps abbreviate Microsoft as M$ (with a dollar sign)? That really pisses them off.

ducklingquack 3 years ago

Had the same problem due to negative SEO campaigns by naughty competitors. Wrote to Bing Webmaster Tools Support team (https://www.bing.com/webmasters/help/webmaster-support-24ab5...) and after a lengthy process got a response that ”the issue” had been addressed.

It’s been a few months since and my website is indeed back in the search results so I advise whoever is having this problem to reach out to Bing.

vcg3rd 3 years ago

Your site doesn't get a hit, but it's referenced enough to find it. One hit was https://www.seoaudit365.com/domain/daverupert.com

It says your IP doesn't direct to your site. I wonder if that's the problem.

  • tambre 3 years ago

    It would be quite surprising if reverse records were the reason. The vast majority of sites certainly don't have them pointing back. Impossible to do with many providers and probably basically all CDNs.

Joel_Mckay 3 years ago

Negative SEO scammers use intentional search-policy violations to push down the rank of perceived competitors for a few weeks.

While it is more likely the poster will get a few people to check on the situation and naively drive up page rank... a personal site is just a rounding error for traffic in a long-tail distribution known as the modern web.

Most search engines will correlate user-side telemetry traffic against crawler and web stats. i.e. if the bots tend to prefer your site for abnormal reasons, the ranking algorithm may blacklist a signature, domain, and IP sets for several weeks as punishment.

Note too, it is still common for a human employee to manually check a suddenly popular site that pops up out of obscurity. i.e. this catches the more sophisticated cheats, and may have legal repercussions in severe cases.

In summary, if you mess with modern search engines, then expect the ban hammer to fall eventually. ;)

TomK32 3 years ago

The W3 HTML Validator is complaining about a few things, not sure which one could make bing think the website is not worth indexing, but it's worth a shot fixing those issues. https://validator.w3.org/nu/?doc=https%3A%2F%2Fdaverupert.co...

omgmajk 3 years ago

At the bottom of the page there is a comment that says "Some results have been removed", but unlike Google you can't see them. Would be interesting to know if the domain is among the removed results or if the site has just not been indexed yet.

dazc 3 years ago

Sites get de-indexed from DDG and Bing all the time because negative SEO attacks (intended for your Google traffic) have a devastating effect.

Why? Bing does not totally ignore bad links.

The good news, all sites I've seen affected by this recover after a few weeks.

badrabbit 3 years ago

HN taught me first hand on how horrible shadow banning is. You all tolerate this here so it's mighty hypocritical of you to criticize Bing.

I'll say it again: It comes down to how you treat people. Treat others the way you want to be treated. No one wants to be shadowbanned and we can all agree it is a decidedly cowardly and cruel thing to do.

And you can't use "quality" or anything short of being coerced as an excuse. Techniques and technologies to moderate people without shadowmodding at scale are not just there but very well established. A site for technologists has no excuse to shadowmod other than elitism and amorality.

  • maxbond 3 years ago

    HN's "shadow bans" aren't hidden and allow you to keep posting. Isn't that more tolerant than most other forms of banning?

    I have showdead turned on. I see fresh accounts that are automatically dead on each comment which are legit contributions, probably because they're using Tor or a widely abused VPN; that's the only common miscarriage of HN moderation I regularly see, and I vouch for those comments. Those accounts should be in the clear after a week or something like that. I have a couple comments I feel shouldn't be dead, but I can see how others would feel differently, and I have I believe 3 dead comments out of >3000 (many of which expressed views others vocally disagreed with, and I generally feel my views are not particularly popular on HN). But most of the dead comments I see are obviously harmful to discussion. The last time I saw hate speech from a banned HN account was earlier today. What is it I'm missing here?

    It's all well and good to say, treat others as you'd like to be treated. But I don't want to be harassed either. So I forgo harassing people sure. But what's to be done about the people harassing me?

    Are you perhaps unaware of a phenomenon called the paradox of tolerance where, if you extend universal tolerance to everyone, including those who use their speech to silence others (through threats, harassment, shouting over people, poisoning the well, etc), you still end up with a forum in which not everyone can share their ideas?

    • badrabbit 3 years ago

      Dead comments are not shadowbanned, just banned. I wasn't referring to that. And the problem with shadowmoderation is it happens without attempting to tell the person to change behavior so people who will cooperate don't get a chance to do that. If someone does not cooperate and acts with bad or false intentions you should ban them explicitly. I never suggested universal tolerance I only suggested transparent moderation.

      • maxbond 3 years ago

        The only people I've seen get banned without warning are transparent trolls (usually people who make an account to make each comment, with that comment being trolling or hate speech, because they understand that that will be banned). People who aren't purely trolling do seem to get a public message from dang. Does your experience differ? How?

  • badrabbit 3 years ago

    Look at my post history and karma lol. I get exactly 3 karma and it gets hidden. I should do a line chart and post it somewhere outside HN lol.

RandomWorker 3 years ago

https://daverupert.com/robots.txt

Maybe change this? Simply add:

User-agent: *
Disallow:

To allow all crawlers to the site.
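
A quick way to sanity-check what a rule like that does is Python's standard `urllib.robotparser`. This is only a sketch of how one parser reads the rules, not a claim about how Bing's crawler behaves; `example.com`, the `bingbot` agent string, and the `allowed` helper are placeholders:

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent="bingbot", url="https://example.com/some/page"):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Four variants to compare (all hypothetical file contents).
empty_disallow = "User-agent: *\nDisallow:\n"        # empty Disallow
explicit_allow = "User-agent: *\nAllow: /\n"         # explicit Allow
block_all = "User-agent: *\nDisallow: /\n"           # blocks everything
only_comments = "# User-agent: *\n# Disallow: /\n"   # every rule commented out

for name, text in [
    ("empty Disallow", empty_disallow),
    ("explicit Allow", explicit_allow),
    ("Disallow: /", block_all),
    ("only comments", only_comments),
]:
    print(name, allowed(text))
```

By this parser's reading, a file with no active rules is treated the same as an explicit allow-all, so adding the block may not change anything for a crawler that follows the same convention.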

kordlessagain 3 years ago

No mention of robots.txt on the post, nor here in the comments.

After a bit of poking around, it would appear Bing may need an allow block to crawl. I don't know what DDG does, but the author's site effectively has nothing in the robots.txt file, other than a commented out Disallow block. From doing this before in the past, I suggest include the following:

  User-agent: *
  Allow: /

Twirrim 3 years ago

Okay, so this prompted me to look at Bing Webmaster Tools. It looks like they recognise my sitemap file, but haven't bothered to index it in nearly a decade. "Last processed: 6/6/2014". I know I don't post new content that frequently but... WTF?

Anon4Now 3 years ago

The background image doesn't render on the homepage in Firefox. This makes the blog links appear as very light blue text on a white background. Makes me wonder how it renders for the crawler and whether it's getting flagged for invisible content.

jasmer 3 years ago

I wonder if we need basic regs given that search is a public good, including best practices such as SE transparency. If they search you, they have to say something about the parameters and the results etc.

fruit2020 3 years ago

Tangential question to the site index. How does one get the TLD and ccTLD zone data, apparently it’s not that open. There are some ccTLDs which give this info, for example .ch.

cramjabsyn 3 years ago

It’s not a shadow ban if you can easily see that you’ve been banned.

The point of a shadow ban is for the banned user to not notice.

This sounds like a regular old ban/blocklist.

Macha 3 years ago

It's not a missing meta description at least, the same is true for my site and I'm still ranking relatively highly for some keywords on DDG.

obarthelemy 3 years ago

"Southern gentleman" conjures up the image of a white-suited Kevin Spacey in "Midnight in the Garden of Good and Evil". Not sure it makes for a strong claim to respectability/morality, both fictionally and IRL ;-p

imwillofficial 3 years ago

This article was a waste of time. Nothing was learned of value.

svnpenn 3 years ago

numbers look like garbage, I think this is the culprit:

    font-family: system-ui, 'Noto Emoji', sans-serif;

teekert 3 years ago

M$ is such a black box. At some point getting on their spam list, for no reason and still I’m not 100% certain, is the reason I stopped self hosting email.

  • reaperducer 3 years ago

    Blame SpamCop.

    I recently discovered that one of my services is on the MS email naughty list, and found out it's because MS uses SpamCop.

    I contacted SpamCop and it was very responsive. Unfortunately, the solution SpamCop suggested was to move the entire project to a different provider.

sergiomattei 3 years ago

Perhaps you haven’t been indexed yet?

  • freitasm 3 years ago

    The blog seems to be going since at least 2009. Long enough to be indexed.

    Perhaps some server-side filtering going on, blocking the bingbot user-agent?

shirback 3 years ago

Microsoft thinks you're Dave Rubin because.. well, it's Microsoft.

Hani1337 3 years ago

duckduckgo is a joke. use gibiru
