Blogwashing
eggonomy.com

Honestly, I feel like this is Google's fault. It shouldn't be prioritising new content as much as it does. The best articles are often "classics" from several years ago that haven't changed because they haven't needed to.
I run into a similar problem trying to choose libraries for projects.
There's one project that is 'active'. Are they active because they're better, or because they arrived late to the party and are reinventing the wheel?
This other project is 8 years old and has hardly been touched in 2 years. Is that because it takes 4 years to cover the entire domain and a couple years to fix all the bugs that can be fixed? Or because it's languishing?
Novelty is not a good selection criterion. Sometimes the old ways are best.
You can have a look on LibHunt next time. Apart from project activity, you can compare different libs by their relative popularity. That should help you make a better decision. E.g. https://ruby.libhunt.com/categories/21-a-b-testing
(disclosure - I've built libhunt)
Or they didn't change because the author moved on to other things. So their article is full of obsolete information, comparisons are no longer relevant, old APIs, invalid syntax, incorrect library assumptions, etc.
When I search for anything technical the first thing I do is change the date range to the past year.
> ... full of obsolete information ...
That would make it not "one of the best," right?
I'm sure the best PHP 4 article will be of no help to someone writing code targeting PHP 7.3 today.
When I'm looking for a book, I usually look for older but still popular books. New but highly rated books are hit or miss, but those old ones that people still recommend have consistently been great reads.
I have missed so much media over the past 20 years that I operate like this. I'm not concerned with whether music or games are new or not. Media that came out 5 or 15 years ago is just as new to me as something that came out this year, and there's a much larger selection.
There's a whole community of video gamers who do this: https://www.reddit.com/r/patientgamers/
Cost is the most obvious motivator, but for me it's more about filtering for quality in order to best allocate my limited adult gaming hours - why sink the time and money into a just-released title that might turn out to be the next Anthem or Fallout 76, when you can wait a few years and pick it up for $10-15 once it's a known winner?
That said, even within r/patientgamers, there's a spectrum in terms of just how patient some people are. For example, some people are just getting into stuff from the Xbox 360 era, whereas others are already talking about whether RDR2 has stood the test of time or not, ha.
There are drawbacks though https://xkcd.com/606/ if your friends are gamers ^^.
And it doesn't work as well with MMORPGs or online games.
I played Half Life 2 for the first time in 2010. True that consuming media late changes opportunities to be part of social groups about it.
No greater mistake can be made than to imagine that what has been written latest is always the more correct; that what is written later on is an improvement on what was written previously; and that every change means progress. Men who think and have correct judgment, and people who treat their subject earnestly, are all exceptions only. Vermin is the rule everywhere in the world: it is always at hand and busily engaged in trying to improve in its own way upon the mature deliberations of the thinkers. So that if a man wishes to improve himself in any subject he must guard against immediately seizing the newest books written upon it, in the assumption that science is always advancing and that the older books have been made use of in the compiling of the new. They have, it is true, been used; but how? The writer often does not thoroughly understand the old books; he will, at the same time, not use their exact words, so that the result is he spoils and bungles what has been said in a much better and clearer way by the old writers; since they wrote from their own lively knowledge of the subject. He often leaves out the best things they have written, their most striking elucidations of the matter, their happiest remarks, because he does not recognise their value or feel how pregnant they are. It is only what is stupid and shallow that appeals to him. An old and excellent book is frequently shelved for new and bad ones; which, written for the sake of money, wear a pretentious air and are much eulogised by the authors’ friends. In science, a man who wishes to distinguish himself brings something new to market; this frequently consists in his denouncing some principle that has been previously held as correct, so that he may establish a wrong one of his own. Sometimes his attempt is successful for a short time, when a return is made to the old and correct doctrine. These innovators are serious about nothing else in the world than their own priceless person, and it is this that they wish to make its mark. They bring this quickly about by beginning a paradox; the sterility of their own heads suggests their taking the path of negation; and truths that have long been recognised are now denied — for instance, the vital power, the sympathetic nervous system, generatio equivoca, Bichat’s distinction between the working of the passions and the working of intelligence, or they return to crass atomism, etc., etc. Hence the course of science is often retrogressive.
-- Arthur Schopenhauer, "On Authorship and Style"
https://ebooks.adelaide.edu.au/s/schopenhauer/arthur/essays/...
Emphasis, and wall of text, in the original.
>No greater mistake can be made than to imagine that what has been written latest is always the more correct; that what is written later on is an improvement on what was written previously
I feel this way about math books a lot actually, especially calculus. When we really formalized the foundations of calculus using limits and epsilon-delta arguments in proofs (instead of alternatives like non-standard analysis; see https://en.wikipedia.org/wiki/Non-standard_analysis), I feel like a great deal of the intuition that actually built calculus was lost. I call it "differential reasoning", and I wish there were a class in just such a thing, formalized with operator theory and non-standard analysis techniques. It's still used a lot in physics, but no one talks about the concept of a "ratio of differentials" in modern calculus.
Yeah, a book recommendation would be great! I’ve picked up this way of thinking over the last decade, but don’t know of a resource to recommend people.
In the case of calculus, do you recommend Newton, Leibniz, or others?
I find this is especially true of fiction or infotainment books. Older fiction and popularized non-fiction usually deals with many of the same social issues faced today, and because it has stood up to the test of time, it often does so more effectively.
Whether it’s gender identity issues, technological warfare, jingoist politicians, class struggles, leadership advice, etc., somebody already wrote about it more effectively than the huge majority of modern writers churning out books.
I don't think your comment follows from the evidence presented by the author.
For example, the blog didn't even beat out the result above it which is "older content". For all we know, the blog was the last relevant candidate for that search query, the author doesn't demonstrate otherwise.
The only spammy thing we can tell from TFA is that the blog post's publish-date is set in 2019 when it was clearly published in 2016. But we don't know if Google knows that and is already penalizing them. Nor if it's an effective attack.
Also, the author doesn't demonstrate that the modified-date is inaccurate or spammy. CMS software will update this if you go back and correct a typo, or somehow mass-transform a bunch of blog posts. For instance, I once had a dead-link-finder plugin on WordPress that gave me a UI to patch URLs in old blog posts, which would have updated the modified-date on all affected posts.
I don't see anything in TFA that suggests that the blog was benefiting from spammy behavior.
I work in content marketing so SEO is a big part of my work focus.
Google's algorithm tends to classify queries as "evergreen" and otherwise. With the former, recency doesn't seem to matter that much. I've had content stick a page 1 ranking for 3+ years without any updates.
For non-evergreen keywords, recency matters a lot. A fresh article on an established domain can often outrank better, but older, content.
The query classification system works well enough for most keywords, but there is a grey area where Google doesn't really know what to focus on. Like with a query that focuses on "best practices". It isn't clear whether Google should prioritize classic, evergreen best practices, or focus on more recently developed best practices.
I would definitely appreciate a filter that only shows me old school content for my target queries. Some of the best content I've read online sits on websites that haven't been updated since 2002 and still use tables based design.
This website is a curated archive of Usenet posts.
It's far from universal, but if you understand quality content, you will recognize it here:
https://yarchive.net/comp/risc_definition.html John Mashey talking about RISC vs CISC.
https://yarchive.net/comp/linux/lost+found.html Ted Ts'o on lost+found.
... and plenty more.
3+ years isn't a very long time. There is a lot of good old scholarship.
You want to read a "classical" comparison of SES vs SendGrid? Really?
There was a HN post a few weeks ago showing how Duck Duck Go was better at showing old content. I totally agree. We like what's fresh and new, but next time you go to a blog and read some tech article, look through their archive and see what else they've written. If you run your own blog, keep an easily indexed archive page. Mine goes back to 2007: https://battlepenguin.com/archives/
I had a blog for a decade but I realized that I would post something and nobody would read it, but then a famous programmer would post the same thing in their own blog a year later and everyone would hear about it. So I concluded that blogs are only for the well known programmers and that it’s a waste of my time to share my thoughts with “the web” as a general entity. Instead I share my thoughts with individuals when requested (which mostly just means answering my kids’ questions). Well, also commenting here but I’m not convinced that’s been beneficial to anyone either so I’m seriously considering giving up that too.
I blog to share my thoughts with myself, but in public. I blog with the intent of consolidating what I'd just learned or experienced in my mind. I've had some popular posts in between years of content that nobody cares about, and sometimes I'll have a reader email me about an old post I wrote and describe how it's helped them in a recent project. Or sometimes someone will comment about how they just saw a post mentioned in a conference slide, and they're commenting on it _from_ the conference. Or I'll notice traffic coming in from an online newsletter about some programming topic. Those emails and brief spikes make me feel nice. The occasional acknowledgement from a reader is like a bonus to what's essentially a private diary I don't expect anyone to take interest in.
I get that idea and it's not new: one blog said in its subtitle “letting google index my thoughts”. But it's the equivalent of talking out loud. Sure, someone might hear you accidentally and benefit, but most of the time it should be kept to ourselves. If something should be said, it should probably be written in a book. That will set at least some bar to prevent the constant stream of noise on the internet. Everyone complains about the quality of content going downhill and the signal-to-noise ratio being unbalanced, but the solution is for all of us to consider that we're probably generating a lot more noise than signal. So I for one will quit blogging and comment a lot more rarely, and if I have something more important to say I'll write a book. Getting published isn't an infallible mark of signal, but it's better than git push.
Most of the kinds of posts I get positive feedback on aren't something I'd ever consider interesting or important enough for a book. Personally I don't have a problem with blog noise/"talking out loud" on the internet. I think it's more up to the people who don't want noise to decide what noise is for them and to implement their own filter strategies. I can't even count how many times I stumbled across some blog post casually written that the author likely didn't expect to be noticed at all, but which provided some useful or just plain interesting information to me. It wouldn't make it into a book, but I'm glad it made it into this person's little corner of the Internet for me to discover.
Many of my blog posts were written mostly for this reason and I get occasional emails from people who find my thoughts helpful.
My own blogging has served, variously:
- As a public record of the evolution of my ideas and thinking. (Often embarrassing.)
- As a reference for things I've found useful, and contextualised.
- A place I can post my highly original thoughts ... to be told "oh, X came up with that years / decades / centuries / millennia ago." Happens far more often than I'd have ever dreamed.
- As a place I can refer to / link to my own best efforts at explaining some idea or principle. Rewriting from scratch continuously is tedious.
- Rarely, for discussion.
- As a very loose bookmarking service. (Only a small fraction of references end up there, but the ones which do tend to be significant.)
- Something of a shingle, though I've not leveraged that as yet.
Every so often a particular post will take off, and it's generally exciting when it does. Rarely the pieces I pour my sweat and tears into, though that happens occasionally. What the World decides to Take an Interest in is a wondrously fickle phenomenon.
There's also the tinkering on the blog itself as a technical means of presenting, distributing, and organising information, which I find interesting.
Blogging is also for your own benefit, and to have a history of your ideas. You can also post your blog articles to HN, Reddit, etc. And blogging, like much else, is a bit of a marketplace / ecosystem. We don't need you blogging on generic popular topics that anyone can and does write about; we need you to blog about niche topics that aren't being covered.
Smart person. I've had the same thoughts. HN is often fun and occasionally interesting, though, so I keep engaging.
> Honestly, I feel like this is Google's fault. It shouldn't be prioritising new content as much as it does.
It is Google's fault, but the fault is that I can't choose.
If I'm looking for something about Beaglebone programming, there is a 90%+ probability that I need it limited to "last 6 months". If I'm looking for something about java.util.concurrent, 10 years old articles are probably just fine.
The fact that I can't make this choice is infuriating.
It depends on what you are looking for. I was having issues after an OS update and was looking for help, and ended up finding all sorts of out of date info for previous updates. Had to look for more recent posts.
When it comes to blogs, prioritizing new over old is generally a good idea.
If a blog post is worth publishing multiple times then it should be published as something different.
I think Google ranks microblogs, blogs, articles, wikis, discussions, books, etc differently.
It is Google's fault. So were the days when you could rank no. 1 for a term without having the term on the page, because the links pointing at the site had it inside the <a href=""></a>.
The classic case is the Google search for "news". CNN used to come up first, now it is second. For the longest time, the word "news" did not appear on the CNN home page. At the moment, I see four uses of the word "news", not counting menu items.
But what is "often" in that context?
How is this Google's fault? They're just following the market. People want “modern” information. They don't want old stuff from two years ago.
What people say or indicate they want and what people need can be two very different things. The long term potential backlash from solving for the former should be weighed against the short term gains, but that's tricky!
Well, you'd expect Google to have some defenses against this sort of attack the same way they penalize keyword stuffing and cloaking.
But for all we know, they do. The TFA doesn't establish otherwise. It doesn't even establish that the author didn't update the content in June. Or that the blog unfairly ranked over better candidates.
Do people really want content from SEO spammers, though?
If I wrote an article in 2015 and freshen it up with more relevant links, better writing, or some other thing, is it still published in 2015? Should I re-date it? Am I "gaming" the system by doing that? Or am I notifying my readers (all 3 of them!) that there is fresh content?
This is actually a real question from me, and I'd love feedback. I look back on my stuff from a few years ago and the blog posts need fixing up, be it for SEO, for my more up to date 'voice' or because I switched to Gutenberg and removed janky slider plugins that haven't been updated since 2014. Should I re-date? Or leave them as-is?
If you've updated the post to make it more relevant, better, include more recent info... then it certainly makes sense to inform the visitors about it. Either just change the published date to the date you updated it or list the "last updated" date somewhere on the post too.
I know some sites abuse this because of the way that the Google algorithm works, but it is a good practice to list the date you last updated the post so your visitors can see.
For specific topics, I prefer to read more recent info so if I end up on a page that lists "2015" as published date but doesn't tell me that it has been updated since, I may not trust that page as much simply because of the idea I may have that the info may be outdated.
Often dates are extremely relevant; for example, a 10 year old hardware review won't be as relevant as a new one. On the other hand, blogs don't have to be static: some programming technique might still be relevant, but need to be updated for a new version of the language.
Solution: put a "first published" and a "last updated" date, or use a wiki that makes the revisions available online and a way to link to the different revisions.
A variant of that, but something I’d like to see is listing initial publication and current revision/update status (including initial). Then at the bottom of the post put a summary of each update so readers know how the post has evolved.
Did you read TFA? "PUBLISHED TIME", "UPDATED TIME".
The issue is that blogs abuse these fields for SEO purposes.
Please, for the love of all that is holy, publish new information as a new post, don't silently update old content. You never know when some obscure bit that is contained in the old post that appears to be superseded by new material will turn out to be invaluable.
Enough old content disappears from the internet organically.
> publish new information as a new post, don't silently update old content.
But the new post likely wouldn't rank as well as the old post, so you are less likely to find it using search. It would be a better user experience if the original post had a clear indicator stating that it was updated with new information on X date.
I agree, and will be cross linking posts accordingly.
The most transparent way would be to add a [First Published: ..., last updated: ...] header. But if the content is really reasonably up-to-date wrt the latest publishing date, I don't think you're doing something shady.
This is actually mentioned in google's documentation - you need datePublished and dateModified: https://developers.google.com/search/docs/data-types/article...
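For illustration, a minimal sketch of what that markup could carry, generated with Python here just to keep the example self-contained. The headline and dates are placeholders, not anything from the blog in question:

    import json

    # A minimal Article structured-data object with both dates, along the lines of
    # schema.org's Article type referenced in Google's docs. Values are placeholders.
    article = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": "Example post title",
        "datePublished": "2016-06-01T09:00:00+00:00",  # when the post first went live
        "dateModified": "2019-06-15T10:30:00+00:00",   # when it was last substantively updated
    }

    # This JSON would typically sit inside a <script type="application/ld+json"> tag in the page.
    print(json.dumps(article, indent=2))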
Even looking at it from a usability standpoint, I think it is totally okay and even necessary to add an 'updated' or 'modified' date. Personally, as a reader I rely heavily on an article's created/updated dates; if I cannot quickly and easily figure out whether an article is still relevant and up to date, I usually leave the webpage immediately without even bothering to read.
If there are significant changes, I'd rather write a new post.
You can add a link to the new post somewhere on the old post.
On some topics a new post, titled "X Revisited in 2019" would be the option I'd go for, updating the old article with the link to the new post (without updating the date in the old post).
Granted this would be my personal preference. From an SEO point of view this doesn't make sense, at all.
Thank you everyone for the responses! They were quite valuable. I think the best thing to move forward for me is to fix any issues on the old posts, then write new ones as updates to them. It's more content, temporally relevant, and there's no gaming or sense of gaming involved.
Thanks again!
I would re-date and just include a short note to that effect if I had updates of any material nature.
To me, the date on a blog post is an indication of how relevant it is. So if you updated the links and info in a 2-year-old post to make it relevant again, I don't mind if the date is today's. But probably best would be to put 2 dates: the original publish date, and the date of the last modification.
Add a little changelog.
Not a particularly detailed post and, as it says, it's an old SEO trick.
But if people want an answer as to why the blogosphere is dead and everything's on centralised silos: this is why. Any decentralised system that doesn't take spamfighting into account from the beginning will drown under it as soon as it becomes popular.
I don't follow your conclusion. This isn't spam, it's just an updated timestamp. Furthermore: Google IS centralized. This isn't about decentralization, but about a central platform (Google) showing the "wrong" date and maybe even treating these blogs in a preferred way. Not sure what this has to do with blogs being decentralized.
The blog itself, as a whole, is spam.
Is the blogosphere dead though? I see links to tons of previously unknown to me blogs on this site. The community has moderated the spam out. Thanks guys!
It is "dead" in the sense that you are more likely to comment on the post on a third party site like HN or Reddit. A decade ago the conversation would be happening on the author's site, in the comments section. That died off because bots and click farms flooded comment boxes with spam and chased away legitimate visitors.
Authors of popular blogs didn't want to handle comment moderation, and once they started encouraging discussion directly on FB, Reddit or HN, it was game over.
This site is a centralized system in some sense though?... As in, centralized source of moderation and authority, at least.
It’s not just the blogosphere that is dying because of this „trick“. Some bigger mainstream media houses (at least in Germany) do the same thing to their news. They change the date and headline so often that you happen to click the same news several times in one or two weeks, only to realize „oh, that again, but with a totally misleading headline“
The "freshness" issue is one that causes problems and subtly sets priorities.
Problems come up when looking for older content on purpose. A lot of older content is very relevant still today. I know people who could not find something they knew existed in Google. Switched to Bing and it came right to the top. The difference was the way newer content was prioritized.
Google doing this sets priorities. It says that newer is more important. Is that true? Many would argue it's not. Google says it is and "advises" people where to go based on that.
I find these worth considering.
Prioritizing new content is going to exacerbate a lot of already existing problems with the WWW. I won't echo the already good list of reasons why old content is relevant. But I will point out it will usher in a new era of re-showing you old content that has been repurposed "as new" via minor edits. And it's going to also help convince a growing audience of web surfers that only the latest/greatest is worth knowing about. That in turn puts additional pressure on content producers to obsess endlessly over SEO tricks and gimmicks. In sum, this worsens the user's experience by prioritizing things that benefit Google. And this, of course, should be no surprise to anybody who has watched Google's business decisions over the past 4 or 5 years.
Yes, exactly that. Also, Google doesn't take action when you report something, or they just dismiss the reports: spam-blogging, spam results when you're looking for references, badly translated news sites copying content from legitimate sources, and the list goes on.
This is 100% Google's fault. They discourage both "old content" and "thin content", so guess what happens?
Content writers create posts like "Ultimate Beginner's Guide to X in 2019" and just update the post title and "Last updated" metadata each year. Nobody's going to create a brand new guide to building muscle or whatever if the core information doesn't change year to year.
I can't be the only one thinking that:
1. it's possible the author double checks the content each year and redates so that visitors know it still applies
2. the author updates the article to be current and redates it
It's a little weird to say this is "blogwashing". It's pretty common (for me at least) to check the date of an article when it's a tutorial so I know if it's current or not. And I've seen this happen before where authors append a "changelog" to the article at the end so you know that it's up to date.
How does everyone feel about only showing the updated date on the blog post itself but keeping the real / original published time in the meta tags?
I do that on my site mainly to keep things less cluttered. Every post has an "Updated on November 12th 2019 in #docker #flask" line at the top of the post and that date is either the original published date or the last time I updated the content in the post, but the meta tags are always the correct values (ie. I don't refresh the published date with the updated date).
But now it's making me think I should include both the "Posted on" date as well as a separate "Updated on" date in the presentation of the page itself to be crystal clear. My only concern with that is that will eliminate some vertical space on the page because I can't fit all of that on 1 line cleanly. I would have to break the dates and tags onto 2 lines. For example, this is what a current line looks like: https://nickjanetakis.com/blog/make-your-static-files-produc...
On some of my sites, I only list the updated date too. I prefer to keep it simple for the average visitor. The reason is that these topics are more time sensitive so I regularly update all the posts to list latest information, advice etc. Having the original date there might give a bad impression to some and they may think the content is old and outdated.
Search is just getting worse and worse; it's a problem that probably nobody can solve with the current paradigm. I think in a few years there will be a significant opening in the search market for an algorithm that manages to structure information differently.
Search is a battleground. There are huge amounts of money to be made from "worsening" your search results by steering them to inferior products. So search engines and blackhat SEO are locked in a permanent struggle.
Or, some public database of the crawled web where we can apply our own algorithm.
I've always dreamed of a grep for the web, for instance. Trying to Google for code is a pain, even when quoted/verbatim.
This exists: commoncrawl.org
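For example, here is a rough sketch in Python of grepping the Common Crawl URL index for a domain. The crawl ID used below is an assumption and will have changed; check index.commoncrawl.org for the current list:

    import json
    import urllib.parse
    import urllib.request

    # Query the Common Crawl URL index for captures of a domain. The crawl ID
    # "CC-MAIN-2019-47" is an assumption; newer crawls are listed on index.commoncrawl.org.
    params = urllib.parse.urlencode({"url": "example.com/*", "output": "json"})
    index_url = "https://index.commoncrawl.org/CC-MAIN-2019-47-index?" + params

    with urllib.request.urlopen(index_url) as resp:
        lines = resp.read().decode("utf-8").splitlines()

    # Each line is a JSON record pointing at the WARC file/offset that holds the raw page.
    for line in lines[:5]:
        record = json.loads(line)
        print(record["timestamp"], record["url"])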
That’s because the web is dead. Lots of passion projects that would have been on a blog or websites are locked into Facebook or even Twitter threads.
My local historical society decided to produce lots of content and exclusively post to Facebook. It’s incredibly dumb, but people seek the easiest path.
This is a good point, the internet is largely monopolized
Yes. Don't hate the player, hate the game. If the visibility of your work is determined by its "freshness", then freshening up makes sense. If everyone does this, then Google et al will have to find a better criterion to rank by.
Another thing I see a lot is scammers scraping my blog's RSS feed and re-publishing my articles on their site, then filling it with adware. Sometimes the page they show to googlebot is completely unrelated to the page you get when clicking through.
I thought Google heavily penalized your site if the human version was different from the googlebot version
The fact that Pinterest links still appear in every search suggests this isn't true, or isn't very true.
How do they detect that? Presumably human review, which can't possibly cover every malicious page on the internet. I assume if you report the site they queue it up to be scanned by a human, unless their solution is just to have versions of googlebot that are harder to detect - possible, but if someone is already going out of their way to trick googlebot, I don't know how well this would work in practice.
As a starting point, your not-googlebot needs to spider sites differently from googlebot (so it can't be detected by traffic analysis), imitate average user hardware well (GPU acceleration + high GPU performance, more realistically slow network, slower CPU hardware, etc), use network addresses not obviously Google's, and imitate user behavior (plausible input events, scrolling, etc). This is within Google's capabilities but is definitely an undertaking and SEO types could eventually identify their strategies.
>How do they detect that?
Easy, their crawler has a google bot user agent. Then they sample some number of links with a human like user agent, and diff the output, plug the diff into some algorithm to assess the score.
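Something along these lines, presumably. This is a toy sketch in Python, not a claim about how Google actually does it; the user-agent strings and the line-level diff scoring are just illustrative:

    import difflib
    import urllib.request

    GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"

    def fetch(url, user_agent):
        # Fetch the same page, varying only the User-Agent header.
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode("utf-8", errors="replace")

    def cloaking_score(url):
        # Crude score: 0.0 means the bot and human versions are identical, 1.0 means
        # nothing in common. A real system would strip ads/timestamps and compare
        # extracted article text rather than raw HTML.
        bot_lines = fetch(url, GOOGLEBOT_UA).splitlines()
        human_lines = fetch(url, BROWSER_UA).splitlines()
        return 1.0 - difflib.SequenceMatcher(None, bot_lines, human_lines).ratio()

    print(cloaking_score("https://example.com/some-article"))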
Have you had any luck getting them to take that content down?
It's not really an issue unless they outrank you or rank near you.
In my experience, Google's dupe content detection is pretty good and their penalty is harsh. I once ran a website that tried to curate and touch up old Usenet material that never could rank all that well because of dupe content (another website had monospaced dumps of the Usenet content).
Actually, a lot of sites keep updated posts with new information, which is useful. Which I believe makes it harder for Google to figure out who is gaming SEO and who is providing actual value.
How so? Could they not ML-detect the main body/content of the post and do a simple diff to previous "crawls" of the same content? You would think this gets picked up and somehow "resolved" because it's essentially a duplicate post or entry in their crawled pages DB.
> Could they not
> ML-detect the main body/content
> a simple diff
> previous "crawls"
> the same content
Yeah, I could do that in an afternoon :^)
If there is just a change in date and no change in content it could. But if the content changed, Google may not directly be able to state if the content was useful or not.
What? I thought I read somewhere on Google's webdev infos that they know when an article was first created vs. updated. Not sure if this "old trick" still improves the ranking of an article, but surely enough, it baits people into clicking on a link.
I noticed it a couple times myself. Stuff that's obviously an older article appears in the SERPs as if it was published a few days ago.
It could be a self fulfilling thing, if the CTR increases on a search listing (because it appears newer to the user) that listing generally moves up because of higher CTR.
Another great trick is just increasing version numbers in your blog posts! People look for content relevant to their version number, so you should just make sure you have copies of your content with all the version numbers people might search for!
And with version numbers, you are not limited to dates in the past, you can even write articles about the future!
Here's a brilliant example: https://gorails.com/setup/ubuntu/20.04
How to set up Rails on Ubuntu 20.04, which will be released in April next year. You can already read the guide today! Some of the links might not work yet, because obviously you can't download Ubuntu 20.04 yet, but once it's released, those guys are bound to be the first ones who had a guide out!
But won't their article be old in April, and get less viewership than the one some other author is holding back until day-of?
Sir, this behavior is identical to what a Blackhat SEO spammer does. Ymmv.
Interesting that the text in the search bar in the screenshot, "AWS Pinpoint Alternatives", doesn't actually line up with the text at the bottom where it says "Searches related to Send with SES — AWS Pinpoint Alternative", and you only see 5 pages of results. The text at the bottom should repeat the search phrase verbatim.
I get completely different results on Google if I actually search for the phrase in the search bar, with no sign of the blog in question. I see zero evidence that the scummy SEO tactic actually works and a lot more evidence of a faked "Google" screenshot.
Frankly, I don't see a problem with this if it is done consciously rather than automatically. *
Usually, I'm interested in currency not recency. If, say, a technical article was written in 2015, I don't exactly care that it was written in 2015 but do care very much whether it's outdated today or not. APIs change, etc. If the blogger has re-dated the article, that suggests they believe it is still current, which is useful information to me.
(* - Caveat: no, I've never redated a blog article myself. But I am only a very infrequent blogger anyway.)
That's not just a thing done by blogs. Several "news" sites like (german) t3n.de do this. Or some sites like cnet.com etc... mostly content like "The best CMS as of {insert year here}" - so mostly things you can use years later. However, it doesn't help if the library isn't receiving any updates anymore. I don't think it's just to get a better search ranking but also to manipulate users (they want to use up-to-date software/etc)
I've been working for an online news network in Germany for several years. They hire people (mostly students) to do nothing but update their old articles on at least a weekly basis. Even to a point where it becomes almost unreadable because of so many useless and non-contextual updates. But it works in terms of ad revenue from Google traffic.
Vast numbers of people are currently employed writing/updating articles with titles like "What You Need To Know About Cats In Mid-November 2019".
This is Google's fault.
And it's a 5 year old article where they've just edited the title. You can still see the comments from 3 years ago.
The old way of sharing is in danger when big guys like Google don't fight back against spam and black-hat SEO. I try to do my searches on different search engines so that older content - sometimes well-established articles on programming - and meaningful new content can both come to the surface. For the past four or five years, if you search for something with Google, sponsored content goes first, and sometimes it's just brand propaganda.
I don't really see the harm in re-dating a blog post to keep it high in search results. There is a lot of information that, once published, stays relevant and informative for years; getting dinged on SEO because it is stale says more about Google than the blog authors. Also, they have an image which doesn't tell you if a link's URL was updated in an edit.
Oh gods I hate this. When I'm looking for how to get something done on a framework that has undergone big changes, I need to look up articles from within the last 6 months.
But this bullshit makes that really hard.
Some Blog
November 13th 2422
This week, Groaar and Mrumfm have been experimenting with a new invention. We are considering calling it "wheel". Will keep you informed.
Comments
This is old news. Our tribe has been using it for eons.
> It's almost 2020... Google knows about this
Google also knows about vertical search [1] and actively destroys anyone who pops up with a good new algorithm and hope for a startup.
[1] https://en.wikipedia.org/wiki/Vertical_search. I am pretty sure there has been a HN frontpage article about a couple with a vertical search startup that was legally and practically destroyed by Google.
First, does this have anything to do with the OP article? Second, care to link to anything to substantiate your claims?
> what does this have to do with OP
OP made the point that Google does not want to address this problem and may even facilitate it (maybe passively). However, they have actively prevented progress on search engines that may be more difficult to SEO engineer/hack (because of their specificity) and particularly have prevented people with good vertical search algorithms from building a business (read: actively sabotaged their business). [1]
In any case, Google is also known to quash any other small projects that they feel challenge them. [2] [3] [4]
[1] https://www.nytimes.com/2018/02/20/magazine/the-case-against... and https://news.ycombinator.com/item?id=16420004 [2] https://news.ycombinator.com/item?id=19553941 web browser [3] https://news.ycombinator.com/item?id=18566929 person's idea taken after interview [4] https://news.ycombinator.com/item?id=19124324 business
References [2]–[4] are not specifically important; there are easily a dozen such kind of complaints if you just search for "google" on HN.
Wouldn't leaving a persistent flaw in Google search results which could be avoided by using a vertical search engine be an example of Google assisting vertical-search start-ups?
Maybe in theory, but if you look in particular at the reference labelled [1], it seems like they prefer to act as though it doesn't exist and if other people try to get vertical search going, then they simply block their startups.
Am I the only one who still bristles when people say blog but mean blog post? Or is this generally accepted now?
what's a blog? oh you mean a weblog. /s
I'm assuming your point is that words get shortened for convenience and that that's OK. That's a fair point. It's also true that language doesn't always evolve in ways that make sense, and I get that.
With that being said, there's a huge difference between shortening "weblog" into "blog" and shortening "blog post" into "blog":
First, when "weblog" was shortened to "blog," "blog" didn't already mean something (and certainly didn't mean anything in the relevant context). When "blog post" got shortened to "blog," "blog" already had a meaning - AND it already had a meaning _in the context of the internet_. One of these leads to confusion, one of them doesn't.
Second, when "weblog" was shortened to "blog," we didn't already have a shorthand way of saying "weblog." But we've been shortening "blog post" to "post" basically since the beginning. There was no reason to shorten it to "blog" also. "Post" was just fine.
I'd argue that a more fair comparison would be if, after using "blog" for a while, we decided to shorten "weblog" into "web" instead. It would have been silly, because "web" already meant something, and because we already had a shorthand version of "weblog" (i.e., "blog") - so why did we need another?
But I guess your sarcasm and the down votes answer my question anyway. The internet has accepted "blog" as meaning "blog post." I might as well get on board.
I seem to recall we called them online diaries or e/n sites before that :-D