Ask HN: Rule of thumb for identifying fake web traffic
We suspect that our marketing firm is generating fake web traffic. This is mostly a gut feeling from watching past traffic, but we need proof. Our privacy page is suddenly very popular, and people hang out on the video page without watching any videos. Is there a good way to tell from the server logs whether a visitor is a bot? Any input would be appreciated.

Try a honeypot maybe: create an invisible link and see if it gets visited. If their bots are dumb enough to visit the privacy page, they may fall for this too. In general, looking into anomalies in GA is a good bet. Find differences in not-so-important metrics like browser shares, countries, network names, screen sizes, time of day, etc. You should find enough data to at least convince yourself. It's hard to fake everything properly. Try actually looking at the server access logs and see if you notice any patterns: UAs (phantomjs?), IPs, ...

Thank you raquo, that worked like a charm. We now have them dead to rights. I will do a write-up for HN at some point when everything is resolved. Simple and effective solution, thanks again.

Heh, awesome that it actually worked!

Your gut is probably correct. Sounds like PPV Networks.

Resources:

Good luck and report back. Expose them right here on HN.

- Check the user agent of each visit.
- Check the IP address/geolocation of visitors. If India or Pakistan has suddenly developed an unhealthy interest in your site's pages, you'll know it's all bogus.
- Check the referrer of each visitor. This can hint either at where visitors are coming from or that they are bots, for example if suddenly no visitors have any referrers.

You could try interacting with these visitors with a live chat tool. Real users will either minimize/close the window or interact with you. Bots will likely do neither.
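The log checks suggested in this thread (user agents, IPs, missing referrers, honeypot hits) could be tallied with something like the rough sketch below. It assumes the server writes the common "combined" access log format; `HONEYPOT_PATH` and the tokens in `SUSPECT_UA_TOKENS` are made-up placeholders, not anything the posters specified, so adjust them to your own setup.

```python
import re
from collections import Counter

# Assumed format (combined log):
# ip ident user [time] "METHOD path proto" status size "referrer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<ua>[^"]*)"'
)

HONEYPOT_PATH = "/hp-link"  # hypothetical URL of the invisible link
SUSPECT_UA_TOKENS = ("phantomjs", "headless", "python-requests", "curl")  # illustrative only


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None


def summarize(lines):
    """Count the signals mentioned in the thread across an iterable of log lines."""
    ips, uas = Counter(), Counter()
    honeypot_ips = set()
    total = no_referrer = suspect_ua_hits = 0
    for line in lines:
        hit = parse_line(line)
        if hit is None:
            continue  # skip lines that are not in the assumed format
        total += 1
        ips[hit["ip"]] += 1
        uas[hit["ua"]] += 1
        if hit["referrer"] in ("", "-"):
            no_referrer += 1
        if any(token in hit["ua"].lower() for token in SUSPECT_UA_TOKENS):
            suspect_ua_hits += 1
        # Ignore any query string when comparing against the honeypot URL.
        if hit["path"].split("?")[0] == HONEYPOT_PATH:
            honeypot_ips.add(hit["ip"])
    return {
        "total": total,
        "no_referrer": no_referrer,
        "suspect_ua_hits": suspect_ua_hits,
        "honeypot_ips": sorted(honeypot_ips),
        "top_ips": ips.most_common(5),
        "top_uas": uas.most_common(5),
    }
```

To try it, pass an open log file to `summarize`, since a file object iterates line by line. None of these counts is proof on its own: a high share of empty referrers or a handful of IPs dominating the traffic is just a pointer to where to look more closely.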