Why we'll use Google Universal Analytics over Mixpanel and KISSMetrics

blog.fleex.tv

53 points by duwip 13 years ago · 26 comments

snide 13 years ago

As someone who has built several top-1000 trafficked websites over the past decade, here is what the publishing industry definitely needs out of an analytics program.

1. Please give me a report that can prove that my user traffic is real.

2. Please give me a report that can prove that the traffic is healthy.

I know that I can get this from analytics now, but it needs to be the focus.

For a decade I've competed against content websites that for the most part game SEO traffic, build click traps and generally pollute the Internet with secondary-source content. I've always had fairly large audiences on my sites, with healthy 50% returning visitor rates. However, when it came to getting ad dollars, I always lost to competitors who had much larger volume, mostly because they were either buying meaningless inbound links or using some other scam like click trap "we recommend this hot girl talking about prostate cancer" photos to goose their numbers. Meanwhile we'd create quality content and my sites would have hundreds of comments, while theirs would have very few. It didn't matter that my audience was more engaged; advertisers bought volume.

I just need something that I can show to an advertiser (or even better, that they have access to and can compare) that says... hey, this website isn't a constructed fabrication made to fake volume and take your money you sucker. This is a real website.

A lot of the industry right now is based upon buying links from aging front door portals (Yahoo, MSN, AOL) which still do ungodly amounts of traffic with a mostly Internet-illiterate audience. Sites buy these links, convert them into CPM click traps on their targeted magazine sites and sell their inventory to advertisers who don't know that the whole thing is a shell game. They think they're buying ads on a hot new site with explosive growth.

  • lukestevens 13 years ago

    Chartbeat is attempting to help with 1 & 2 with "Engaged Time": https://chartbeat.com/publishing/for-editorial/ . (I've no connection).

    I'm building an analytics interface for GA though and would love to chat about what else publishers need in an analytics interface - luke at itsninja if you'd like to chat.

  • alexatkeplar 13 years ago

    Hey snide - I think we can probably help you at Snowplow Analytics. We warehouse all your atomic event data (including page views and in-page pings - v hard to fake) with IP address, browser fingerprint, 1st party cookie, optional 3rd party cookie, optional business-defined user ID, user timezone, browser features, useragent... If that sounds useful for proving your audience to advertisers, get in touch!

    • corin_ 13 years ago

      After seeing Snowplow mentioned a few times on HN in the last week, each time I've thought "hmm, looks maybe interesting... but I don't have time to figure out what it is or how to use it". Finally just now seen the starting guide, so will probably play around with it sometime soon.

      So one piece of feedback is to maybe try and make it easier/more obvious how to go from "this might be interesting" to "what can this do for me?" (I'm still not 100% sure).

  • omarchowdhury 13 years ago

    Do you have more information on how I can see a live view of the practices you discussed in your last paragraph? I was under the impression that traffic from the top portals was costly and not exactly suitable as a component in an arbitrage play like you mentioned.

    • snide 13 years ago

      Visit http://www.yahoo.com/ right now. Scroll through the stories and find one that doesn't link to an internal yahoo or yahoo owned property.

      You'll run into one of the following scenarios in most of them.

      * It drops you on an interstitial ad before the "article" loads.

      * It drops you onto a click trap "gallery" where you are forced to click for each new image.

      More often than not it's to a site that you've never really heard of and isn't very well produced vs. their more well known (to a literate Internet audience) competitors.

joevandyk 13 years ago

I started saving all my page views in a postgresql database. Schema is pretty simple.

I have the following tables:

    sessions
      session_id (uuid type)
      created_at
    
 
    page_views
      page_view_id
      session_id
      created_at
      site_id
      path
      query_string (hstore)
      user_agent
      referral_url
      ip_address
      user_id
      http_method (get, post, etc)
      details (hstore, used to tag page views/actions)
   
This allows me to simply query all my page views against data in my live database. I can see the path a user took to place an order. I can easily integrate a/b tests. If someone uses a coupon on the site and we want to see if they later came back and viewed/purchased more, we can easily write a sql query to figure that out. We can simply figure out lifetime customer value, even if not logged in. If we're getting a large amount of traffic from a certain affiliate, we can alert our staff.
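
A minimal sketch of the kind of query this makes possible, assuming SQLite as a stand-in for PostgreSQL (so the uuid and hstore columns become plain text or are omitted) and a hypothetical `orders` table representing the live application data. It is not the plpgsql function linked further down, only an illustration of joining page views against order data to recover the path a session took before ordering.

```python
import sqlite3

# Assumption: SQLite replaces PostgreSQL here; only a subset of the columns is kept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions (
    session_id TEXT PRIMARY KEY,
    created_at TEXT NOT NULL
);
CREATE TABLE page_views (
    page_view_id INTEGER PRIMARY KEY,
    session_id   TEXT NOT NULL REFERENCES sessions(session_id),
    created_at   TEXT NOT NULL,
    path         TEXT NOT NULL,
    user_id      INTEGER
);
-- Hypothetical table standing in for the live application data.
CREATE TABLE orders (
    order_id   INTEGER PRIMARY KEY,
    session_id TEXT NOT NULL,
    total      REAL NOT NULL
);
""")

# Sample data, invented for illustration.
conn.execute("INSERT INTO sessions VALUES ('s1', '2013-07-01 10:00:00')")
conn.executemany(
    "INSERT INTO page_views (session_id, created_at, path, user_id) VALUES (?, ?, ?, ?)",
    [
        ("s1", "2013-07-01 10:00:00", "/", None),
        ("s1", "2013-07-01 10:01:00", "/products/42", None),
        ("s1", "2013-07-01 10:03:00", "/checkout", 7),
    ],
)
conn.execute("INSERT INTO orders (session_id, total) VALUES ('s1', 19.99)")


def path_to_order(conn, order_id):
    """Return, in viewing order, the paths seen in the session that placed the order."""
    rows = conn.execute(
        """
        SELECT pv.path
        FROM orders o
        JOIN page_views pv ON pv.session_id = o.session_id
        WHERE o.order_id = ?
        ORDER BY pv.created_at, pv.page_view_id
        """,
        (order_id,),
    ).fetchall()
    return [path for (path,) in rows]


order_path = path_to_order(conn, 1)
```

Because the analytics rows and the order rows live in the same database, the join needs no export or ID-matching step, which is the point being made above.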

It's really awesome to be able to have your data in the same place. Having analytics data spread out to GA made it difficult to match that data against ours. If we need to scale out to multi-terabytes, postgres_fdw will make querying against the analytical database simple.

Since we're also tracking affiliate purchases to pay out commissions, I also have another table that stores additional information about a page view if they came from an affiliate site (click id, the affiliate network, etc).

Here's the plpgsql function I use for saving the sessions and page views: https://gist.github.com/joevandyk/f63523cdd1a3aa75d0ec

  • duwip (OP) 13 years ago

    Yeah, we do that kind of stuff as well. At least you know what your data means. But when you start getting millions of hits a day, you won't necessarily want to spend some time scaling your system... In that case leaving it to the pros and focusing instead on your product may prove the most sensible move.

    • joevandyk 13 years ago

      It should be pretty easy to scale out a simple set of data like this.

      "Leaving it to the pros" means you don't control your data and you can't easily combine it with your other data about products, orders, whatever.

mikeknoop 13 years ago

The last paragraph is important. I spent some time earlier this week when I learned about Universal Analytics -- but quickly discovered that UserID tracking hasn't shipped yet.

Can anyone on the GA team speculate about a release date for the uid bits?

  • hu_me 13 years ago

    The userId bit has been there from the start in Universal Analytics. It's called custom dimensions and can be used to send any property about the user into GA and then link it to a User or a specific Visit.

    https://developers.google.com/analytics/devguides/collection...

    • mikeknoop 13 years ago

      OP (and the article) refer to the uid tracking mentioned here: https://groups.google.com/forum/m/#!msg/google-analytics-mea...

      As of Jun 27 it hasn't shipped according to a GA team member.

      • hu_me 13 years ago

        I had missed that, thanks for sharing. I have been using a custom dimension for uId in our client projects and it's worked out well, though a dedicated API method is always welcome.

        • duwip (OP) 13 years ago

          Custom variables don't let you consolidate on users though? In the visitor count, for instance, I don't think there's a way to tell GA to use a custom var to distinguish between visitors.

        • hu_me 13 years ago

          Yes, there isn't a way within GA to aggregate metrics based on custom dimensions at the moment. We have on occasion done it through the reporting API, using the custom dimension to maintain a unique count.
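
A rough sketch of the "unique count outside GA" idea, under stated assumptions: the rows below are hypothetical and only shaped like an export that pairs GA's own client ID with a custom dimension holding an opaque user ID. No real reporting API call is shown; the point is only the de-duplication step done after the data has been pulled.

```python
from collections import defaultdict

# Hypothetical exported rows; field names are illustrative, not GA's.
rows = [
    {"client_id": "c-1", "user_dim": "u-100", "visits": 3},
    {"client_id": "c-2", "user_dim": "u-100", "visits": 1},  # same user, second device
    {"client_id": "c-3", "user_dim": "u-200", "visits": 2},
    {"client_id": "c-4", "user_dim": "", "visits": 1},        # never logged in
]


def unique_visitors(rows):
    """Count visitors, collapsing clients that share a custom-dimension user ID."""
    seen = set()
    for row in rows:
        # Fall back to the client ID when no user ID was recorded.
        if row["user_dim"]:
            seen.add(("user", row["user_dim"]))
        else:
            seen.add(("client", row["client_id"]))
    return len(seen)


def visits_per_user(rows):
    """Sum visits per known user ID, ignoring rows without one."""
    totals = defaultdict(int)
    for row in rows:
        if row["user_dim"]:
            totals[row["user_dim"]] += row["visits"]
    return dict(totals)


client_count = len({row["client_id"] for row in rows})
visitor_count = unique_visitors(rows)
```

Here `client_count` is what a per-client count would report, while `visitor_count` collapses the two devices that share a user ID.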

j_s 13 years ago

I was not aware that the new analytics would track users. One interpretation of section 7 of the Google Analytics Terms of Service is that tracking individuals is not allowed:

http://www.google.com/analytics/terms/us.html

  > You will not [...] use the Service to track, collect or 
  > upload any data that personally identifies an individual 

http://productforums.google.com/forum/#!topic/analytics/tTaq...

  > you cannot store names or ip addresses in a custom var, 
  > but you can store ids that need your backend to resolve 
  > into a person identification

  • Brandon0 13 years ago

    Tracking an individual is different from storing personally identifiable information. I can assign you an arbitrary (or seemingly arbitrary) userID that is unique to you but does not personally identify you, as a way to track you. This arbitrary userID is meaningless to any third parties. What I cannot assign you is your name, email address, or even IP address as a way to track you, since anyone who sees that information could figure out who it belongs to.
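
One way to picture the distinction, as a hedged sketch: the backend keeps a private mapping from account to a random identifier, and only the random identifier is ever sent to the analytics service. The `UserIdRegistry` class and its method names are hypothetical, and an in-memory dict stands in for whatever storage the backend really uses.

```python
import uuid


class UserIdRegistry:
    """Backend-only mapping between real accounts and opaque tracking IDs."""

    def __init__(self):
        self._by_email = {}
        self._by_tracking_id = {}

    def tracking_id_for(self, email):
        """Return a stable random ID for this account, creating it on first use."""
        if email not in self._by_email:
            tracking_id = uuid.uuid4().hex  # random, carries no personal data
            self._by_email[email] = tracking_id
            self._by_tracking_id[tracking_id] = email
        return self._by_email[email]

    def resolve(self, tracking_id):
        """Map a tracking ID back to the account; only the backend can do this."""
        return self._by_tracking_id.get(tracking_id)


registry = UserIdRegistry()
alice_id = registry.tracking_id_for("alice@example.com")
bob_id = registry.tracking_id_for("bob@example.com")
```

The value sent to analytics (`alice_id`) is useless to a third party without the registry, which matches the forum quote about IDs "that need your backend to resolve".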

jamiequint 13 years ago

This article is really making a big deal out of nothing. All the "major issues" brought up here only create problems in edge cases. When you're trying to drive growth or understand your users (the purpose of metrics at the end of the day) you should not be focused on edge cases.

In most cases the reason you care about tracking logged-out -> logged-in behavior is to measure onboarding behavior, understanding what the user does pre-signup so you can do a better job of driving signups. Signup is not a multi-client process in the common case so being able to track multi-client behavior pre-signup doesn't really matter at all.

  • duwip (OP) 13 years ago

    Agreed, these are edge cases. They did create a lot of questions for me though, and made the whole thing rather confusing as a user.

    As to how much of an issue these edge cases represent, I find it hard to get a real sense of it. I guess it really depends on the situation, what you want to measure and the user experience you offer to your visitors.

taf2 13 years ago

My gripe about Google Universal Analytics, or analytics.js vs ga.js, is broken backwards compatibility (cookie data is no longer stored in the same way). This was an interface many add/systems used and depended on from the days of Urchin.

Otherwise, new interface is pretty slick, features look good, the API to send data server side is so much nicer.

broken compatibility just kinda sucks though

jdangu 13 years ago

> For one, there can’t be 2 [clientID, userID] couples with the same userID: with the way mixpanel does things, this is essentially a technically impossible scenario (...) And yet one user can access your site through different clients, leading to a systematic overestimation of the number of visitors hitting your site.

Really? Can anyone confirm this behavior? I'm pretty sure KissMetrics doesn't have this limitation.

  • losvedir 13 years ago

    Indeed, and this is why we ended up choosing KM over MP. With KM you just "identify" a visitor whenever you want and if there's already another anonymous cookie, it'll tie together all events retroactively. We couldn't find an easy way to do this with MP when we looked at it.

    • duwip (OP) 13 years ago

      Yep, it would seem that KISSMetrics has a better implementation where aliasing can be called several times (as stated here: http://support.kissmetrics.com/apis/common-methods.html).

      The fact that it links accounts retroactively, though, can be dangerous in the scenario of publicly accessed devices. I'll have to admit, though, that this is not the common case.

      I guess my personal gripe with what MP and KM are doing boils down to: if you can't infer stuff about who is visiting my website, be honest about it and don't.
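
To make the aliasing discussion concrete, here is a toy model, not KISSMetrics' or Mixpanel's actual implementation: events are stored under an anonymous client ID, and a later "identify" call maps that client to a user, so earlier events are attributed retroactively. The `EventStore` class and all names in it are hypothetical.

```python
class EventStore:
    """Toy event store where several client IDs may alias to one user ID."""

    def __init__(self):
        self.events = []  # list of (client_id, event_name)
        self.alias = {}   # client_id -> user_id

    def track(self, client_id, name):
        self.events.append((client_id, name))

    def identify(self, client_id, user_id):
        # May be called for any number of clients with the same user ID.
        self.alias[client_id] = user_id

    def visitor_of(self, client_id):
        # Unidentified clients count as their own visitor.
        return self.alias.get(client_id, client_id)

    def unique_visitors(self):
        return len({self.visitor_of(client) for client, _ in self.events})

    def events_for(self, user_id):
        # Includes events recorded before identify() was called.
        return [name for client, name in self.events
                if self.visitor_of(client) == user_id]


store = EventStore()
store.track("laptop-cookie", "viewed_pricing")
store.track("phone-cookie", "viewed_home")
before = store.unique_visitors()

store.identify("laptop-cookie", "user-42")
store.identify("phone-cookie", "user-42")
after = store.unique_visitors()
```

With aliasing allowed for multiple clients, the two devices collapse into one visitor. The same retroactive lookup is what makes a shared device risky: anything tracked on it before login gets attributed to whoever identifies there.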

KaoruAoiShiho 13 years ago

Last I checked, user-based analytics is directly against the Google TOS. You are not supposed to store any identifying information about specific users, probably because Google has been under privacy scrutiny. So not only is Google not for user-based tracking, they prohibit it, making them a real non-starter in any case.

  • duwip (OP) 13 years ago

    Check out the Google I/O video I mention in the article if you need convincing. As far as not collecting user data for privacy reasons, I think Brandon0's comment says it all.
