How to use UTM parameters to grow your audience
“Grow your audience” is a stretch here. Better to say “...to see where traffic comes from.”
If you want to go a step further and see which traffic sources lead to form completions (like newsletter or app signups), I made a utility that captures UTM parameters and then inserts them into any form submitted during that session: https://github.com/gkogan/sup-save-url-parameters
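For anyone curious what a tool like that does under the hood, the general idea is roughly this (a simplified sketch, not the repo's actual code):

```typescript
// Capture utm_* parameters on landing, keep them for the session, and copy them
// into hidden fields of any form submitted during that session.
const UTM_KEYS = ["utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"];

// On page load: stash any UTM parameters present in the URL for this session.
const params = new URLSearchParams(window.location.search);
for (const key of UTM_KEYS) {
  const value = params.get(key);
  if (value) sessionStorage.setItem(key, value);
}

// On submit: inject the stored values as hidden inputs so they ride along with the form.
document.addEventListener("submit", (event) => {
  const form = event.target as HTMLFormElement;
  for (const key of UTM_KEYS) {
    const value = sessionStorage.getItem(key);
    if (!value || form.elements.namedItem(key)) continue;
    const input = document.createElement("input");
    input.type = "hidden";
    input.name = key;
    input.value = value;
    form.appendChild(input);
  }
});
```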
I made a startup to try to answer the next question: connect incoming traffic to conversion events and assign credit. This is where you'll get a real bottom-up view of your marketing effectiveness.
That's really useful! It's actually surprising that something like this doesn't come as a default with Mailchimp forms. Just forked it and will give it a try. Thank you!
I'm pretty sure I saw this somewhere in one of the other marketing automation tools but yes it's definitely not common.
Kind of related, I saw a webinar by Ontraport recently and they were going a step further by saving Google Analytics client IDs against contacts for deep-funnel conversion tracking. The idea is that they don't really care who signs up for a free trial; they want to know who upgrades to paid within the trial period, and by then the attribution is usually all messed up or they don't know who is who. So saving those details and manually calling Google's APIs helps track this.
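The client-ID trick looks roughly like this with analytics.js (just a sketch; the hidden field name is made up):

```typescript
// Read the GA client ID from the default tracker and stash it in a hidden signup-form
// field (the field name "ga_client_id" is only an example). The CRM can then use the
// stored ID later for out-of-band calls to Google's reporting APIs.
declare const ga: (callback: (tracker: { get(field: string): string }) => void) => void;

ga((tracker) => {
  const clientId = tracker.get("clientId");
  const field = document.querySelector<HTMLInputElement>('input[name="ga_client_id"]');
  if (field) field.value = clientId;
});
```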
It seems there are still a lot of little tricks out there that aren't standard practice.
Yea, that combined with the User Activity API is incredibly handy. Especially useful to pull out when some anomaly occurs, and stakeholders suddenly want a particular conversion under the microscope (an unexpected enterprise user, unusually large order value, an unexpectedly frequent recurring customer, etc).
That said, all of the browsers are becoming so aggressive about cookie management that relying on the Google-set clientID is asking for pain at this point, unless you've implemented something like this[1] cookie-setting service/proxy/relay. Creating your own first-party identifier (even if it's just randomly generated, the same as the GA clientID) and leveraging the user-id field is far more stable.
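A rough sketch of what I mean by rolling your own identifier (cookie name and property ID are placeholders, and ideally the cookie is set server-side rather than from script):

```typescript
// Generate a random visitor ID once, persist it in a first-party cookie, and hand it
// to GA explicitly rather than relying on the GA-set _ga cookie surviving ITP.
function getOrCreateVisitorId(): string {
  const match = document.cookie.match(/(?:^|;\s*)my_vid=([^;]+)/);
  if (match) return match[1];
  const id = `${Date.now()}.${Math.floor(Math.random() * 1e9)}`; // same shape as a GA client ID
  // Note: script-written cookies are still capped by ITP; setting it from your server is sturdier.
  document.cookie = `my_vid=${id}; path=/; max-age=${60 * 60 * 24 * 730}; SameSite=Lax`;
  return id;
}

// With gtag.js, the identifier can be supplied explicitly (the property ID is a placeholder):
declare function gtag(...args: unknown[]): void;
const visitorId = getOrCreateVisitorId();
gtag("config", "G-XXXXXXX", {
  client_id: visitorId,
  user_id: visitorId, // if you also want to leverage the User-ID feature
});
```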
> It seems there are still a lot of little tricks out there that aren't standard practice.
Simo Ahava's blog (linked to in [1]) is a wealth of nifty web analytics tricks (primarily focused around Google Tag Manager, Google Analytics, and a sprinkling of Snowplow[2] for the self-hosted crowd). The blog is geared toward a less technical audience, but the vast majority of content is around technical nuances of analytics implementations and fairly minor coding adjustments that can significantly enhance a standard implementation.
[1] https://www.simoahava.com/google-cloud/create-cookie-rewrite...
In the near future tracking won't be as easy as throwing a script from Facebook into a blog header. I wonder where it's leading. One guess I had is people will be plugging in web service calls to get the info behind the scenes and generate a first party cookie. Not sure if that capability exists yet but it does sound like your first link so I'll have a read, thanks.
Edit: I had a look through that first link. Nice hack, but it's definitely out of reach of most small businesses. That's kind of how these things go: big players have the resources to jump a hurdle without slowing down, and everyone else is left behind.
For purely analytics systems, it's already pretty easy to leverage server-side calls or an internal proxy service to route the data, which does truly mask where the data is going (but should still be disclosed in your privacy policy).
And many of them offer a hybrid approach called CNAME cloaking[1], where you CNAME a subdomain on your host to the analytics/marketing system. That way you still leverage their infrastructure but they gain access to the first-party context. Here[2] is an example of that for Adobe Analytics.
Google Analytics doesn't officially support a CNAME implementation, so the above doesn't apply. The cookie service I referenced in the parent comment is a workaround for that. The GA code is still all third-party, but you create an internally hosted microservice that will set cookies when called out to. You hit that service (even via a third-party tool like Google Tag Manager), it sets an appropriately named first-party cookie in its response, and you suppress GA from triggering its usual cookie-setting behavior (which would overwrite the first-party cookie and get hit with ITP restrictions).
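The shape of that microservice is something like this (a bare-bones sketch, not the linked implementation; the cookie name is made up):

```typescript
// A tiny service on your own domain that, when called, sets a long-lived first-party
// cookie in its response and echoes the value back so the page can feed it to GA.
import * as http from "http";
import { randomUUID } from "crypto";

http.createServer((req, res) => {
  // Reuse the visitor's existing ID if the cookie is already present, otherwise mint one.
  const existing = /(?:^|;\s*)fp_cid=([^;]+)/.exec(req.headers.cookie ?? "");
  const cid = existing ? existing[1] : randomUUID();

  res.writeHead(200, {
    // Server-set (not document.cookie), so ITP's cap on script-written cookies doesn't apply.
    "Set-Cookie": `fp_cid=${cid}; Path=/; Max-Age=${60 * 60 * 24 * 730}; Secure; SameSite=Lax`,
    "Content-Type": "application/json",
  });
  res.end(JSON.stringify({ clientId: cid }));
}).listen(8080);
```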
That said, ad networks are a different beast entirely. Some of them offer a CNAME cloaking implementation option, but it's less common than you'd expect. And few if any of them allow internal proxying of the tracking data. There's simply too much potential for fraud and too little trust between parties. "Offline" conversion tracking is pretty commonly supported though, which involves having a site capture a click identifier that an ad network appends to an ad click, then in an out-of-band process, upload conversion activity to the ad network using that click id as a key. Precludes the super invasive browser-side tracking, while still allowing for attribution and media effectiveness analysis.
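The capture half of that is trivial; the upload half is specific to each ad network, so it's only stubbed here (the endpoint is a placeholder):

```typescript
// "Offline" conversion tracking, capture half: stash the click identifier the ad
// network appended to the landing URL (gclid is the Google Ads example) and save it
// alongside the lead, so conversions can be uploaded out-of-band later, keyed by it.
const clickId = new URLSearchParams(window.location.search).get("gclid");
if (clickId) localStorage.setItem("ad_click_id", clickId);

// Later, out-of-band: upload conversions keyed by the stored click id.
// The real endpoint and payload format depend entirely on the ad network.
async function uploadConversion(storedClickId: string, value: number): Promise<void> {
  await fetch("https://example.com/your-ad-network-conversion-upload", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ click_id: storedClickId, conversion_value: value }),
  });
}
```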
[1] https://dev.to/dnsadblock/cname-cloaking-or-how-are-we-being...
[2] https://docs.adobe.com/content/help/en/id-service/using/refe...
How are you seeing companies solve for view through tracking in this landscape?
From which perspective?
For ad networks: the solve seems to be more disclosure from advertisers of customer PII[1]. All of the recent tracking prevention and cookie restriction measures are disruptive to last-mile analytics, but are far less disruptive to the overall strengths of device and identity graphs. Offline/out-of-band data feeds for click-through tracking don't require passing any identity data, since the click id acts as a key to connect the result to the associated click action. You don't have that for view-through, so instead you pass identity data and that's used to associate the result with the network's device/identity graph and attribute it to any relevant view through action that occurred. But because the advertiser is blind to which conversions/results may be relevant, they have to disclose all results and associated identity data in the process.
Caveat emptor: The above is based on my observations working in digital analytics in general, but my primary focus is in a different area. So there may be nuances or aspects that I'm not cognizant of.
For advertisers: The three main options tend to be blissfully ignoring (or consciously accepting) the visibility gap, moving toward the greater data disclosure of the above solution if you (and your lawyer) are comfortable doing so (which maintains the status quo for visibility while disclosing significantly more data to ad networks), or investing internally in the tech and resources to perform the tracking themselves (which gives the advertiser visibility, but keeps the ad network blind). For option three, you can self-host something like Snowplow[2] and abuse it as a poor man's ad server for tracking purposes. The CloudFront implementation model gives you the throughput and latency to let you use it as a view-through pixel, and you can then put it on any placements/networks that allow an advertiser-provided tracking pixel.
[1] In hashed form, for what it's worth. The liability of disclosing raw PII is too black and white for comfort, but the security theater of obfuscated PII via hashing hasn't been tested in court well enough to put a dent in that practice.
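For clarity, "hashed form" usually means something as simple as this (a sketch; normalization rules vary by network):

```typescript
// Normalize the identifier (trim + lowercase for an email address), then hash it
// before it leaves your systems, so only the digest is disclosed.
import { createHash } from "crypto";

function hashEmail(email: string): string {
  const normalized = email.trim().toLowerCase();
  return createHash("sha256").update(normalized).digest("hex");
}

// hashEmail("  Jane.Doe@Example.com ") -> a 64-char hex digest, never the raw address
```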
I was actually curious about all of them as I've been on both the buy and sell side. But I'd like to dig further on the buy side.
I've seen what happens with the consciously-accepting route: not pretty. It furthers distrust of a channel that many already have inherent concerns about and varying degrees of understanding of, which is a dangerous combo.
Giving the data is what I suspect many will do unless they have sufficient resources for said legal team and technology, or care a great deal about leaking data.
The last option is interesting. So I've used Snowplow (was actually an early user that sponsored them adding UTM remapping). I'm curious how you approached using it as a poor-man's ad server with Cloudfront which I'm less familiar with. Are there any technical write-ups you could point me to?
This was a big part of the reason we built our repo https://github.com/posthog/posthog to enable first-party analytics. It grabs UTM tags all the way through to individual user behavior in your app and then provides an analytics UX on top. Disclaimer: I'm one of the founders.
But can you do anything about view-through data?
You can also check out https://github.com/medius/utm_form which can handle a lot of use cases.
(Disclosure: I created this to help my customers at https://www.terminusapp.com)
Nice product. I came across it recently when researching other accessible methods for capturing UTM parameters. Yours was the only other one I found.
I’ve used multiple marketing platforms, some costing $xx,000/year, and still haven’t seen this feature.
Well, those parameters are intended for GA to parse, so there are several reasons for analytics competitors not to add them:
1) they make it easier for the marketer to set up campaigns with a paid competitor, and then drop the subscription for that software and monitor them with GA only
2) they make GA top-of-mind for marketers every time they look at a link with UTM in it, even in somebody else's marketing dashboard (free advertising for GA)
3) GA gets to collect campaign data, freeriding on the analytics competitor
Nice work! Have you considered using localStorage instead of sessionStorage so it works across browser tabs as well?
I did consider it recently, but decided against it as it would mix last-touch with last-tagged-touch attribution. If you fail to tag a campaign link, it's better for the conversion to have no data, so you recognize the error and tag the link.
If I were to do localStorage I would only do it if the first touch and last touch tags (even if empty) were saved as separate values. That just means slightly more coding which I haven’t gotten around to. Pull requests welcome :)
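Roughly what I mean by saving them as separate values (a sketch; the storage keys are made up):

```typescript
// Keep first-touch and last-touch UTM sets separately: first_touch_utm is written
// only once; last_touch_utm is overwritten on every visit that actually carries
// utm_* parameters (untagged visits leave both alone).
const UTM_KEYS = ["utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"];
const params = new URLSearchParams(window.location.search);
const current: Record<string, string> = {};
for (const key of UTM_KEYS) {
  const value = params.get(key);
  if (value) current[key] = value;
}

if (Object.keys(current).length > 0) {
  if (!localStorage.getItem("first_touch_utm")) {
    localStorage.setItem("first_touch_utm", JSON.stringify(current));
  }
  localStorage.setItem("last_touch_utm", JSON.stringify(current));
}
```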
Does that difference have implications under GDPR or California's new law?
No, it’s stored on the user’s own computer only. You could put an expiry date in there as well so you don’t accidentally use it after, say, 90 days.
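The expiry bit can be as simple as storing a timestamp alongside the value (again just a sketch):

```typescript
// Store a timestamp next to the value and ignore anything older than, say, 90 days
// when reading it back.
const NINETY_DAYS_MS = 90 * 24 * 60 * 60 * 1000;

function writeWithTimestamp(key: string, value: string): void {
  localStorage.setItem(key, JSON.stringify({ value, storedAt: Date.now() }));
}

function readIfFresh(key: string): string | null {
  const raw = localStorage.getItem(key);
  if (!raw) return null;
  const { value, storedAt } = JSON.parse(raw) as { value: string; storedAt: number };
  return Date.now() - storedAt < NINETY_DAYS_MS ? value : null;
}
```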
I occasionally change them to something random, like "potato_world_weekly_back_cover". Figure it might provide a little entertainment to bored marketeers.
Bored marketeer here. It’s much appreciated :-D.
I run a newsletter company with almost 500k subscribers. When I started out 10 years ago, I was a fan of Google Analytics so added these parameters to all links in my newsletters. This was uncommon at the time and it turned out to have a huge impact on our growth as lots of webmasters were glued to their analytics at the time, wondered who we were, and Googled the name of our newsletters! I got emails or tweets every week saying as much and thanking us for linking to them, etc.
In the last five years it hasn't come up at all, as everyone is doing it and people seem to dig through their analytics less than ever before, but if it helps you track links to your own content, as shown in the article, it's certainly worth a go.
Another amusing point is that HN didn't use to strip these parameters, so sometimes I could see when people had reposted things from our newsletters onto HN (and kept the utm params in), which was always a buzz :-)
Just to add to Peter's comment, the one issue with links via email/newsletters is that they show up as 'direct' traffic with no referral, so they're nearly impossible to link back to anything without the utm params.
Yeah, "direct" traffic is so elusive. It's just a big bucket of visitors. It's a pity because I would have loved to reach out to folks who send this traffic and thank them, or ask for their thoughts on the product...etc.
Mailchimp adds their own trackers on top of the links you insert into the newsletter, which they use for tracking open/click rates, but that's almost less interesting than the source.
Why does it matter that I have a 36% click rate on the last issue? Is it good? What if instead I have a 10% click rate, but one of those people shared it with 10k more?
I understand why in general people don't like tracking, especially here on Hacker News, but I think some of it could lead to much better outcomes for all parties involved. I wonder what other solutions there are to help piece together your audience, and their interests.
You can make a dummy page for each email link that simply redirects to the correct page with the proper tracking information.
Just about any email platform that does link tracking uses this approach and many can set custom UTMs for you automatically with those tracking redirects.
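A bare-bones version of such a redirect looks something like this (hostnames, parameter names, and UTM values are made up):

```typescript
// Link-tracking redirect in the style email platforms use: log the click, append the
// campaign parameters, and 302 the reader on to the real destination.
import * as http from "http";
import { URL } from "url";

http.createServer((req, res) => {
  const incoming = new URL(req.url ?? "/", "https://links.example.com");
  const destination = incoming.searchParams.get("to"); // e.g. /r?to=https%3A%2F%2Fexample.org%2Fpost
  if (!destination) {
    res.writeHead(400);
    res.end("missing destination");
    return;
  }

  const target = new URL(destination);
  target.searchParams.set("utm_source", "my-newsletter"); // example values
  target.searchParams.set("utm_medium", "email");
  target.searchParams.set("utm_campaign", incoming.searchParams.get("issue") ?? "unknown");

  // Log the click in aggregate here, then bounce the reader onward.
  res.writeHead(302, { Location: target.toString() });
  res.end();
}).listen(8080);
```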
I think that is exactly what Peter ended up doing for his newsletters.
We have a redirect so we can measure traffic in aggregate (we don't track who clicks what) to refine our content over time. But.. I don't believe we get set as the referrer the way we do it (30x redirects). I believe this is why Twitter's t.co redirector DOES use a true "dummy page" as then they get the referral. It's something that might be worth trying though..
Ah, okay... that makes sense and is a great setup. Did you run into any issues with spam filters when you made that switch?
Nope, it's what Mailchimp, etc., do. All the outgoing links just fly through their server on the same hostname. We do the same with our hostname.
I have evidence that Google follows these links though, because if you link to a "bad" site even through a redirector, Google notices and will throw you into spam or say you're phishing, etc :-)
Thanks Peter, I'm wondering, why did you stop using UTM params?
A handful of reasons all sorta collided.
First, we had some issues with certain sites not working at all if we added them, and since our link forwarder was doing it automatically, it made life difficult. We came up with a way to turn them off on an ad hoc basis but it was annoying.
Second, I didn't feel they were really moving the needle in any useful way. I don't think people are looking at their stats every day like they used to.
Third, it just felt like more litter/junk for tracking purposes. While there's no privacy aspect to it, I just felt like going a little cleaner in this regard. But.. I can't say they won't ever come back :-)
https://addons.mozilla.org/en-US/firefox/addon/utm-tracking-...
I have never observed a “utm” query param actually improve the quality of the response.
We all know why this obviously positive functionality isn’t built into the browser: because browser vendors rely on hostile business practices to survive. Still no technology to transact with the site you’re visiting for their content....
> I have never observed a “utm” query param actually improve the quality of the response.
You have it backwards. It's more tracking for the webmaster/marketer to know what was popular and what wasn't.
Reminds me of a time at my previous job where I was asked to add a utm_source to API client calls for one specific type of client. Less for marketing, more for just the product management side.
Yes, and despite being easily strippable this functionality is preserved, intentionally, by browsers.
I came across another article[1] on this and they state to "Never use UTM parameters on internal links (e.g., homepage sliders, internal banners, or internal links on blog posts). Clicking on those internal links will cause the current session to end and a new session to start—attributing the new session to the source/medium used on the internal link."
That's definitely a really common issue. And correcting internal links when that happens is always a painful process - the artificially inflated site traffic figures it creates have likely been celebrated and publicized internally, and correcting the analytics (and the subsequent drop in traffic stats) gets treated as if you actually killed real traffic to the site in the process.
Can you use your own domain as the `utm_source`, so that even if the session is being reset by an update in source, you will know the source is your own domain?
Might even be good, as a way to measure how well your internal links are working at keeping users within the property.
For example, put a link at the end of the blog post, and then track how many users clicked it, vs. how many exited on that page.
The two issues you run into are:
- The primary attribution model in Google Analytics is last non-direct visit prior to <conversion/goal/significant event>. Leveraging the source tracking parameters (utm_*) on internal links obliterates the usefulness of much of the built-in attribution-related reporting. Relatively recent enhancements do provide visibility into multi-channel funnel flows, but the capability is compartmentalized into a few purpose-built views and doesn't filter back to most of the core reporting.
- It also drastically skews engagement-related reporting, overall and at a source level. A new session in GA constitutes a completely new visit. So metrics like visits per user, new vs existing users, average visit duration, pages per visit, etc all get whacked in the process. And understanding user behavior and conversion paths from any given channel becomes somewhere between torturous and impossible, as the actions performed in the "new" session aren't visible when looking at the user flow of the original session.
What you're describing is possible and commonly done, you just can't do it with the utm parameters intended for source tracking without having disastrous side effects. Custom dimensions/metrics[1] or event hits[2] are the intended ways of doing that. For developers, Google's autotrack[3] library is useful for implementing that with fairly minimal dev overhead (mainly, seeding your markup with relevant data- attributes so event hits can be auto-populated and sent). There are also a lot of third party libraries for framework-specific integrations. And for non-developers, Google Tag Manager (or any other tag manager) makes event tracking pretty straightforward to implement without dev coordination.
[1] https://support.google.com/analytics/answer/2709828?hl=en
[2] https://support.google.com/analytics/answer/1033068?hl=en
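To make the event-hit option concrete, tracking clicks on an internal promo might look like this with gtag.js (analytics.js, autotrack, or GTM get you the same result; the data attribute and event names here are just examples):

```typescript
// Measure internal link clicks by sending an event hit instead of tagging the
// links with utm_* parameters, so sessions and attribution stay intact.
declare function gtag(...args: unknown[]): void;

document.querySelectorAll<HTMLAnchorElement>("a[data-internal-promo]").forEach((link) => {
  link.addEventListener("click", () => {
    gtag("event", "internal_promo_click", {
      event_category: "internal promotion",
      event_label: link.dataset.internalPromo, // e.g. "homepage-slider-3"
      transport_type: "beacon",                // let the hit survive the navigation
    });
  });
});
```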
Does it still get counted as a new session if the domain is an excluded referrer? I thought not.
In any case, I agree and these workarounds are never worth the data doubt they introduce. There are ways to track internal flows without utm tags.
That I'm not sure of. My hunch would be:
- If you use utm_source=mysite.com by itself it likely would, as the medium would default to referrer and the data sent to GA would be the same as if no explicit parameters were defined.
- If you use utm_medium and set it to anything but referrer, it wouldn't be suppressed. I.e. mysite.com / referrer would get suppressed by the exclusion list and processed as a direct hit, but mysite.com / email would not trigger the exclusion list.
- If you use utm_source=mysite.com, don't use utm_medium (so it defaults to referrer), but leverage other parameters such as utm_campaign, I have no idea how it'd handle that state. I could see it either overriding the exclusion filter due to the explicitly stated campaign, or working like the normal exclusion filter and silently eating the additional utm parameters in the process.
Now I'm curious though - will have to test it out on a dummy property and see what happens!
Neat URL is a Chrome extension that automatically removes all that cruft from URLs. Not affiliated with the author, just sick of people like this author littering in my address bar.
https://chrome.google.com/webstore/detail/neat-url/jchobbjgi...
I was looking for that. Now and then I manually add "Your_Mum" as a campaign and other profanities as utm_source
I don't understand why we have to store the entire dataset in the URL, when something like ?utm_id=1 (mapped to the others in some database) would do. It's stuff like this that prevents average users from "getting" URLs; they don't know what UTM is and assume it is important to get to the information.
This garbage breaks the UX of the internet.
You don't have to; GA actually does support precisely the scenario you're proposing - you can use a random campaign key/identifier in a utm_id field, and upload a custom dataset mapping to GA that contains the details to map to the other utm fields. This[1] support article walks through the process.
You just rarely ever see it because marketing execution is already an operational nightmare at most places (even more so at scale), and the centralized coordination required to leverage the campaign key method introduces a lot of friction, overhead, and potential bottlenecks into processes with little perceived gain. The decentralized model (i.e. entire dataset in the URL) is far easier to enforce compliance with between all parties, as you can provide tooling (e.g. a link builder spreadsheet) that allows each party to integrate the usage into their workflows with minimal overhead or ongoing coordination required.
[1] https://support.google.com/analytics/answer/6066741?hl=en
Thanks for the explanation. This really does make sense.
One other thing to note is that there are a few alternatives which tend to be a bit more feasible to implement (and sustain). Namely, using a single campaign key but leveraging an encoding scheme for the key/id rather than a randomly generated one. That way it can be parsed at collection time and every permutation doesn't have to be predefined in a dataset ahead of time. Then end users can still self-serve link creation with a template provided to them (they input the usual utm parameters they're used to, but it spits out a link with a single encoded key).
Then on the receiving end, you use lookup tables or functions to decode the various aspects of the key, and explicitly define the relevant utm parameters in the call to Google Analytics (which it natively supports as an alternative to implicitly sniffing them from the URL string). Or if your encoding scheme is simple enough, you can send the key directly to GA and use Advanced custom filters[1] to decode the utm parameters there.
[1] https://support.google.com/analytics/answer/1033162?hl=en
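To make the encoding idea concrete, a sketch (the delimiter, parameter name, and values are all made up for illustration):

```typescript
// The link builder joins the usual values into one compact key, and the site decodes
// it at collection time so the utm fields can be set explicitly on the analytics call.
interface CampaignInfo { source: string; medium: string; campaign: string; }

function encodeCampaignKey(info: CampaignInfo): string {
  return encodeURIComponent([info.source, info.medium, info.campaign].join("~"));
}

function decodeCampaignKey(key: string): CampaignInfo | null {
  const [source, medium, campaign] = decodeURIComponent(key).split("~");
  return source && medium && campaign ? { source, medium, campaign } : null;
}

// Link builder output: https://example.com/?ck=newsletter~email~april-launch
// At collection time:
const key = new URLSearchParams(window.location.search).get("ck");
const info = key ? decodeCampaignKey(key) : null;
if (info) {
  // Pass these explicitly to your analytics call, e.g. analytics.js's
  // campaignSource / campaignMedium / campaignName fields.
  console.log(info);
}
```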
I've implemented this approach for my teams before. It's always a giant PITA without lots of effort around tooling to enforce conventions. It's useful, though, if you're trying to marry analytics tools without reinventing the wheel, because you can take out-of-the-box UTMs and just transform them into some intelligible encoded format that fits into some other tool's sole field for user-defined data.
Web analytics as a whole is a giant PITA without lots of effort around tooling. The default approach just has a silent failure state that is harder to detect, and becomes more of a hassle for the analyst that has to cope with the resulting data.
At least with this approach, the PITA becomes more explicit and easier to justify the tooling needs, allowing for it to be addressed and tackled at the outset by the upstream implementation team, rather than cascading to downstream consumers.
There is no need to, but doing it with id numbers would add a lot of friction on the content creator's side so that's why it isn't like that. The 'some database' is the problem - who owns it?
Right now I can add whatever UTM parameters to any links I send without keying it into a database system, submitting a CR to the IT team, submitting it to Google, or any other nonsense. Details will be collected immediately whether through web logs, google analytics, or something else.
If one email that I send has links to multiple domains (some of which I don't control due to SaaS eating the web) then we'd either have to track and manage multiple IDs which change for each link or use a scary looking GUID generated by a centralised system (no thanks). That sounds like a real headache.
Hiding it behind an ID would make it prettier but it also reduces transparency to the end user of what is being collected. I think if it's going to be there anyway, I'd rather see it.
Thank you for explaining the verbosity in more detail.
Most browsers are switching to hiding the full page URL from the user anyhow, so although the parameters are making the URL ugly, it's not exactly breaking the UX.
The url is not the UX anymore.
This is why building "the obvious next steps" [1] is such an important concept. This is also why apps like Facebook or Pinterest have been so successful. Within this new UX, users are not looking to leave, they just want to find more information that is relevant to them.
[1] https://www.gkogan.co/blog/ridiculously-obvious-next-step/