
UTM hygiene at 80 campaigns: a field report

Henk van Biljon · 9 min read

If you want to measure marketing at the level of individual channels and campaigns, UTMs are the join keys that connect a click to a customer inside your own systems. When those keys are inconsistent the joins fail quietly, and every number built on top of them inherits the damage without anyone noticing for months. This is a field report on what it took to impose order on roughly eighty live campaigns spread across several platforms, including the things that broke along the way and the convention that eventually held.

We are writing this down because the canonical advice on UTMs, which amounts to "be consistent", is both entirely true and completely useless in practice. Consistency is trivial when you have three campaigns and one person building the links. With eighty campaigns running across several platforms, the platforms' own auto-tagging colliding against hand-written tags, and more than one person creating links on any given day, consistency stops being a matter of discipline and becomes an engineering problem, one that has to be solved with systems rather than good intentions.

What the audit actually showed

The first honest step was to stop assuming the tags were fine and instead pull every distinct UTM combination that had reached the site over the trailing year. The result was the usual horror, which we suspect you will recognise almost line for line if you run the same query against your own data:

  • utm_source arriving as facebook, Facebook, FB, fb, meta and facebook.com, every one of them meaning the same platform.
  • utm_medium being used for at least three mutually incompatible ideas at once, sometimes the channel type such as cpc or social, sometimes the funnel stage such as retargeting, and on one memorable occasion the name of the agency that had built the link.
  • utm_campaign carried as free text, with a single real campaign spelled four different ways because four links had been built on four different days by whoever happened to be awake.
  • Spaces, capital letters and the occasional emoji, each of which turns into a different value the moment it is URL-encoded, so that one campaign quietly fragments into several rows that no human reading them would ever treat as distinct.
  • The platforms' own auto-tagging, meaning the click IDs the ad systems append for you, sitting alongside the manual UTMs and sometimes agreeing with them, sometimes contradicting them, with no rule anywhere for which of the two was meant to win.
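As an illustration only, an audit of this kind could be sketched in Python as below. It assumes the landing URLs are available as a plain list of strings; the function names utm_combo and audit and the sample URLs are hypothetical, and a real audit would more likely run as a query against the warehouse.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

UTM_KEYS = ("utm_source", "utm_medium", "utm_campaign")


def utm_combo(url: str) -> tuple:
    """Return the (source, medium, campaign) triple exactly as it was received."""
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # parse_qs decodes percent-encoding; take the first value per key, "" if absent.
    return tuple(params.get(key, [""])[0] for key in UTM_KEYS)


def audit(urls):
    """Count every distinct UTM combination, most frequent first."""
    return Counter(utm_combo(url) for url in urls).most_common()


# Hypothetical sample: one real campaign arriving under three spellings.
sample_urls = [
    "https://example.com/?utm_source=facebook&utm_medium=cpc&utm_campaign=spring-sale",
    "https://example.com/?utm_source=Facebook&utm_medium=cpc&utm_campaign=Spring%20Sale",
    "https://example.com/?utm_source=FB&utm_medium=retargeting&utm_campaign=spring_sale",
    "https://example.com/?utm_source=facebook&utm_medium=cpc&utm_campaign=spring-sale",
]

for combo, hits in audit(sample_urls):
    print(hits, combo)
```

No cleaning happens at this stage on purpose: the point of the audit is to see the values as they actually arrived, so that casing and spacing variants show up as the separate rows they really are.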

None of this was incompetence; it was simply entropy. Every one of those links had been perfectly reasonable at the moment it was created, and the mess was an emergent property of many reasonable decisions accumulating without a shared scheme, which is precisely why willpower never fixes it and a system eventually has to.

The convention that actually held

We will not pretend the first scheme we wrote was the one that lasted, because the first attempt was too clever and tried to encode too much, and people went back to quietly ignoring it inside a fortnight on the entirely rational grounds that ignoring it was less work than complying with it. The version that genuinely held in the end was deliberately boring, and a handful of ruthlessly simple rules carried almost all of the weight.

Lowercase always, and no spaces ever, with hyphens between words. This single rule eliminated more phantom duplicates than everything else we did put together, for the simple reason that casing and spacing had been generating the bulk of the fragmentation in the first place.
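A minimal sketch of that rule as a function, for illustration. The name normalize_token is hypothetical, and dropping every character outside a-z, 0-9 and the hyphen (which is what removes emoji here) is an assumption rather than something the convention spells out.

```python
import re


def normalize_token(raw: str) -> str:
    """Lowercase a value and join its words with single hyphens."""
    token = raw.strip().lower()
    # Any run of whitespace becomes one hyphen.
    token = re.sub(r"\s+", "-", token)
    # Assumption: anything outside a-z, 0-9 and "-" is dropped (emoji, punctuation).
    token = re.sub(r"[^a-z0-9-]", "", token)
    # Collapse doubled hyphens left behind by dropped characters, trim the ends.
    return re.sub(r"-{2,}", "-", token).strip("-")


print(normalize_token("Spring Sale"))   # spring-sale
print(normalize_token("  SPRING   sale "))  # spring-sale
```

Note that this only fixes form. It turns FB into fb, but it cannot know that fb and facebook are the same platform; that is a vocabulary question, not a formatting one.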

utm_source is the platform, drawn from a fixed list. It is never a description, a vendor or a best guess, but one canonical token per platform, written down in advance, with anything outside the list simply disallowed.

utm_medium is the channel type, drawn from a fixed list such as cpc, paid-social, email and display, plus a small handful more. The funnel stage and the audience are both explicitly kept out of this field, because those two were the most common abuses we found and each of them needed a proper home of its own elsewhere.

utm_campaign is structured rather than prose. It is built from a small number of fields joined by a separator in a fixed order, encoding only the dimensions we already knew we would want to segment on later: a short campaign code, the market or geography, and the funnel stage that kept trying to sneak itself back into medium. Everything else that people understandably wanted to record was sent into utm_content, which we deliberately treated as the free-text escape valve, so that the structured fields could be kept clean without anyone feeling they had nowhere to put the detail they cared about.
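To make the structured campaign field concrete, here is a hedged sketch. The three fields come from the description above, but the underscore separator, the field order, the names Campaign, build_campaign and parse_campaign, and the sample values are all assumptions made for the example.

```python
from typing import NamedTuple

SEPARATOR = "_"  # assumption: the post does not say which separator was used


class Campaign(NamedTuple):
    code: str    # short campaign code
    market: str  # market or geography
    stage: str   # funnel stage


def build_campaign(campaign: Campaign) -> str:
    """Join the fields in their fixed order, refusing empty or ambiguous parts."""
    for field in campaign:
        if not field or SEPARATOR in field:
            raise ValueError(f"invalid campaign field: {field!r}")
    return SEPARATOR.join(campaign)


def parse_campaign(value: str) -> Campaign:
    """Split a utm_campaign value back into its fields."""
    parts = value.split(SEPARATOR)
    if len(parts) != len(Campaign._fields):
        raise ValueError(f"expected {len(Campaign._fields)} fields, got {len(parts)}")
    return Campaign(*parts)


value = build_campaign(Campaign("spring-sale", "nl", "retargeting"))
print(value)                  # spring-sale_nl_retargeting
print(parse_campaign(value))
```

Keeping the separator distinct from the hyphen used inside words is what makes the value splittable again later, which is the whole reason for structuring it.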

The entire specification fit onto a single page, and that turned out to matter more than any individual rule, because a convention that nobody can hold in their head is a convention that nobody actually follows.

Enforcing it at the point of link creation

A naming convention living in a shared document is only ever a suggestion, and suggestions reliably lose to deadlines, so the thing that genuinely changed behaviour was moving the enforcement to the one moment that mattered, which was the creation of the link itself.

We built a small link builder, which was honestly just a spreadsheet to begin with and only later became a proper internal tool. You selected the source and the medium from dropdowns containing the allowed values, filled the campaign fields into separate cells, and it assembled the finished URL for you. There was simply no way to type Facebook, because facebook was the only option the list would offer. That pushed the cost of doing it correctly below the cost of doing it wrong, and lowering that relative cost is about the only thing we have ever seen reliably change how busy people behave under deadline.
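The same idea expressed in code rather than a spreadsheet might look like the sketch below. It is not the tool described here: the function name build_link and the source list are hypothetical, and the medium list contains only the four values named earlier.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

ALLOWED_SOURCES = frozenset({"facebook", "google", "linkedin"})          # hypothetical list
ALLOWED_MEDIUMS = frozenset({"cpc", "paid-social", "email", "display"})  # the four named in the post


def build_link(base_url: str, source: str, medium: str, campaign: str, content: str = "") -> str:
    """Assemble a tagged URL, rejecting any source or medium outside the lists."""
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"utm_source {source!r} is not in the allowed list")
    if medium not in ALLOWED_MEDIUMS:
        raise ValueError(f"utm_medium {medium!r} is not in the allowed list")
    parts = urlsplit(base_url)
    # Keep any query parameters the landing page already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [("utm_source", source), ("utm_medium", medium), ("utm_campaign", campaign)]
    if content:
        query.append(("utm_content", content))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(build_link("https://example.com/quote", "facebook", "paid-social",
                 "spring-sale_nl_retargeting"))
```

Calling build_link with "Facebook" raises a ValueError instead of producing a link, which is the code equivalent of a dropdown that never offers the wrong spelling.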

Sitting alongside the builder was a validator, a scheduled check that scanned the incoming UTM values against the allowed vocabulary and flagged anything that failed to conform, together with the offending links and where they were running. In its first week it lit up like a fire alarm, then settled down within about a month to almost nothing, because the builder was catching the bulk of the problem at source while the validator caught the strays that slipped around it. The validator never stopped earning its place, though. Every new platform we added and every new contractor who joined reintroduced a little fresh entropy, and catching that within days rather than discovering it a quarter later was the whole difference between keeping a clean table and slowly sliding back into the swamp.
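A validator along these lines can be sketched as follows. Everything specific in it is an assumption for illustration: the row shape (dicts with landing_url and platform keys), the source list, and the campaign pattern of three lowercase fields joined by underscores. Scheduling is left out; this is only the check itself.

```python
import re

ALLOWED_SOURCES = frozenset({"facebook", "google", "linkedin"})          # hypothetical list
ALLOWED_MEDIUMS = frozenset({"cpc", "paid-social", "email", "display"})  # the four named in the post
# Assumption: three lowercase, hyphenated fields joined by underscores.
CAMPAIGN_SHAPE = re.compile(r"[a-z0-9-]+(?:_[a-z0-9-]+){2}")


def failing_fields(row: dict) -> list:
    """Name the UTM fields in one row that break the convention."""
    failures = []
    if row.get("utm_source") not in ALLOWED_SOURCES:
        failures.append("utm_source")
    if row.get("utm_medium") not in ALLOWED_MEDIUMS:
        failures.append("utm_medium")
    if not CAMPAIGN_SHAPE.fullmatch(row.get("utm_campaign") or ""):
        failures.append("utm_campaign")
    return failures


def validate(rows):
    """Return one finding per offending row: the link, where it ran, what failed."""
    report = []
    for row in rows:
        failures = failing_fields(row)
        if failures:
            report.append({
                "landing_url": row.get("landing_url"),
                "platform": row.get("platform"),
                "fields": failures,
            })
    return report


incoming = [  # hypothetical rows
    {"utm_source": "facebook", "utm_medium": "paid-social",
     "utm_campaign": "spring-sale_nl_retargeting",
     "landing_url": "https://example.com/quote", "platform": "facebook"},
    {"utm_source": "FB", "utm_medium": "retargeting", "utm_campaign": "Spring Sale",
     "landing_url": "https://example.com/quote", "platform": "facebook"},
]

for finding in validate(incoming):
    print(finding)
```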

The backfill, which was most of the work

Clean new links do nothing whatsoever for the year of history you already have, and that history is the large majority of what you will actually analyse, so the backfill is where most of the real effort lived. The obvious temptation is to go back and re-tag the old campaigns in place, and it is worth resisting firmly, because re-tagging anything that has already run severs the connection between what was genuinely recorded at the time and what you have since altered, and you can never afterwards fully trust a number built on top of that.

The approach that worked instead was a mapping table, where we enumerated every distinct dirty value the audit had surfaced and mapped each one, by hand for all the ambiguous cases, onto its clean canonical equivalent, so that FB, Facebook and meta all resolved to facebook and so on down several hundred rows. That table lived in the warehouse and every query that touched marketing data joined through it, which meant the old dirty data and the new clean data both resolved to the same dimensions at read time, and the original raw records were never mutated in the process.
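The read-time join can be illustrated with an in-memory SQLite database standing in for the warehouse. The table names raw_clicks and source_map, their columns, and the sample rows are hypothetical; only the FB, Facebook and meta mappings come from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
conn.executescript("""
    CREATE TABLE raw_clicks (click_id INTEGER PRIMARY KEY, utm_source TEXT);
    CREATE TABLE source_map (dirty TEXT PRIMARY KEY, clean TEXT NOT NULL);

    INSERT INTO raw_clicks VALUES
        (1, 'FB'), (2, 'Facebook'), (3, 'meta'), (4, 'facebook'), (5, 'mystery');

    INSERT INTO source_map VALUES
        ('FB', 'facebook'), ('Facebook', 'facebook'),
        ('meta', 'facebook'), ('facebook', 'facebook');
""")

# Resolve dirty values to clean ones at query time. The LEFT JOIN keeps rows with
# no mapping visible as '(unmapped)' instead of silently dropping them.
rows = conn.execute("""
    SELECT COALESCE(m.clean, '(unmapped)') AS source, COUNT(*) AS clicks
    FROM raw_clicks AS r
    LEFT JOIN source_map AS m ON m.dirty = r.utm_source
    GROUP BY source
    ORDER BY clicks DESC
""").fetchall()

print(rows)

# The raw record is untouched: click 1 still says 'FB'.
print(conn.execute("SELECT utm_source FROM raw_clicks WHERE click_id = 1").fetchone())
```

Because the mapping is data rather than an UPDATE, a wrong judgement call can be corrected later by editing one row of the map, and every query picks up the fix without the history ever having been rewritten.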

We want to be honest about the parts of the backfill that simply did not resolve cleanly. Some of the old utm_campaign free text was genuinely ambiguous: two real and different campaigns had been handed the same string at the time, and there was no surviving way to tell them apart after the fact. We made a documented judgement call and accepted a known small error rather than pretending to a level of precision the data could not actually support. Anyone who tells you that a historical backfill of this kind comes out perfect has either never done one or is not counting the rows they quietly threw away along the way.

What it bought

The point of the whole exercise was never tidy reports for their own sake; it was the join. Once the keys were clean and stable, the marketing data finally connected reliably to the CRM and through it to the policy outcomes, and that connection is the precondition for every number actually worth having, including cost per bound policy by genuine channel, retention by acquisition source, and the unit economics we work through in The Seduction of CPL as a Metric. Before the cleanup those joins would run, return a confident-looking number and quietly lie about it, whereas afterwards they ran and the number actually meant something.

It also made the whole platform-versus-CRM reconciliation tractable for the first time, because the CRM side of that comparison was now built on keys you could trust. The gap left over was the real, structural gap we describe in Why your ad platform numbers will never match your CRM, no longer that gap plus a fog of tagging noise layered on top and disguising where the true problem actually sat.

If you are about to do this

A few things we would tell anyone starting out, in roughly the order they matter:

  1. Audit before you design anything, by pulling every distinct UTM combination you have genuinely received, so that you are designing the convention against the real mess in your own data rather than against a tidy textbook example that does not resemble it.
  2. Keep the convention boring and short, ideally to a single page, with fixed lists for source and medium, structure imposed only on the dimensions you will truly segment on, and a deliberate escape valve for everything else so that nobody feels driven to abuse the structured fields.
  3. Enforce it at the point of creation, because a link builder with dropdowns will beat a written style guide every single time by making the correct thing also the easy thing to do.
  4. Validate it continuously, since the entropy returns with every new platform and every new person who joins, and a standing validator is what keeps that creep in check before it compounds.
  5. Backfill with a mapping table and never in place, resolving dirty to clean at query time, documenting honestly the calls you cannot make cleanly, and counting rather than hiding the rows you end up having to drop.

This is not exciting work and it does not demo well in a meeting, but it is the foundation underneath every attribution claim you will ever go on to make, and choosing to skip it does not actually remove the cost so much as defer it to the day some number you had been trusting all along turns out to have been noise the entire time.

Frequently asked questions

What is a good UTM naming convention?

A controlled vocabulary rather than free text, meaning lowercase only, no spaces, a fixed list of allowed sources and mediums, and a structured campaign field that encodes only the few dimensions you will actually segment on later. The exact scheme matters far less than the fact that it is written down, enforced at the moment a link is created, and never edited by hand inside the ad platform afterwards.

How do you fix inconsistent UTMs across many live campaigns?

Freeze the convention first and enforce it on every new link, then backfill the history with a mapping table that translates the messy old values into the clean new ones, rather than re-tagging live campaigns in place. The lookup should live in your warehouse so that old and new data both resolve to the same clean dimensions at query time, without anyone ever mutating the original records.

Why does UTM hygiene matter for attribution?

UTMs are the keys your CRM and warehouse rely on to join a click back to a customer, so when those keys are inconsistent the joins either fail silently or, worse, succeed against the wrong rows, and every downstream number quietly inherits the error. Clean UTMs are the precondition for measuring anything per channel or per campaign with any honesty.