
Solutions For Google Analytics Referrer Spam


Referrer SPAM is not new to Google Analytics; the team has been battling it for a number of years.  The SPAMmer's approach is as follows: buy a domain that sounds like a product or service that webmasters may be interested in (video--production.com is a current example), then send a hit to a random Google Analytics Profile ID via the Measurement Protocol, using the recently purchased domain as the referrer.  The SPAMmer then waits for curious webmasters to review their GA referrer report and visit the offending domain.

These domains are typically pretty easy to spot, as their traffic will:

  • Have a 100% ‘new user’ rate
  • Have either 100% or 0% bounce rates
  • Only land on the home page
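
As a rough illustration, the three signals above could be checked in a script against exported referrer data. The sketch below is only an assumption about how such a check might look: the ReferrerRow fields and the looks_like_spam helper are hypothetical names, not part of any Google Analytics API, and genuine referrers with very little traffic can occasionally match the same pattern.

```python
from dataclasses import dataclass


@dataclass
class ReferrerRow:
    """One row of a referrer report (hypothetical export format)."""
    source: str
    sessions: int
    new_sessions: int
    bounces: int
    landing_pages: frozenset


def looks_like_spam(row, home_page="/"):
    """Return True when a referrer shows all three warning signs."""
    if row.sessions == 0:
        return False
    all_new = row.new_sessions == row.sessions          # 100% new users
    extreme_bounce = row.bounces in (0, row.sessions)   # 0% or 100% bounce rate
    home_only = row.landing_pages == {home_page}        # only lands on the home page
    return all_new and extreme_bounce and home_only


suspect = ReferrerRow("video--production.com", 40, 40, 40, frozenset({"/"}))
genuine = ReferrerRow("news.example.org", 40, 25, 18, frozenset({"/", "/blog/"}))
print(looks_like_spam(suspect), looks_like_spam(genuine))
```

A check like this is best treated as a way to shortlist referrers for manual review rather than as an automatic block list.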

Others have suggested filtering on signals such as location or user agent; however, the SPAMmers' approach has evolved and those filtering mechanisms are no longer effective.

To send the data to your Google Analytics profile, the SPAMmer needs to know very little about your site (in fact, they don’t really need to know anything about it at all).  The majority of those using this method at present have only the following details:

  • Your Domain Name (www.your-domain.com)
  • Your GA Profile ID (UA-XXXXX-Y)

Then, using Google’s Measurement Protocol, a hit can be structured with minimal data to achieve the desired outcome.  This might look as follows:

https://www.google-analytics.com/collect?v=1&t=pageview&tid=UA-XXXXX-Y&cid=xxxxx.yyyyy&dr=http%3A%2F%2Fvideo--production.com&dh=www.yourdomain.com&dp=%2F

You can play with this format of hit at the Google Analytics Hit Builder.
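
To make the parameters easier to read, here is a minimal Python sketch that splits a hit of this shape into its fields. The describe_hit helper is a hypothetical name used for illustration; the hit is only parsed, never sent.

```python
from urllib.parse import parse_qs, urlsplit

# A hit of the same shape as the example, with the referrer percent-encoded once.
HIT = (
    "https://www.google-analytics.com/collect"
    "?v=1&t=pageview&tid=UA-XXXXX-Y&cid=xxxxx.yyyyy"
    "&dr=http%3A%2F%2Fvideo--production.com"
    "&dh=www.yourdomain.com&dp=%2F"
)


def describe_hit(url):
    """Decode the query string of a Measurement Protocol hit into a dict."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


fields = describe_hit(HIT)
# v   = protocol version        t  = hit type
# tid = tracking (profile) ID   cid = client ID
# dr  = document referrer       dh = document hostname
# dp  = document path
for name, value in fields.items():
    print(f"{name:>3} = {value}")
```

The only fields that tie the hit to your site are tid and dh, and both are supplied by whoever sends the hit, which is why so little knowledge of the target is needed.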

The good news is that Google is well aware of this approach; the bad news is that it is having some trouble effectively blocking all of the data.  There are significant issues with reliably identifying which data should be included in, and which excluded from, processing.  Google can likely identify the domains that most commonly undertake this type of behaviour, along with the IP addresses and subnets most commonly at fault, but blocking those alone would not stop anyone using a botnet, or hits being sent via JavaScript from the browsers of unknowing victims.

Historically, there were a small number of known participants in this approach, which made them easy to block.  More recently, however, the approach has matured through the use of botnets and a much larger pool of domains.  As a consequence, blocking specific domains as referrers is really no longer a valid approach.

However, as webmasters and Analytics practitioners, we can do a couple of things to help protect ourselves, and our data:

1. Block Known Bots and Spiders

Head over to your GA Admin panel and select your view.  Towards the bottom of the page, you will see a checkbox that instructs GA to exclude known bot and spider traffic:

[Screenshot: enabling bot and spider filtering in the Analytics view settings]

2. Create a filter to ‘include’ only your domain(s)

The majority of Google Analytics profiles only need to capture data from a single domain (or set of subdomains), so filtering your analytics data to only include your root domain name is a quick way to remove some of the SPAM.  To set this up, visit your Google Analytics admin console and create a new filter for your view (of course, update the domain name to reflect your own).  In this example, I will be including data for RichMcPharlin.com, www.RichMcPharlin.com, subdomain.RichMcPharlin.com, etc.

[Screenshot: include filter restricting the view to a single domain]
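
GA filter patterns are regular expressions, so the matching behaviour of an include filter can be tried out locally before it is saved. The sketch below is an approximation only: it assumes the filter is applied to the hostname field and behaves like a partial match (Python's re.search), and the pattern itself is my own illustration rather than a copy of the filter in the screenshot.

```python
import re

# Illustrative pattern: the root domain, optionally preceded by any subdomain.
INCLUDE_PATTERN = re.compile(r"(^|\.)richmcpharlin\.com$", re.IGNORECASE)


def keep_hit(hostname):
    """Return True if a hit with this hostname would survive the include filter."""
    return bool(INCLUDE_PATTERN.search(hostname))


for host in ["RichMcPharlin.com", "www.RichMcPharlin.com",
             "subdomain.RichMcPharlin.com", "video--production.com",
             "(not set)"]:
    print(host, "->", "keep" if keep_hit(host) else "drop")
```

Note that a hit which sets its hostname to your own domain would still pass this filter, which is why the approach removes only some of the SPAM.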

If you require a more advanced filter to cover a group of domains you’re tracking in aggregate (e.g. mydomain1.com and mydomain2.com), you can use an advanced filter that looks like the following:

[Screenshot: advanced include filter covering multiple domains]
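
When several domains are involved, the pattern can be generated rather than typed by hand, which avoids forgetting to escape a dot. This is a minimal sketch under the same assumptions as before (hostname field, partial matching); build_hostname_pattern is a hypothetical helper, and the resulting pattern is an illustration rather than the exact filter shown in the screenshot.

```python
import re


def build_hostname_pattern(domains):
    """Build one regex matching each listed domain and any of its subdomains."""
    alternatives = "|".join(re.escape(domain.lower()) for domain in domains)
    return rf"(^|\.)({alternatives})$"


pattern = build_hostname_pattern(["mydomain1.com", "mydomain2.com"])
print(pattern)

for host in ["mydomain1.com", "shop.mydomain2.com", "mydomain3.com"]:
    matched = re.search(pattern, host, re.IGNORECASE) is not None
    print(host, "->", "keep" if matched else "drop")
```

The printed pattern can then be pasted into the filter's pattern field and checked against a handful of real hostnames before the filter is saved.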

These two solutions will help clean a proportion of the SPAM domains out of our referrer reports; however, an exhaustive approach is currently impossible for us.  That problem is best solved by Google, who has access to all of our data and can much more accurately detect patterns across the tens of millions of sites currently using GA.

Published in Technical Google Analytics

One Comment

  1. Anthony Congdon

    Hi Rich, and thanks for this really handy article. Until recently I was almost at the point of filtering all referral traffic from my reports due to the overwhelming quantity of spam. That was until about two weeks back, when I saw a 100% removal of referral spam across a significant percentage of the GA accounts I manage. I was pretty delighted with this, but more than a tad confused as to 1. why it wasn’t all accounts and 2. how it had been applied to the data set retrospectively.

    Keen to know if anyone else has seen anything similar.
