I bet that if you’ve gone into Google Analytics, either on your own website or a client’s, you’ve seen a significant amount of referral traffic from strange sites like semalt.com, buttons-for-websites.com, and others. To get there, go to your GA account, then go to Acquisition -> All Traffic -> Referrals. There you’ll see all the clicks that you get from other sites.
Some of this is legitimate click traffic from social media sites like LinkedIn, Facebook, and so on. You’ll probably see others too, like Chambers of Commerce or other organizations where you might get clicks. That’s all good.
But why are you getting clicks from buttons-for-websites.com? These are spam phantom “clicks”, and they’re completely bogus. It’s annoying to see a bunch of traffic from these sites. The reason for their existence is that they want you to go look at THEIR website and do whatever they want you to do. (Don’t do it, by the way. Don’t give them the satisfaction!) Unfortunately, more of these spam sites are starting to pop up and multiply, so they can add up to a significant chunk of your traffic.
Assuming you want to remove this phantom traffic from your Google Analytics, there are a couple of ways to do it.
Using the HTACCESS File To Send Spam Referral Traffic Into A Black Hole
If you’re running WordPress, Joomla, or Drupal, which are all PHP websites that typically run on Linux with the Apache web server, you can edit a special file that lives in the top-level directory of your website. This file contains instructions that the web server reads before it serves up any page to anyone.
Connect to your website with your FTP tool and look for a file called .htaccess in the main (parent) directory of the website. Download a copy of the file to your computer and rename it to .htaccess-works (so you have a backup copy). Download the one from the website again, and edit the second one. If you goof up the .htaccess file, you can crash your website, so ALWAYS keep a backup copy of one that you know works.
Now open the .htaccess file with a text editor like Notepad (PCs only) or Dreamweaver (don’t use MS Word, because it isn’t a text editor and it’ll mess up your file).
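Copy and paste the following lines into the file, near the top and above any existing rewrite rules (such as the WordPress block), so they run before anything else: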
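# Turn on the rewrite engine (harmless if it is already on)
RewriteEngine On
# Match the Referer header against each spam domain, ignoring case
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC,OR]
RewriteCond %{HTTP_REFERER} seoanalyses\.com [NC]
# Refuse the matching request with a 403 Forbidden
RewriteRule .* - [F]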
For each site you want to add, just insert a new line like this one and change the web address to whatever you want (example.com here is only a placeholder):
RewriteCond %{HTTP_REFERER} example\.com [NC,OR]
but put it above the last [NC] line.
Save the file, and upload it to your website. Make sure you test your website, and if you get an Error 500, don’t panic. Just upload the WORKING copy of the .htaccess file (the original) and rename it back. That’s why you keep a backup.
Assuming it works, these magic lines in your .htaccess file are basically catching any request to load your web site that names one of those domains as the “REFERRER”, and if it catches one, then it sends them to a black hole (the server answers with a 403 Forbidden error instead of your page). *snicker*
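To make the matching logic a little more concrete, here is a rough JavaScript sketch of what those lines ask the server to do. It is an illustration only, not how Apache is implemented: the function names are made up for this example, and the domain list simply mirrors the one in the .htaccess snippet.

```javascript
// Hypothetical sketch of the RewriteCond / RewriteRule logic.
// The domain list mirrors the .htaccess example; the names are illustrative.
const blockedReferrers = ["semalt.com", "buttons-for-website.com", "seoanalyses.com"];

// Escape regex metacharacters so "." matches a literal dot.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns the HTTP status the server would send for a given Referer header.
function statusForReferrer(refererHeader) {
  const referer = refererHeader || "";
  const isSpam = blockedReferrers.some((domain) =>
    // The "i" flag plays the role of [NC] (no case).
    new RegExp(escapeRegExp(domain), "i").test(referer)
  );
  // [F] answers with 403 Forbidden; everything else is served normally.
  return isSpam ? 403 : 200;
}

console.log(statusForReferrer("http://SEMALT.com/crawler.php")); // 403
console.log(statusForReferrer("https://www.linkedin.com/feed/")); // 200
console.log(statusForReferrer(undefined)); // 200 (no Referer header sent)
```

Note that this only helps when a request actually reaches your server with one of those domains in its Referer header.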
Exclude The Domains from Google Analytics
If you’re not feeling too confident about that, or you’re not running a website on WordPress, you can just tell Google Analytics to exclude the traffic completely from your reports. You must be running the Universal Google Analytics code for this to work. To find out if you are, open your home page and view the html source code (in Chrome and Firefox press [Ctrl][U], in IE right-click on the web page and choose “View Source”).
Press [Ctrl][F] to bring up a search box, and search for Google. Look for code that looks like this:
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
var pageTracker = _gat._getTracker("UA-xxxxxx-xx");
pageTracker._trackPageview();
} catch(err) {}</script>
If you see the google-analytics.com/ga.js line, that’s the OLD code, and you’ll have to get your web person to upgrade you to the new Universal GA code. Some WordPress Analytics plug-ins have a checkbox that turns on the Universal code, so you can look for that. The Universal code looks like this:
<script type="text/javascript">
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-xxxxxx-xx', 'auto');
ga('send', 'pageview');
</script>
If you have that (look for the /analytics.js line), then eliminating the spam referral clicks is easy.
Go into your Google Analytics, and click the “Admin” tab. In the middle column under “Property”, click the “.js Tracking Info” button, then “Referral Exclusion List”. For each site you want to remove, just add them in there.
Boom! Done!
Update August 17, 2015
OK, I’ve heard from a couple people now that some of this information is incorrect. I appreciate the feedback on that.
It turns out that the Google Analytics Referral Exclusion List is NOT a good way to go. It’s actually intended for different traffic, specifically for people who have a shopping cart (say in 1shoppingcart or PayPal), where clicks back from those sites (like after a purchase has been completed) count as referral clicks. The Referral Exclusion List makes those come back as just direct clicks, but does NOT remove them from the Analytics data. So your data will still be messed up.
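Here is a simplified sketch of that behavior, assuming a made-up list of hits and a hypothetical applyReferralExclusions function. It is not how Google Analytics is implemented internally; it only illustrates that an excluded referrer gets relabelled rather than removed.

```javascript
// Hypothetical hits; "source" is the referring domain that would be recorded.
const sampleHits = [
  { source: "linkedin.com" },
  { source: "semalt.com" },
  { source: "semalt.com" },
  { source: "paypal.com" },
];

// Sketch of the Referral Exclusion List: excluded sources are relabelled
// as "(direct)", but the hit itself is still counted.
function applyReferralExclusions(allHits, excludedDomains) {
  return allHits.map((hit) =>
    excludedDomains.includes(hit.source) ? { ...hit, source: "(direct)" } : hit
  );
}

const reported = applyReferralExclusions(sampleHits, ["semalt.com"]);
const directCount = reported.filter((hit) => hit.source === "(direct)").length;

console.log(reported.length); // 4: nothing was removed
console.log(directCount);     // 2: the spam now shows up as direct traffic
```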
Read Why You Should Not Use the Referral Exclusion List for Spam.
Secondly, it seems that a lot of the spam traffic is actually phantom traffic. They don’t actually visit your website! They’re firing off the Google Analytics JavaScript code with random GA accounts remotely, which is causing it to be recorded in your GA account. So the .htaccess method works up to a point, but doesn’t stop the phantom visits.
It seems that the only effective way to remove the spam visits is to add filters to your Google Analytics account to filter out all the spam domains. It’s a little tricky to do this, because you can erase all your data if you screw it up. You want to tread lightly. Here’s a detailed, step-by-step article that takes you through the process of getting this set up: Definitive Guide to Removing Referral Spam. Not for the faint of heart. It would be nice if Google had a simpler solution in the GA tool itself.
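The filter itself usually boils down to one regular expression that lists the spam domains. As a rough sketch, assuming the same three domains used earlier in this post and two helper names invented for this example, you could build and sanity-check such a pattern like this before pasting it anywhere:

```javascript
// Domains to exclude; extend this list as new spam referrers show up.
const spamDomains = ["semalt.com", "buttons-for-website.com", "seoanalyses.com"];

// Escape regex metacharacters so "." only matches a literal dot.
function escapeForFilter(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Join the escaped domains with "|" so that any one of them matches.
function buildFilterPattern(domains) {
  return domains.map((domain) => escapeForFilter(domain)).join("|");
}

// Local check of the pattern (case-insensitive here for convenience).
function isSpamSource(source, pattern) {
  return new RegExp(pattern, "i").test(source);
}

const filterPattern = buildFilterPattern(spamDomains);
console.log(filterPattern);
// semalt\.com|buttons-for-website\.com|seoanalyses\.com
console.log(isSpamSource("semalt.com", filterPattern));   // true
console.log(isSpamSource("linkedin.com", filterPattern)); // false
```

The resulting pattern is the kind of thing that goes into a custom Exclude filter on the Campaign Source field. Try it on a separate test view first, and keep an unfiltered view so you can always get back to the raw data.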
Also see the comment below from Werner Bastianen who has a tool that may help simplify setting up the filters as well. We haven’t tested it, so please contact him directly if you have questions.
Great info Tom! Referral spam is a pain in my backside at the moment, so perfect timing. Do you recommend one of these options over the other?
Hi Sue, either one works. The .htaccess route is a bit more work to implement. So if you’re already running the Universal version of Google Analytics, it’s certainly easier to just drop it in the Property settings there while you’re already looking at the GA account. Either way, you have to keep up with all the new ones that keep popping up.
Hi Tom! Very helpful and indeed very precise piece of information. I work as a PPC manager and I can understand the pain of such referral traffic. It gets really important to avoid it somehow but some people tend to ignore it, voluntarily or involuntarily. But with such good strategies, it can be done without any hassle. Great work. Would appreciate if you can post some more work on this topic. Cheers.
Hi Ovais, Please reference the update to the blog post above with some additional information. I hope that’s helpful.
Tom
Hi Sue and Thomas,
Blocking through .htaccess has become an outdated solution too, since ghost traffic isn’t visiting your website but randomly hitting your GA. I’ve even heard from people who are seeing ghost traffic on GA views of websites that weren’t even live yet (only the GA was created in advance).
A neat solution is http://www.referrerspamblocker.com that automates the creation of filters for all known spammers and for all your accounts and views at once. This works in any type of GA and we also provide segments to filter spam from your historical data that’s already skewed.
Please let me know what you think!
Cheers, Werner
Hi Werner, Thanks very much for the information on your GA filter tool. We’ll check it out.
Tom
Actually there is a simpler solution for this: filter by hostname.
Spammers are inherently lazy. To match your ID to the proper hostname they would have to crawl your site instead of just generating random tags. Not that they may never do this, but for now it seems like the most straightforward tactic.
https://moz.com/blog/stop-ghost-spam-in-google-analytics-with-one-filter
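A rough sketch of the hostname idea, assuming example.com stands in for your own domain and the hits below are invented for illustration: keep only the hits whose hostname is one you actually serve, and drop everything else.

```javascript
// Hypothetical hostnames that legitimately serve your tracking code.
const validHostnames = ["example.com", "www.example.com"];

// Hypothetical hits: ghost spam tends to carry a fake or unset hostname.
const trackedHits = [
  { hostname: "www.example.com", source: "google" },
  { hostname: "(not set)", source: "semalt.com" },
  { hostname: "some-other-site.test", source: "buttons-for-website.com" },
];

// Keep a hit only if its hostname is on the allowed list (case-insensitive).
function keepValidHostnames(allHits, allowed) {
  const allowedSet = new Set(allowed.map((name) => name.toLowerCase()));
  return allHits.filter((hit) => allowedSet.has(hit.hostname.toLowerCase()));
}

const cleanHits = keepValidHostnames(trackedHits, validHostnames);
console.log(cleanHits.length); // 1: only the hit from your own hostname survives
```

The appeal of this approach is that the list of valid hostnames rarely changes, while a list of spam domains has to be updated every time a new one appears.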
Thank you Jan for the reference. I’m learning that this is much more complicated and obnoxious than I originally thought.
Tom, in the view settings of Analytics there’s been a tick box for “Exclude all hits from known bots and spiders” for about a year. I only noticed it recently and it seemed to be deselected by default. Now it appears as though it is showing as selected by default. It seems to filter a lot of the bots but not all of them. Have you done any testing of this method?
Hi Ross, This is not intended to filter those sites from GA. The checkbox is to remove known bots and spiders from your GA data, which may actually skew the traffic data pretty significantly. A friend of mine, Feras Alhlou, who is an expert in GA wrote this article: Bots and Spiders Causing Unusual Spikes in Google Analytics Traffic that explains it all. He also recommends that you check the checkbox, but don’t do it in your default (raw data) view. Create a filtered view so you can always get back to all the data if you need it.
Hey
Your article is simply amazing.
Thank you for sharing such information with us.
Sir, I’m not able to find .htaccess on my website!! Where can I find this file on screen!! I’m confused
Hi Siddhant, you may not have an .htaccess file. If your website is running on a Windows IIS server, you probably won’t. Usually the .htaccess file is only used on Linux/UNIX servers.