An Actionable Guide to Stopping Referral Spam in Google Analytics

by | Sep 13, 2021 | Analytics

Ghost referral spam is one of the main things that screws up Google Analytics data. Over the past decade, plenty of spammy sites have made their way into referrer lists, including darodar, semalt, www.event-tracking.com, floating share buttons, and the like. Few things are more annoying.

Referral spam has become one of the most prominent causes of concern for webmasters. Though there seems to be no foolproof solution to the problem, experts offer some great ways to keep it in check. A top SEO India Company will help you deal with the problems that referral spam causes in your Google Analytics data.

It is essential to stop spammers from slipping through and giving you a nightmare. Make sure they cannot get through the net, so that your stats stay accurate and clean.

What is Google Analytics Spam?

Though spam seems to get worse day by day, it would be far worse if Google were doing nothing about it. Every time the Google team puts a measure in place, though, the spammers outsmart it and find another way to carry on.

Google Analytics is a free and highly reliable service, and it is very widely used. That is exactly what makes it so attractive to spammers. Google has to make every change to its core behavior carefully so that it does not disrupt other functionality, which means thinking through the consequences for all users.

Referrer Spam: 

Referrer spam consists of fake URLs sent to Google Analytics. The aim is to draw people to the URL, where the spammers promote their services and products. In the worst cases, the page carries code that installs malware.

Earlier, spammers used only referrals to send fake data, but they have since found their way into the reports for organic keywords, events, pages, and language.

Some Common Types of Spam

Crawler Spam

Crawler spam was the original spam, using bots to leave fake referrals.

It consists of spiders designed to navigate through sites and leave fake referrals in the site's logs and Analytics. An analyst who notices the referral in Google Analytics is naturally tempted to look up the spammer's site.

Examples include semalt.com and its variations, and uptime-alpha.net.

Crawler spam ignores rules such as those in robots.txt.

Crawler spam also costs the spammer more resources than ghost spam, so it is less frequent than ghost spam.

Though you can use server-side options like .htaccess, web.config, or WordPress plugins, the best solution is to use filters in GA. SEO experts working at SEO Services India recommend that you also use segments.

Bot Direct Traffic

Though this is not spam in the conventional sense, bots and spiders can be good or bad. Because they have distinct characteristics, you will have to deal with them in different ways.

Ghost Spam

Ghost spam never accesses your site. It consists of fake data sent straight to the Google Analytics servers, and all it needs is a tracking ID that exists. A few things you need to know about ghost spam:

  • The only way to stop ghost spam is by applying filters in Google Analytics.
  • You cannot use server-side solutions like .htaccess, web.config, and others.
  • You can find ghost spam in all reports, including referral, direct, organic, language, and event.
  • Spammers mostly use ghost spam in place of the other kinds of spam.

The Action of Ghost Spam

Though ghost spam does not hit your site, it hits your analytics. How?

It abuses the Measurement Protocol, which lets developers send data directly to the Google Analytics servers so they can measure how users interact with a business in any environment.
Spammers generate random tracking IDs that follow the GA pattern (UA-XXXXXX-Y) and use an automated script to send fake data to thousands of properties.
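To make this concrete, here is a minimal sketch that only parses a hit payload; it never sends anything. The payload string and the `.example` domains are made up for illustration. The parameter names (`tid` for the tracking ID, `dh` for the hostname, `dr` for the referrer) are the ones used by the Universal Analytics Measurement Protocol. The point is that every one of these values is chosen by whoever builds the hit.

```python
from urllib.parse import parse_qs

# Hypothetical hit payload: every value below is chosen by the sender.
payload = (
    "v=1&t=pageview&tid=UA-123456-1&cid=555"
    "&dh=spam-host.example&dp=%2F"
    "&dr=http%3A%2F%2Fspam-referrer.example%2F"
)


def describe_hit(raw: str) -> dict:
    """Pull out the fields of a hit that matter for spotting ghost spam."""
    fields = {key: values[0] for key, values in parse_qs(raw).items()}
    return {
        "property": fields.get("tid"),  # tracking ID the hit is addressed to
        "hostname": fields.get("dh"),   # hostname claimed by the sender
        "referrer": fields.get("dr"),   # referrer claimed by the sender
    }


hit = describe_hit(payload)
print(hit)
```

Because the hostname is just another field in the payload, a spammer who guesses your tracking ID at random usually does not know your real hostname, which is why the hostname is such a useful signal later in this guide.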

Do You Think Ghost Spam can Never Access Your Site?

Most ghost spam, the most common and unwanted type, leaves fake hostnames in your reports. If you include only valid hostnames, ghost spam is left out automatically. The hostname is the place where a visitor arrives. It can also be a third-party service where you have added your GA code. Overall, every visit in GA has a source and a hostname.

Hostname: This is the most critical part of Google Analytics. The hostname is any place where your visits are recorded. It will generally be the domain where you added your GA tracking code. So, every visit in your Google Analytics will have a hostname and a source. Top SEO Agency will help you to understand the intricacies of the project.
Source: This is where the visit originates, for example, organic, referral, direct, or social.

Hostname

It is the destination where the visitor arrives, and it will generally be your domain.

For example, when a visitor follows a link on Facebook to www.ohow.co, the visit will be recorded in Google Analytics like this:
Hostname: www.ohow.co
Source (referral): facebook.com

Using the Campaign Source in Place of Referral

You may think it perfectly reasonable to use the Referral field when you build an exclude filter. However, it is best to use Campaign Source instead. The primary reason is that this is the field the Analytics documentation uses when it describes filtering referrals.

Generally, a valid visit arrives with a valid HTTP referrer. Some spam works differently: spammers use UTM parameters to set the source and medium, so the hit appears as a referral even though it carries no referrer information in the HTTP header. That is why a filter on the Referral field does not catch it.
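The difference can be sketched in a few lines. This is a simplified model, not Google Analytics' actual attribution logic: the function name `campaign_source` and the example URLs are assumptions made for illustration. It shows how a `utm_source` parameter in the landing URL can set the source even when there is no HTTP referrer at all.

```python
from urllib.parse import parse_qs, urlsplit


def campaign_source(landing_url: str, http_referrer: str = "") -> str:
    """Roughly approximate which source a visit would be credited to."""
    query = parse_qs(urlsplit(landing_url).query)
    if "utm_source" in query:
        # A UTM parameter wins, whatever the HTTP referrer says.
        return query["utm_source"][0]
    if http_referrer:
        return urlsplit(http_referrer).hostname or "(direct)"
    return "(direct)"


# A real click from Facebook carries an HTTP referrer.
print(campaign_source("https://example.com/", "https://facebook.com/post"))

# A spam hit can claim any source through UTM parameters alone.
print(campaign_source("https://example.com/?utm_source=spam.example&utm_medium=referral"))
```

A filter on Campaign Source sees the value in both cases, which is the practical reason to prefer it over the Referral field.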

Ghost and Referral Traffic

Just like the rest of the internet, spam is continuously evolving. It is no longer only a search engine or inbox problem; it has found its way into Analytics too. Spammers pick up on flaws in the system and exploit them to show up in your reports.

Their hope is that you will wonder about the entry and visit their website just out of curiosity. Meanwhile, the fake hits wreak havoc with your data. The fact remains that these bots never visit your site. They only need to trick the JavaScript tracking that Google Analytics uses to tell you when you get a visitor.

You need to take care of your analytics numbers, including the key stats for engagement.

If you want to base significant content management investments on these metrics, you need them to be highly accurate.

That’s why ghost and referral spam has become a big concern. This kind of data can be a great issue for:

  • Small businesses and solopreneurs
  • Marketing agencies of all sizes
  • Medium-sized businesses with no dedicated marketers

The number of hits from spam is increasing day by day, even as the various sources are listed and eliminated. Many times, referral spammers will try to disguise themselves as Google.

How Does Referral and Ghost Spam Affect SERP?

There is no direct effect of ghost spam on your search engine optimization. However, as your analytics data gets corrupted, it affects your decision-making: you won’t be able to tell which factors are actually responsible for a positive impact on your SEO. Moreover, the fake hits distort averages such as bounce rate and session time.

New sites generally struggle to get legitimate traffic. Moreover, the higher the percentage of spam, the more the data is skewed, especially if your site gets more than a thousand hits per day.
You can make the right decisions about your website only if you have clean data to base them on, so the first step is to clean up this mess. Note that Google Analytics data does not make sense as a ranking factor, for two critical reasons.

First, Google Analytics is widely used, but not every website uses it, so it cannot act as a suitable benchmark.
Secondly, the data is easy to manipulate in various ways. For example, people can push the bounce rate to nearly 0 by inserting the tracking code multiple times.

Does the Spam Analytics Data Represent a Security Issue?

Ghost spam can leave weird pages in Google Analytics, giving the impression that the website has been hacked. If the page does not open on your site, you know it is a fake hit. However, if the page does open, chances are that the site might have been hacked.

Spammers target analytics properties by generating UA-XXXXXX-Y codes at random. Ending up on their list pollutes your reports, but on its own it does not mean your site has been hacked.

What is the Purpose of Spam?

Spammers use this blackhat technique to drive a large volume of traffic to their sites. The purpose of spam hits is to lure people into visiting the fake referrals. Ultimately, they may want to promote a page, sell you a service, get your email address, or, in the worst cases, infect you with malware.

Detecting Spammy Traffic

It is important to do a little research before concluding that a referral is spam. Make sure that you do not type the URL directly into the browser. Instead, search for it like this:
suspicioussite.com / referral

If you do not find any information this way, analyze the data the visit left behind. Ghost spam is the easier kind to identify because all of its data is fake, so it is essential to check the hostname of the referral.

Crawlers are comparatively harder to detect because they actually visit your site and leave real-looking data.

Once you have identified a spam domain, you can add it to the referral exclusion list to exclude it from your referrals.

Are Things Quite Easy?

A single referrer record in analytics normally corresponds to a single page load, in which someone loads your page along with its various assets: images, JavaScript libraries, CSS, and the tracking code. Ghost spammers avoid all of that and fire only the JavaScript tracking request, forging Google Analytics visits that never actually happened.

Moreover, such a hit takes just 0.001 seconds, so in the time a real page takes to load, the same script can hit 100 other sites and make its way into everyone's Google Analytics accounts.

Though you may contemplate moving to another web host, it is actually quite easy to take care of the intrusive links making their way into your Google Analytics account.

Long Story Short

Spammers have become highly active in recent times, and dealing with them requires diligent solutions. Moreover, they have excellent techniques for getting past the solutions that used to work.

Several techniques considered to be the right ones may not solve the problem. The .htaccess file does not work against advanced tactics: in most cases, ghost spam never touches your site, which takes away the usefulness of this method.
You may go for a referral inclusion list or a blocking list: a site can have a good setup and still have no blocking list.

Google Analytics provides some of the most important inputs to a website’s decision-making process. Its data influences your SEO decisions and helps determine the success or failure of your ad campaigns. With accurate data, you learn which approaches are working and which steps you should eliminate from your digital marketing efforts, such as ad campaigns and social media.

Using the powerful functionality of Google Analytics filters, you can keep junk traffic out. The most important thing is to do it in a way that does not risk losing your real data.

What Kind of Junk Data Do You Need to Filter?

The most common types of junk data you should keep out of your Google Analytics are bots, crawler spam, internal traffic, and ghost spam.

Where Will the Google Analytics Filters Work?

They will work with Joomla, Wix, Shopify, Weebly, Squarespace, and more, because they work independently of the CMS.

How Frequently Should You Check for New Threats?

Check for new spam and bots 3-5 times a week, and update your expressions whenever you detect a significant threat.

Important Precautionary Steps

When setting up Google Analytics, there are a few steps to think through carefully. Make sure that you have at least two views: one where you apply the filters and one that you leave unfiltered. If you want to be extra cautious, create a third test view for trying out filters before applying them.

This protects your data from misconfigurations.

Once the views are configured correctly, you can stop the spam traffic from entering your Analytics data and find out how your site is really performing.

Keep in mind that no single solution will stop all the junk in one go. Therefore, consistent work is crucial if you want accurate Analytics data.

Refrain from applying filters to the raw, unfiltered view. Moreover, if you aren’t sure about a filter, try it in the test view first.

Using Exclusion Lists and Filters

Such filters exclude and block incoming spam, but they do not work on referrers that were recorded in the past. Exclusion filters come very close to a real solution; however, the underlying problem is very difficult to track and fix. Several creators of such lists have invested in keeping them up to date, and it takes a lot of maintenance to keep a list effective. That is a lot to ask when there is no profit in maintaining it.

The Puzzle Piece

It is crucial to have a practical and reasonable solution that helps you recognize and do away with referral spam traffic. It needs to be updated very regularly, work retroactively on earlier data, and be sourced from a large database.

What are Segments to Block and Filter Spam?

Filters let you exclude data from, or include data in, the reporting data set. If you accidentally filter out something, that data is gone forever. Moreover, filters do not change past data.

However, this is how you stop certain traffic from coming in: if you add spam sites to the filters, you exclude the hits from those websites. You have to keep the filters updated, and many marketers are unable to recognize the spam sites, because spammers are highly crafty at creating spam visits that look like real ones. It is so tricky that spam hits can appear to come from highly reputable sites. A hit like that was counted in your GA without anyone visiting your site.

So, you have to fish out the spam hits. It is advisable to check for spam referrals on a regular schedule. This helps block any new bothersome sites from coming in and manipulating the data.

You can find out the spam domains using the following process.

Acquisition>All Traffic>Referrals

Sort the results by bounce rate by clicking the heading at the top of the Bounce Rate column.
Gather all the spam domains into a spreadsheet, especially those with a nonexistent session time and a 100% bounce rate. Then go to the Admin tab at the top of the screen and click All Filters.

Add a filter for the spam domains and choose Custom as the filter type. In the next dropdown, you get several options; choose Exclude. Then choose Campaign Source as the filter field.

For example, if you want to block the two sites Spam Me and Fake Views, you will enter spamme\.|fakeviews\.
As you do this time and again for different websites, you will get used to the process. There is more you can do beyond this, too, to curb spam from getting into your reports.
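If you keep your spam domains in a spreadsheet, a few lines of Python can turn them into a pattern of this shape. This is only a sketch: the helper name `build_exclude_pattern` is made up, and Python's `re` module is used here as a stand-in for checking the pattern, since it may differ in small ways from the regex engine Google Analytics uses.

```python
import re


def build_exclude_pattern(names):
    """Join spam domain names (without the TLD) into one filter pattern."""
    # re.escape guards any special characters; the trailing \. matches the
    # dot that follows the name in the real domain.
    return "|".join(re.escape(name) + r"\." for name in names)


pattern = build_exclude_pattern(["spamme", "fakeviews"])
print(pattern)  # spamme\.|fakeviews\.

# Quick sanity check against a few sources before pasting it into GA.
for source in ["spamme.com", "www.fakeviews.net", "example.com"]:
    print(source, bool(re.search(pattern, source)))
```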

Block the spam referrers: there are quite a few well-known spam sites, and each filter pattern has a character limit. That’s why you need so many filters.

Apply the filter in this way.

Go to Admin>All filters>+ ADD FILTER

Your screen should look familiar by now. Give your filter a descriptive name.
Segments, by contrast, are subsets of sessions or users that you can turn on or off. Because they are not destructive, you can also use them on past data.

Now use custom filters to exclude the spam domains.

One proactive approach here is to block the common spam referrers before they hit your GA.
Instead of relying only on filters to block spam traffic, it is better to use segments where you can, so that you do not end up permanently deleting data. Segments can also be applied retroactively to data you have already collected.

Using the Filters

First of all, you need to keep the filters updated, because new spam sites show up all the time.
Compile the spam domains, especially those with a 100% bounce rate and a non-existent session time. Then go to the Admin tab at the top of the screen.
Once you add the spam sites to a filter there, you will no longer find them in your reports. It is necessary to keep these filters updated.

Log in to GA and go to Acquisition>All Traffic>Referrals in the sidebar on the left.

After you click Referrals, you will see a table of your traffic sources.

From this data, you want to separate the valid visits and weed out the rest. Most spam is easy to spot, and the bounce rate column makes it visible. Sort the table so that the visits with the highest bounce rate are listed first; most of those hits bear the mark of spam. When a source has a non-existent session duration, a 100% bounce rate, and 1 page per session, you need to eliminate it.
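If you export the referral table, the same three-part check can be applied in code. The sketch below is illustrative only: the `ReferralRow` class, its field names, and the sample rows are invented, and you would adapt them to the columns of your own export.

```python
from dataclasses import dataclass


@dataclass
class ReferralRow:
    source: str
    bounce_rate: float           # percent, 0 to 100
    pages_per_session: float
    avg_duration_seconds: float


def looks_like_spam(row: ReferralRow) -> bool:
    """Apply the three marks of spam: 100% bounce, 1 page, no duration."""
    return (
        row.bounce_rate >= 100.0
        and row.pages_per_session <= 1.0
        and row.avg_duration_seconds == 0.0
    )


# Hypothetical rows standing in for an exported Referrals report.
rows = [
    ReferralRow("facebook.com", 54.2, 2.3, 95.0),
    ReferralRow("spam.example", 100.0, 1.0, 0.0),
    ReferralRow("partner.example", 100.0, 1.0, 12.0),
]

suspects = [row.source for row in rows if looks_like_spam(row)]
print(suspects)
```

Treat the result as a list of candidates to research, not a verdict: a real visitor can also bounce after a single page.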

Building a Hostname Regex is Extremely Important

Once you have a list of all your hostnames, you can create a regular expression that covers all of them. You must add every relevant hostname; otherwise, you may lose valid data. For example, a regex covering several domains looks like this:
tomrobakphotography\.com|cdn\.tomrobakphotography\.com|www\.tomrobakphotography\.com|sample\-domain\-tomrobak\.com
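An expression like this can also be generated from a plain list of hostnames. The following is a sketch rather than a definitive tool: the helper names are made up, and Python's `re` module is assumed to behave closely enough to the Google Analytics regex engine for simple patterns like these.

```python
import re


def build_hostname_pattern(hostnames):
    """Escape each hostname and join them with a pipe."""
    return "|".join(re.escape(host) for host in hostnames)


valid_hosts = [
    "tomrobakphotography.com",
    "cdn.tomrobakphotography.com",
    "www.tomrobakphotography.com",
    "sample-domain-tomrobak.com",
]
hostname_pattern = build_hostname_pattern(valid_hosts)


def is_valid_hostname(hostname: str) -> bool:
    # A partial match is used, so subdomains of a listed host also pass.
    return re.search(hostname_pattern, hostname) is not None


for host in ["www.tomrobakphotography.com", "(not set)", "spam-host.example"]:
    print(host, is_valid_hostname(host))
```

Testing the pattern against a few known-good and known-bad hostnames before saving the filter is a cheap way to avoid excluding real traffic.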

Few General Rules

Use a bar or a pipe character to separate each domain name.

The dot is a special character and needs a backslash before it. The hyphen is only special inside square brackets, so escaping it, as in the example above, is optional but harmless.

Make sure that you do not leave any spaces.

The REGEX will have a limit of 255 characters.

Make sure that you do not add a bar or a pipe at the beginning, or the end of an expression.
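These rules are easy to check automatically before you paste a pattern into a filter. The sketch below assumes the 255-character limit stated above; `check_pattern` is a hypothetical helper, and compiling with Python's `re` module is only an approximation of how Google Analytics will read the expression.

```python
import re

MAX_PATTERN_LENGTH = 255  # limit stated in the rules above


def check_pattern(pattern: str) -> list:
    """Return a list of rule violations; an empty list means none found."""
    problems = []
    if len(pattern) > MAX_PATTERN_LENGTH:
        problems.append("longer than 255 characters")
    if " " in pattern:
        problems.append("contains a space")
    if pattern.startswith("|") or pattern.endswith("|"):
        # An empty alternative matches everything, so an exclude filter
        # with a stray pipe could throw away all of your traffic.
        problems.append("starts or ends with a pipe")
    try:
        re.compile(pattern)
    except re.error as error:
        problems.append(f"does not compile: {error}")
    return problems


print(check_pattern(r"example\.com|www\.example\.com"))
print(check_pattern("semalt|darodar|"))
```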

Once you are sure of the expression, create the filter.

Enter Valid Hostnames as the filter name.

Select Custom as the filter type.

Choose Include and select Hostname from the dropdown.

Paste your hostname expression into the Filter Pattern box.

Next, create filters that match the known crawler spam. This time choose Exclude and Campaign Source, and paste the following expressions into the Filter Pattern box, one per filter.

Expression #1

(best|dollar|success|top1)\-seo|(videos|buttons)\-for|anticrawler|^scripted\.|semalt|forum69|7makemon|sharebutton|ranksonic|sitevaluation|dailyrank|vitaly|profit\.xyz|rankings\-|dbutton|uptime(bot|check|\.com)

Expression #2

datract|hacĸer|ɢoogl|responsive\-test|dogsrun|tkpass|free\-video|keywords\-monitoring|pr\-cy\.ru|fix\-website|checkpagerank|seo\-2\-0\.|platezhka|timer4web|share\-buttons|99seo|3\-letter
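Before relying on expressions like these, it helps to run them against a handful of sources you already know. The sketch below copies the two expressions above and checks them with Python's `re` module, which is assumed here to treat these simple patterns the same way Google Analytics does. The sample sources are illustrative.

```python
import re

EXPRESSION_1 = (
    r"(best|dollar|success|top1)\-seo|(videos|buttons)\-for|anticrawler"
    r"|^scripted\.|semalt|forum69|7makemon|sharebutton|ranksonic"
    r"|sitevaluation|dailyrank|vitaly|profit\.xyz|rankings\-|dbutton"
    r"|uptime(bot|check|\.com)"
)
EXPRESSION_2 = (
    r"datract|hacĸer|ɢoogl|responsive\-test|dogsrun|tkpass|free\-video"
    r"|keywords\-monitoring|pr\-cy\.ru|fix\-website|checkpagerank"
    r"|seo\-2\-0\.|platezhka|timer4web|share\-buttons|99seo|3\-letter"
)

# In GA these live in two separate filters; here they are combined for testing.
crawler_spam = re.compile(EXPRESSION_1 + "|" + EXPRESSION_2)


def is_crawler_spam(source: str) -> bool:
    return crawler_spam.search(source) is not None


for source in ["semalt.com", "buttons-for-website.com", "ɢoogle.com",
               "google.com", "facebook.com"]:
    print(source, is_crawler_spam(source))
```

Note that ɢoogl in the second expression starts with a look-alike character rather than a normal g, so the genuine google.com is not matched.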

Expression #3

This one is used to counter language spam.

Follow the same steps, but instead of “Campaign Source” select Language Settings.
\s[^\s]*\s|.{15,}|\.|,
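The idea behind the language expression is that real language codes are short and contain no dots, commas, or multiple spaces, while spam messages do. The sketch below assumes the whitespace form of the pattern, `\s[^\s]*\s`, and uses Python's `re` module as a stand-in for the Google Analytics engine. The sample values are illustrative.

```python
import re

# Flags values with two or more spaces, 15+ characters, a dot, or a comma.
LANGUAGE_SPAM = re.compile(r"\s[^\s]*\s|.{15,}|\.|,")


def is_language_spam(language: str) -> bool:
    return LANGUAGE_SPAM.search(language) is not None


samples = [
    "en-us",
    "pt-br",
    "zh-hans-cn",
    "(not set)",
    "Visit spam-site.example now",
    "spam-site.example",
]
for value in samples:
    print(repr(value), is_language_spam(value))
```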

The next thing to do is to turn on the option that excludes hits from all known bots and spiders. Various crawlers are not spam, but they are not useful in your reports either.

Several crawlers may reach your site to index it, and many of these bots will leave a record in your Google Analytics reports if they are not excluded.

In the reporting section, click the All Users box at the top of the graph. Then click the red button to create a new segment.
In the segment window, near the bottom, click Conditions.

For the first condition, set Filter>Sessions>Include.

In dropdown 1, select Hostname. In dropdown 2, select matches regex.

In the text box, paste the hostname expression you used earlier for the filter. Then click Add Filter to add a new condition.

For the second condition, set:

Filter>Sessions>Exclude

Select Source in dropdown 1

Then select matches regex in dropdown 2

In the text box, paste the crawler spam expression

(best|dollar|success|top1)\-seo|(videos|buttons)\-for|anticrawler|^scripted\.|\-gratis|semalt|forum69|7make|sharebutton|ranksonic|sitevaluation|dailyrank|vitaly|profit\.xyz|rankings\-|dbutton|\-crew|uptime(bot|check|\.com)|datract|hacĸer|ɢoogl|responsive\-test|torrent\-to|magnet\-to|dogsrun|tkpass|free\-video|keywords\-monitoring|pr\-cy\.ru|fix\-website|checkpagerank|seo\-2\-0\.|platezhka|timer4web|share\-buttons|99seo|3\-letter

Then click the button to the left of what you just configured, enter a segment name, and save.
You will now get spam-free reports whenever the segment is selected. The filters will take care of new data from here on.
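The two conditions of the segment combine like this: a session is kept only if its hostname is valid and its source is not on the spam list. The sketch below models that logic on a few invented sessions. `example.com` stands in for your own hostname expression, and the source pattern is a shortened sample of the crawler spam expression, not the full list.

```python
import re
from typing import NamedTuple


class Session(NamedTuple):
    hostname: str
    source: str


VALID_HOSTNAME = re.compile(r"example\.com")                        # condition 1: include
SPAM_SOURCE = re.compile(r"semalt|sharebutton|ranksonic|dailyrank")  # condition 2: exclude


def in_segment(session: Session) -> bool:
    """Keep sessions with a valid hostname whose source is not spam."""
    has_valid_host = VALID_HOSTNAME.search(session.hostname) is not None
    has_spam_source = SPAM_SOURCE.search(session.source) is not None
    return has_valid_host and not has_spam_source


# Hypothetical sessions: one real visit, one crawler, two ghost hits.
sessions = [
    Session("www.example.com", "google"),
    Session("www.example.com", "semalt.semalt.com"),
    Session("(not set)", "free-traffic.example"),
    Session("spam-host.example", "google"),
]

clean = [s for s in sessions if in_segment(s)]
print(clean)
```

The hostname condition removes ghost spam without needing to know the spammer's name, while the source condition catches crawlers that really did load your pages.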

Cleaning the Existing Spam

Now is the time to create a custom segment in your present GA reports for removing spam.

Navigate to Acquisition>All Traffic>Channels.

Click on the Secondary dimension box and enter Source/Medium.

Click ‘Add Segment’ at the top of the report.

Now click Conditions in the sidebar of the segment pane.

In the first dropdown, click Behavior>Hostname.

In the next dropdown, select matches regex.

The dropdowns should now be set to Hostname and matches regex.

Now, copy and paste the string below.

dailyrank|100dollars-seo|anticrawler|sitevaluation|buttons-for-website|buttons-for-your-website|-musicas*-gratis|best-seo-offer|best-seo-solution|savetubevideo|ranksonic|offers\.bycontext|7makemoneyonline|kambasoft|medispainstitute

Hit the OR button, and copy and paste this string:

127\.0\.0\.1|justprofit\.xyz|nexus\.search-helper\.ru|rankings-analytics\.com|videos-for-your-business|adviceforum\.info|video--production|success-seo|sharemyfile\.ru|seo-platform|dbutton\.net|wordpress-crew\.net|rankscanner|doktoronline\.no|o00\.in

Each time you hit the OR button, set the dropdowns to Hostname and matches regex again.

Apply this segment to the time frame and view you want to clean.

What Are the Steps to Prevent Future Spam?

Now, take steps to prevent future spam by adding filters that block the referral spam.

GA lets you create filters that stop specific traffic sources from entering your reports. Once you add spam sites to your filters, you won’t see fake hits from those sites.

Make sure that you keep the filters updated, because new spam sites keep showing up. Checking a few times a week is important, and it means that you have to keep creating new filters and applying them.

The other thing you need to do is create a copy of your Analytics view, so that everything is backed up before you make any changes.

Navigate to the Admin tab and, in the right-hand column, click on ‘View Settings’.

Give the new view a unique name, so you can tell it apart from the original view.

If something is not right with the new view, you can switch back to the original.

Add specific filters depending on the spam sites which have given you fake hits.

First, find the spam domains using the following process.

Acquisition>All Traffic>Referrals

Sort the results by bounce rate by clicking on the box at the top of the Bounce Rate column.

Block the Spam Domains

Compile the various spam domains (those with 100% bounce rates and zero session time). Go to the Admin tab at the top of the screen.

Now, select All Filters in the column.

Add a filter and give it a name that marks it as a domain name filter. Next, click Custom.

In the drop-down menu, choose Exclude. Then click on the next drop-down and select Campaign Source. Copy the spam domains you need to block and paste them into the Filter Pattern box. However, they have to be entered in a specific format.

This is What it Will Look Like

Now, block the common spam referrers by pasting them in as a pattern of the form domain\.|domain\.

If you’ve had fake traffic from a few well-known spam sites, there is a good chance that various other sites are waiting to pounce as well.

You will need around four filters to cover them, because there is a character limit on the pattern for each series of blocked spam sites.
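Splitting a long list of spam fragments across several filters can be done mechanically. The sketch below is one simple way to do it, assuming the 255-character limit mentioned earlier in this guide. The function name `split_into_filters` and the sample fragments are made up for illustration.

```python
def split_into_filters(fragments, limit=255):
    """Pack regex fragments into as few pipe-joined patterns as fit the limit."""
    patterns = []
    current = ""
    for fragment in fragments:
        if len(fragment) > limit:
            raise ValueError(f"fragment too long for one filter: {fragment!r}")
        candidate = fragment if not current else current + "|" + fragment
        if len(candidate) <= limit:
            current = candidate
        else:
            # Close the current filter and start a new one with this fragment.
            patterns.append(current)
            current = fragment
    if current:
        patterns.append(current)
    return patterns


# Hypothetical fragments; in practice these come from your spam spreadsheet.
fragments = [rf"spam-site-{n}\.example" for n in range(40)]
filters = split_into_filters(fragments)
for number, pattern in enumerate(filters, start=1):
    print(f"Filter {number}: {len(pattern)} characters")
```

Because fragments are only ever joined with a pipe between them, no pattern produced this way starts or ends with a pipe.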

In the First Filter

Go to Admin>All Filters>+ ADD FILTER.

When the screen appears, give the filter a name.

Click on the Custom and Exclude options, and select Campaign Source from the Filter Field drop-down.

Now copy and paste the following expression into the Filter Pattern box

offer|free\-|share\-|mercedes|buy|cheap|googlsucks|benz|sl500|hulfington|buttons|darodar|pistonheads|motor|money|blackhat|backlink|webrank|seo|phd|crawler|anonymous|\d{3}.*forum|porn|webmaster|flipboard|fl\.ru|mbca|ahrefs|game|\.io|^sex|^video

Scroll down, add All Website Data to the views the filter applies to, then hit Save.

Follow the same process for the remaining filters, using one of the following expressions for each.

dailyrank|100dollars-seo|anticrawler|sitevaluation|buttons-for-website|buttons-for-your-website|-musicas*-gratis|best-seo-offer|best-seo-solution|savetubevideo|ranksonic|offers\.bycontext|7makemoneyonline|kambasoft|medispainstitute

127\.0\.0\.1|justprofit\.xyz|nexus\.search-helper\.ru|rankings-analytics\.com|videos-for-your-business|adviceforum\.info|video--production|success-seo|sharemyfile\.ru|seo-platform|dbutton\.net|wordpress-crew\.net|rankscanner|doktoronline\.no|o00\.in

top1-seo-service\.com|fast-wordpress-start\.com|rankings-analytics\.com|uptimebot\.net|^scripted\.com|uptimechecker\.com

Blocking Third Party Traffic

You can do another thing in your GA to lower the amount of spam traffic.

Choose the Include Mode in Place of the Exclude

Enter your domain name in the Filter Pattern box. In the section for applying the filter to views, choose All Website Data and click Save.

Going a step further, you can block traffic by country. If you are seeing many offenders from the same country, you can block the traffic from that country. But if you aren’t careful here, it can be problematic.

There is, though, no foolproof and complete solution that eliminates spam. Most of the time, spammers will find a way to get their thing done.

That’s why you must regularly clean your analytics and make sure that the good data keeps coming through.

Removing Referral Spam

There is a lot of talk about spam referral traffic, and a lot you need to do about it. If you work for an agency, you will be monitoring website traffic, and anyone who uses Google Analytics regularly will find a lot of spam in their GA reports.

Spam traffic is nothing new. However, the talk about it has recently picked up momentum as it becomes more of an annoyance for webmasters. There are various ways to remove it from your reports and get your analytics back to normal.

Here are Various Ways of Removing Referral Spam

There are several things to refrain from while combating referral spam, and only a few that seem to be effective solutions. You don’t need short-term fixes that won’t help you in the long term.

Referral Exclusion List

Spam traffic shows up in the Channels report as referral traffic. You can add the offending domains to your referral exclusion list.

Admin>Tracking Info>Referral Exclusion List

This cleans up the data and helps you make informed decisions.

And this will make things easy!

A single referrer record in analytics normally corresponds to a full page load: someone loads your page along with its dependencies, including the CSS, JavaScript libraries, images, and the tracking code.

Ghost spammers generally skip all of that. They just fire the single JavaScript tracking request at Google.

That tracking hit takes as little as 0.0001 seconds. Several of the techniques commonly used against it do not work well.

You need a reasonable, effective solution that identifies and takes out both ghost and referral spam. Moreover, it needs to be updated regularly, work retroactively on past data, and be sourced from an extensive database.

You can use segments for filtering and blocking spam, while filters let you exclude data from, or include data in, your reporting data set.

Limitations

There is no single solution for keeping your analytics data clean. But with consistent effort, you will find that your GA stays highly clean. The methods and tools you use can supplement one another, and together they cover the bases.

Even if you have gathered a large collection of spam data, it may be just the tip of the iceberg.

With a little ingenuity, you will get to know the various techniques that help suppress the spam you do not need. Here are a few pointers you must always follow.

Turn on the option to exclude bots and spiders in Google Analytics.

Plus, add an inclusive hostname filter and a cookie to your site to cover the bases.

This gives you a clean analytics profile.

With clean analytics, you will get a lot of time for promoting and creating a tool.

If you succeed with the inclusive hostname filter, it gives you a highly effective, long-term solution for maintaining clean data.

The Takeaway

Though there is no foolproof and complete way to maintain a clean analytics profile, investing a little time can help you eliminate 99% of the spam from your Google Analytics reports. This is an actionable and practical guide to removing referral spam successfully with a minimal investment of time and effort.

Navneet Singh

Navneet Singh is a young enthusiast who loves Internet marketing. A software engineer by chance, he works as CEO of SEO Experts Company India, one of the top-rated SEO agencies in India.