How to Block Spam Referrers like darodar.com from

Posted 2018-12-31 17:59

Question:

I have several websites that get around 5% of their daily visits from spam referrers. One strange thing I noticed about these referrers: they show up in Google Analytics, but I cannot see them in the custom table where I log every visitor to the site, so I think they only manipulate the GA code and never reach the site itself.

If you follow their link, they redirect you to an affiliate link.

I don't know whether they have an impact on my SEO/SERP, but I would like to get rid of them. Can I do that via the .htaccess file?

One peculiar aspect is that I get visitors from different forum-like pages, e.g. forum.topic221122.darodar.com, forum.topic125512.darodar.com, etc., so I would like to block the entire darodar.com domain.

Besides darodar.com, econom.co and iloveitaly.co are also polluting my stats. Can I block them all from .htaccess?

Answer 1:

This blog post suggests that the spam referrers manipulate Google Analytics and never actually visit your site, so blocking them is pointless. Google Analytics offers filtering if you want to mitigate fake site hits.



Answer 2:

Most of the spam in Google Analytics never accesses your site, so you can't block it with any server-side solution.

Ghost spam hits GA directly and usually shows up for only a few days and then disappears. That's why some people think they blocked it from the .htaccess file, when it is just coincidence.

This type of spam is easy to spot, since it uses either a fake hostname or one that is not set. (See image below)

The other type, crawlers like semalt, actually access your site and can be blocked from the .htaccess file; however, there are only a few of them.

So in summary, to stop spam in Google Analytics:

  • Crawlers: server-side solutions or filters in GA
  • Ghosts: ONLY filters in GA

The only efficient way to avoid being hit by ghost spam is to create an include filter with all your valid hostnames.

First you need to build a REGEX with all the valid hostnames, something like this (you can find them in the Network report):

yoursite\.com|shoppingcart\.com|translateservice\.net

These are some examples; you might have more or fewer hostnames. Once you have the REGEX, create the filter with these steps:

  • Go to the Admin tab in Google Analytics
  • Select Filters under the View column > New Filter
  • Filter type Custom > Include > Filter Field Hostname
  • Filter Pattern: copy the hostname expression you built
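Before saving the filter, it may help to check the hostname expression against a few sample hits. The following is only a minimal sketch: the three hostnames are the placeholders from above, the sample hits are invented for illustration, and it assumes the filter applies the pattern as a case-insensitive partial match.

```javascript
// Hypothetical check of an include-filter expression against sample hostnames.
// The three valid hostnames are the placeholders used in the answer above.
const validHostnames = /yoursite\.com|shoppingcart\.com|translateservice\.net/i;

// Invented sample hits: ghost spam typically carries a fake or unset hostname.
const sampleHits = [
  { hostname: "www.yoursite.com", referrer: "google.com" },
  { hostname: "shoppingcart.com", referrer: "yoursite.com" },
  { hostname: "(not set)", referrer: "forum.topic221122.darodar.com" },
  { hostname: "some-fake-host.example", referrer: "econom.co" },
];

// Keep only the hits whose hostname matches the expression (partial match).
function includeByHostname(hits, pattern) {
  return hits.filter((hit) => pattern.test(hit.hostname));
}

const kept = includeByHostname(sampleHits, validHostnames);
console.log(kept.map((hit) => hit.hostname));
```

In this sketch only the first two hits survive, because their hostnames match the expression; the referrer is never inspected, which is why the filter does not need a list of spam domains.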

For crawlers you will have to create a different filter, building an expression with all the spammers:

spammer1|spammer2|spammer3|spammer4|spammer5

  • Filter type Custom > Exclude > Filter Field Campaign Source
  • Filter Pattern: copy the referral expression
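The crawler expression can be generated from a list rather than typed by hand. This is a rough sketch, not an official tool: `escapeRegex`, `buildExcludePattern` and `isExcluded` are hypothetical helper names, and apart from semalt the domains are made-up placeholders.

```javascript
// Escape regex metacharacters so each domain is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Join a list of spam sources into one alternation, e.g. "a\.com|b\.com".
function buildExcludePattern(sources) {
  return sources.map(escapeRegex).join("|");
}

// Illustrative list: semalt is named in this thread, the rest are placeholders.
const crawlers = ["semalt.com", "spammer2.example", "spammer3.example"];
const excludePattern = buildExcludePattern(crawlers);
console.log(excludePattern);

// Simulate the exclude filter: a hit is dropped when its campaign source matches.
const excludeRegex = new RegExp(excludePattern, "i");
function isExcluded(campaignSource) {
  return excludeRegex.test(campaignSource);
}

console.log(isExcluded("semalt.semalt.com")); // matched as a substring
console.log(isExcluded("google"));            // not in the list
```

The printed pattern is the string you would paste into the Filter Pattern field.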

Every time you work with filters, it is important that you keep an unfiltered view.

If you need detailed steps for these solutions, you can check this complete guide about spam in Google Analytics.

Guide to stop and remove All the spam in Google Analytics

Hope it helps.

[Image: hostname report example]



Answer 3:

Yes, you can block them with .htaccess, and actually you should.

Your .htaccess file could look like this:

<IfModule mod_setenvif.c>
# Flag requests whose Referer matches a known spam domain
SetEnvIfNoCase Referer darodar\.com spambot=yes
SetEnvIfNoCase Referer 7makemoneyonline\.com spambot=yes
## add as many as you find

# Apache 2.2 access-control syntax (on Apache 2.4 this needs mod_access_compat)
Order allow,deny
Allow from all
Deny from env=spambot
</IfModule>

When traffic comes from these sites, it is blocked by this .htaccess, so the HTML is never loaded and therefore the GA script is never fired (for visits from these sites).

They try to collect traffic from you: once you see the incoming traffic in Google Analytics and try to find out what the source is, you go to that URL. It is harmless to your site, except that your statistics fill up with junk data.

Google Analytics should prevent this, the same way Gmail prevents spam email.



Answer 4:

According to this entry, they never visit your site; they fake HTTP requests to GA using your UA code. So it seems pointless to block them using .htaccess or any other method, because they never actually enter your site; they only send fake "visit" data to Google.



Answer 5:

We have found that using .htaccess is a good way to stop this spam. I have implemented the solution below on my clients' sites, and it is working really well so far. The best way is to stop them with a "contains" clause, e.g. for the spam domain priceg.com, check for priceg in the referrer URL.

Many of these sites create subdomains and hit you again, and when they tweak the URL, hard-coded conditions fail.

RewriteEngine On
RewriteCond %{HTTP_REFERER} (priceg) [NC,OR]
RewriteCond %{HTTP_REFERER} (darodar) [NC]
RewriteRule ^ - [F]

It is explained in detail here
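The difference between a hard-coded condition and a "contains" clause can be illustrated outside Apache. This is a sketch only: `isBlockedExact` and `isBlockedContains` are hypothetical names, and the referrer is one of the subdomains mentioned in the question.

```javascript
// Hard-coded check: only the exact hostnames listed here are caught.
const exactHosts = new Set(["darodar.com", "priceg.com"]);
function isBlockedExact(referrerUrl) {
  try {
    return exactHosts.has(new URL(referrerUrl).hostname);
  } catch {
    return false; // missing or malformed referrer
  }
}

// "Contains" check: catches any referrer that includes one of the keywords.
const keywords = /priceg|darodar/i;
function isBlockedContains(referrerUrl) {
  return keywords.test(referrerUrl);
}

const referrer = "http://forum.topic221122.darodar.com/";
console.log(isBlockedExact(referrer));    // the subdomain slips past the exact list
console.log(isBlockedContains(referrer)); // the keyword is still found
```

The trade-off is that a keyword match also blocks any legitimate referrer whose URL happens to contain the same string, so short or generic keywords are best avoided.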



Answer 6:

Apparently, this is done by a spammer communicating directly with Google Analytics using your website's account ID. They effectively tell Google Analytics they visited your page when in fact they never did. They identify themselves to Analytics by means of a URL which THEY WANT YOU TO VISIT. So you see their traffic in Google Analytics and go check them out. They will have an Amazon affiliate account hooked up, for example, and so they attempt to get a commission from your Amazon purchases.

So .htaccess did nothing for me when I was fighting this one; you need to create a filter that filters out things like (.*)\.darodar\.com

The really bad effect I have found is that it invalidates my website statistics.



Answer 7:

You can restrict access using .htaccess or by filtering ALL robot visits from being tracked by Google Analytics. If that doesn't work, set up Google Analytics filtering. More details on how to do that can be found here: http://www.wiyre.com/google-analytics-darodar-forum-spam-what-is-it/

They are Russian-based but route their spiders through China and the Philippines. Maybe it would be best to block the whole IP address at this point, since they have multiple subdomains.



Answer 8:

Blocking bots at the web server level makes no sense: spammers send fake requests to the Google Analytics web server. All they need to know is the website's domain name and the Google Analytics ID linked to it. So you have to mask your Google Analytics ID in the website code. For example, you can do this in the Google Analytics JS code:

ga('create', 'UA-X' + 'XXXXX' + 'XX-X', 'auto');

After this change, a spammer's bot would have to execute the JS code to parse your Google Analytics ID (and not many bots are able to do that).

https://nobodyonsecurity.com/security/fighting-google-analytics-referrer-spam
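To see why the split string helps, consider a scraper that only searches the raw page text. This is a hedged sketch: the tracking ID is a made-up placeholder, and `scrapeTrackingId` with its regex is an assumption about how a simple bot might look for IDs, not a description of any real spammer's tooling.

```javascript
// Hypothetical page sources; the tracking ID is a made-up placeholder.
const plainSource = "ga('create', 'UA-1234567-1', 'auto');";
const maskedSource = "ga('create', 'UA-1' + '23456' + '7-1', 'auto');";

// A naive scraper that searches raw text for something shaped like a UA ID.
function scrapeTrackingId(source) {
  const match = source.match(/UA-\d{4,10}-\d{1,4}/);
  return match ? match[0] : null;
}

console.log(scrapeTrackingId(plainSource));  // the ID is found in plain text
console.log(scrapeTrackingId(maskedSource)); // nothing matches the split string

// The browser still evaluates the concatenation to the full ID at run time.
const evaluated = 'UA-1' + '23456' + '7-1';
console.log(evaluated);
```

A bot that actually executes the page's JavaScript would still recover the ID, so this only raises the bar for the simplest scrapers.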



Answer 9:

.htaccess is not the best way. On my site I use GA: the Tracking Info option and then the Referral Exclusion List.

Regards!



Answer 10:

Lunametrics posted a nice article to solve this issue using Google Tag Manager: http://www.lunametrics.com/blog/2014/03/11/goodbye-to-exclude-filters-google-analytics/



Answer 11:

I think the most effective way to avoid ghost spam is to add a custom dimension that lets you know the site was indeed visited, because, as we know, they never visit the site.

ga('set', 'dimension1', "Hey I'm really here!!");
ga('send', 'pageview');

Simply add these lines to your pages and then add a filter to "include" hits only when the dimension has the expected value ("Hey I'm really here!!" in this case).
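The effect of that include filter can be sketched as follows. The hits are invented for illustration, and `includeRealVisits` is a hypothetical stand-in for what the GA filter would do; it assumes hits sent directly to GA do not carry the dimension.

```javascript
// The value set on real page loads via ga('set', 'dimension1', ...).
const EXPECTED = "Hey I'm really here!!";

// Invented hits: real page loads carry dimension1, direct-to-GA hits do not.
const hits = [
  { page: "/", referrer: "google.com", dimension1: EXPECTED },
  { page: "/", referrer: "forum.topic125512.darodar.com" },
  { page: "/about", referrer: "(direct)", dimension1: EXPECTED },
];

// Include only hits whose dimension has the expected value.
function includeRealVisits(allHits) {
  return allHits.filter((hit) => hit.dimension1 === EXPECTED);
}

const realVisits = includeRealVisits(hits);
console.log(realVisits.map((hit) => hit.referrer));
```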



Answer 12:

I used these mod_rewrite methods for semalt:

RewriteEngine On
# Any one of these patterns is enough; they are chained with OR
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://(.*\.)?semalt\. [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*semalt\.com [NC]
RewriteRule ^ - [F]

or with the Apache module mod_setenvif in .htaccess:

SetEnvIfNoCase Referer semalt\.com spambot=yes
SetEnvIfNoCase REMOTE_ADDR "^217\.23\.11\.15$" spambot=yes
SetEnvIfNoCase REMOTE_ADDR "^217\.23\.7\.144$" spambot=yes

Order allow,deny
Allow from all
Deny from env=spambot

I even created an Apache, Nginx & Varnish blacklist plus a Google Analytics segment to prevent referrer spam traffic. You can find it here:

https://github.com/Stevie-Ray/referrer-spam-blocker/



Answer 13:

Filter future and historical GA spam of all types using the link provided. Hostname filtering is particularly easy.

https://www.ohow.co/ultimate-guide-to-removing-irrelevant-traffic-in-google-analytics/