Quote:
Originally Posted by PornoPlopedia
What are the best software available to find and filter bot traffic ?
Thanks a lot
|
.htaccess can stop a lot of the low life bots,scrapers, etc. You get immediate results, though this took me a long time to test via trial and error. Here's a snippet from one of mine. A lot is cut out, this is only a snippet, immediately resolved some of my biggest issues. I allow more than the US on my sites, again, this is just a snippet and I've given just a few lines from each section
Code:
Options -indexes
ServerSignature Off
Options +FollowSymlinks
GeoIPEnable On
RewriteEngine On
SetEnvIf GEOIP_COUNTRY_CODE US AllowCountry
SetEnvIf GEOIP_COUNTRY_CODE CN AllowCountry
Allow from env=AllowCountry
# FORWARD CHINA TO A FITTING YOUTUBE VIDEO
<IfModule mod_rewrite.c>
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^(CN)$
RewriteRule ^(.*)$ https://www.youtube.com/watch?v=SLMJpHihykI$1 [L]
</IfModule>
# NO PROXIES FORWARDERS BLANK REFS ETC
RewriteCond %{HTTP:VIA} !^$ [OR]
RewriteCond %{HTTP:FORWARDED} !^$ [OR]
RewriteCond %{HTTP:USERAGENT_VIA} !^$ [OR]
RewriteCond %{HTTP:X_FORWARDED_FOR} !^$ [OR]
RewriteCond %{HTTP:PROXY_CONNECTION} !^$ [OR]
RewriteCond %{HTTP:XPROXY_CONNECTION} !^$ [OR]
RewriteCond %{HTTP:HTTP_PC_REMOTE_ADDR} !^$ [OR]
RewriteCond %{HTTP:HTTP_CLIENT_IP} !^$
# ISSUE 403 / SERVE ERRORDOCUMENT
RewriteRule ^(.*)$ - [F]
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{HTTP_REFERER} !.*YOURWEBSITE.COM* [OR]
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule (.*) http://%{REMOTE_ADDR}/$ [R=301,L]
# STARTS WITH WEB
RewriteCond %{HTTP_USER_AGENT} ^web(zip|emaile|enhancer|fetch|go.?is|auto|bandit|clip|copier|master|reaper|sauger|site.?quester|whack) [NC]
# ANYWHERE IN UA -- GREEDY REGEX
RewriteCond %{HTTP_USER_AGENT} ^.*(craftbot|download|extract|stripper|sucker|ninja|clshttp|webspider|leacher|collector|grabber|webpictures).*$ [NC]
RewriteCond %{HTTP_USER_AGENT} ^.*(BlogScope|Butterfly|DCPbot|discoverybot|domain|Ezooms|ImageSearcherFree).*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*(ips-agent|linkdex|MJ12|Netcraft|NextGenSearchBot|SISTRIX|Sogou|soso|TweetmemeBot|Unwind|Yandex).*$ [NC]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad
# ISSUE 403 / SERVE ERRORDOCUMENT
RewriteRule .* - [F,L]
# IF THE UA STARTS WITH THESE
RewriteCond %{HTTP_USER_AGENT} ^(aesop_com_spiderman|alexibot|backweb|bandit|batchftp|bigfoot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(black.?hole|blackwidow|blowfish|botalot|buddy|builtbottough|bullseye) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(cheesebot|cherrypicker|chinaclaw|collector|copier|copyrightcheck) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(cosmos|crescent|curl|custo|da|diibot|disco|dittospyder|dragonfly) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(drip|easydl|ebingbong|ecatch|eirgrabber|emailcollector|emailsiphon) [NC,OR]
# ISSUE 403 / SERVE ERRORDOCUMENT
RewriteRule .* - [F,L]
SetEnvIfNoCase User-Agent ^$ bad_bot
SetEnvIfNoCase User-Agent "^Download\ Demon" bad_bot
SetEnvIfNoCase User-Agent "^Download\ Devil" bad_bot
SetEnvIfNoCase User-Agent "^Download\ Wonder" bad_bot
SetEnvIfNoCase User-Agent "^dragonfly" bad_bot
SetEnvIfNoCase User-Agent "^Drip" bad_bot
# Vulnerability Scanners
SetEnvIfNoCase User-Agent "Acunetix" bad_bot
SetEnvIfNoCase User-Agent "FHscan" bad_bot
# Aggressive Chinese Search Engine
SetEnvIfNoCase User-Agent "Baiduspider" bad_bot
# Aggressive Russian Search Engine
SetEnvIfNoCase User-Agent "Yandex" bad_bot
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>