How to block bad bots, spiders, crawlers and harvesters

I'm tired of these bad bots, spiders, crawlers and harvesters. I have already set up fail2ban on my server to ban any IP address that makes more than 250 requests (maxretry = 250) within 5 minutes (findtime = 300). Some of them still slip through, though, because they stay below 250 requests in any 5-minute window.

Here is my jail.local config:

[http-get-dos]
enabled = true
filter = http-get-dos
logpath = /var/log/ispconfig/httpd/*/access.log
maxretry = 250
findtime = 300
#ban for 10 hours
bantime = 36000
action = iptables-multiport[name=HTTP, port="http,https", protocol=tcp]
         cloudflare-blacklist
         sendmail-whois[name=HTTP, dest=webmaster@mysite.com]

Here is the filter file, http-get-dos.conf:

[Definition]

failregex = ^<HOST> -.*"(GET|POST)

ignoreregex =

Most of the guides that explain how to block this kind of crawler are written for Apache. Since I use nginx, I can't apply them. Here is one guide I found.

Here is a sample of this bot's log entries:

220.225.127.41 - - [24/Jul/2013:00:00:19 +0800] "GET /php?page=9 HTTP/1.1" 200 10897 "http://www.mysite.com/php?page=8" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:00:00:22 +0800] "GET /sites/default/files/download/jkev/jkev_search.zip HTTP/1.1" 200 35199 "http://www.mysite.com/sites/default/files/download/jkev/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:00:00:24 +0800] "GET /sites/default/files/styles/thumbnail/public/images/kalola/sk_3.jpg?itok=-pXuOEq2 HTTP/1.1" 200 3958 "http://www.mysite.com/php?page=9" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:00:00:24 +0800] "GET /sites/default/files/styles/thumbnail/public/images/kalola/sk_1.jpg?itok=ug6jsTPP HTTP/1.1" 200 3958 "http://www.mysite.com/php?page=9" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:00:00:24 +0800] "GET /sites/default/files/styles/thumbnail/public/images/kalola/sk_2.jpg?itok=ZPOMnJeK HTTP/1.1" 200 3958 "http://www.mysite.com/php?page=9" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:00:00:26 +0800] "GET /sites/default/files/styles/thumbnail/public/images/argie/currency.jpg?itok=hodqOr4_ HTTP/1.1" 200 7976 "http://www.mysite.com/php?page=9" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:00:00:26 +0800] "GET /sites/default/files/styles/thumbnail/public/images/localhost27/untitled.jpg?itok=uVeczDjI HTTP/1.1" 200 3136 "http://www.mysite.com/php?page=9" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:00:00:26 +0800] "GET /sites/default/files/styles/thumbnail/public/images/Oelasor/screenshot_11.jpg?itok=uu3d0GpX HTTP/1.1" 200 6674 "http://www.mysite.com/php?page=9" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:00:00:27 +0800] "GET /sites/default/files/styles/thumbnail/public/images/localhost27/member.jpg?itok=inA9ULoC HTTP/1.1" 200 4500 "http://www.mysite.com/php?page=9" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:00:00:28 +0800] "GET /php/4852/shopping-cart-checkout-using-codeigniter.html HTTP/1.1" 200 11414 "http://www.mysite.com/php?page=9" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:00:00:29 +0800] "GET /sites/default/files/styles/medium/public/images/admin/codeigniter_shopping_cart.jpg?itok=QO0YV6JP HTTP/1.1" 200 22534 "http://www.mysite.com/php/4852/shopping-cart-checkout-using-codeigniter.html" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:00:00:32 +0800] "GET /php/4846/simple-ajax-example-php.html HTTP/1.1" 200 10174 "http://www.mysite.com/php?page=9" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:00:00:34 +0800] "GET /sites/default/files/download/teejaygenius/e_library.zip HTTP/1.1" 206 3655400 "http://www.mysite.com/sites/default/files/download/teejaygenius/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:00:00:36 +0800] "GET /sites/default/files/download/Chritian/bus.zip HTTP/1.1" 206 4462491 "http://www.mysite.com/sites/default/files/download/Chritian/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:00:00:37 +0800] "GET /sites/default/files/styles/medium/public/images/kalola/sk_2.jpg?itok=1N0a__bq HTTP/1.1" 200 9693 "http://www.mysite.com/php/4846/simple-ajax-example-php.html" "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36"
220.225.127.41 - - [24/Jul/2013:03:03:13 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 1555432 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:03:20 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 18541381 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:03:29 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 6186320 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:03:31 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 13495467 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:03:34 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 17908605 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:03:51 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 10082448 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:03:57 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 8639709 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:04:03 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 12150765 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:04:04 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 17972316 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:04:09 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 18453052 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:04:23 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 777716 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:04:40 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 8033075 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:04:45 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 12935983 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:04:49 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 8262600 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:04:49 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 11598966 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:04:49 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 11249310 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:04:57 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 5969210 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:05:02 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 12978641 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:05:03 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 13390784 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:05:07 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 6124786 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:05:15 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 9962834 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:05:19 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 12021359 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:05:27 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 8432875 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:05:44 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 18371964 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:05:46 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 19867749 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:05:50 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 18164900 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:06:00 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 17839100 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:06:01 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 18329973 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:06:11 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 18651902 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:06:31 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 9858200 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:06:34 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 12914955 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:06:36 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 13315966 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:06:38 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 12804285 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:06:41 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 6043976 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:06:42 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 11900897 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:06:52 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 8293782 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:07:06 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 11582412 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:07:24 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 18667357 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:07:27 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 7977266 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:07:35 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 11190040 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:07:36 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 18555860 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:09 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 5932064 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:10 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 12730175 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:13 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 13208853 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:16 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 8178860 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:22 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 5896753 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:25 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 8183834 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:26 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 12671818 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:30 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 18581925 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:36 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 18224268 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:37 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 11761743 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:51 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 11412627 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:08:59 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 18600749 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:09:01 +0800] "GET /sites/default/files/download/mindgamez/system1.zip HTTP/1.1" 206 11129155 "http://www.mysite.com/sites/default/files/download/mindgamez/" "FDM 3.x"
220.225.127.41 - - [24/Jul/2013:03:09:14 +0800] "GET /sites/default/files/download/argie/tameraplazainn.zip HTTP/1.1" 206 7836467 "http://www.mysite.com/sites/default/files/download/argie/" "FDM 3.x"

Here is the hit rate per hour:

# grep "220.225.127.41" /var/log/ispconfig/httpd/*/access.log | cut -d[ -f2 | cut -d] -f1 | awk -F: '{print $2":00"}' | sort -n | uniq -c
545 00:00
524 01:00
404 02:00
491 03:00
396 04:00
183 05:00

Here is the hit rate per minute during the 00:00 hour (only minutes with more than 10 hits are shown):

# grep "220.225.127.41 - - \[24/Jul/2013:00" /var/log/ispconfig/httpd/*/access.log | cut -d[ -f2 | cut -d] -f1 | awk -F: '{print $2":"$3}' | sort -nk1 -nk2 | uniq -c | awk '{ if ($1 > 10) print $0}'
33 00:00
14 00:01
12 00:03
26 00:05
15 00:10
18 00:11
22 00:13
15 00:14
14 00:15
15 00:18
21 00:19
17 00:20
15 00:23
14 00:24
17 00:25
27 00:29
15 00:30
18 00:32
14 00:52

Here is the hit rate per minute during the 01:00 hour (only minutes with more than 10 hits are shown):

# grep "220.225.127.41 - - \[24/Jul/2013:01" /var/log/ispconfig/httpd/*/access.log | cut -d[ -f2 | cut -d] -f1 | awk -F: '{print $2":"$3}' | sort -nk1 -nk2 | uniq -c | awk '{ if ($1 > 10) print $0}'
16 01:01
16 01:02
12 01:05
16 01:06
14 01:10
14 01:11
14 01:12
13 01:14
22 01:16
18 01:17
13 01:21
21 01:22
14 01:26
20 01:37
30 01:38
13 01:45
11 01:50
17 01:51
11 01:53
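
The sample log above shows two different user agents coming from the same address, a Chrome string and "FDM 3.x", so a breakdown by status code and user agent may say more than the raw per-minute counts. What follows is only a minimal sketch, assuming the combined log format shown above; `count_by_agent` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Read combined-format access log lines on stdin and print one line per
# (client IP, status code, user agent) combination, most frequent first.
count_by_agent() {
    # Splitting on double quotes puts "IP - - [date] " in $1,
    # " status bytes " in $3 and the user agent in $6.
    awk -F'"' '{
        split($1, head, " ")     # head[1] = client IP
        split($3, resp, " ")     # resp[1] = HTTP status code
        print head[1], resp[1], $6
    }' | sort | uniq -c | sort -rn
}

# Example (log path taken from the jail above):
# cat /var/log/ispconfig/httpd/*/access.log | count_by_agent | head
```

If most of the hits turn out to be 206 responses to "FDM 3.x", a narrower fail2ban filter or a per-client connection limit could target just that traffic without lowering maxretry for everyone.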

Is there a way to block this with iptables or something else?

If I lower maxretry, I'm afraid some legitimate traffic will be blocked as well.

The request rate is very low. I can't set maxretry to around 50 or even 70, because that would also ban legitimate traffic.

So how can I prevent this? They consume too much bandwidth. My normal usage used to be 59.31 GB per day, but now it reaches 136.74 GB.

As a first step, limiting the number of connections with iptables might help

(from http://www.extrapepperoni.com/post/2013/03/iptables%3A-connlimit):

-A INPUT -j ACCEPT -p tcp --dport    80 -s xxx.yyy.0.0/16 --syn -m connlimit ! --connlimit-above 20
-A INPUT -j ACCEPT -p tcp --dport    80                   --syn -m connlimit ! --connlimit-above 5 --connlimit-mask 24

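This mainly helps against DDoS attacks, but it may address part of your problem: the first rule allows internal users (from a specific network) up to 20 simultaneous connections each. The second rule allows everyone else only 5 simultaneous connections per /24 network. Note that ACCEPT rules like these only limit anything if connections that do not match them are later dropped or rejected, for example by the chain policy.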
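
The same idea can also be written as a single reject rule per port, which does not depend on a later drop rule. This is a minimal sketch rather than a tested firewall: the limit of 5 and the /24 mask are simply carried over from the rules quoted above, adding port 443 is an assumption, and `print_connlimit_rules` is a hypothetical helper that only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print (not execute) iptables commands that cap the
# number of simultaneous HTTP/HTTPS connections per source network.
# $1 = max parallel connections, $2 = prefix length used to group clients.
print_connlimit_rules() {
    limit=$1
    mask=$2
    for port in 80 443; do
        # Reject new connections (SYN packets) once a source network
        # already holds more than $limit connections to this port.
        echo "iptables -A INPUT -p tcp --syn --dport $port -m connlimit --connlimit-above $limit --connlimit-mask $mask -j REJECT --reject-with tcp-reset"
    done
}

print_connlimit_rules 5 24
```

The 206 responses in the log suggest range requests, which download managers often issue over several parallel connections, so a per-source connection cap may slow such a client down while leaving ordinary browsing mostly unaffected. Whether 5 is a safe value for legitimate visitors behind shared addresses is something to verify against your own traffic.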

A more general, but rather complex, command-line tool for traffic shaping is tc: http://www.tldp.org/HOWTO/html_single/Traffic-Control-HOWTO/

With tc you can limit the bandwidth available to a particular user, service or client.
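
As a rough illustration of what that could look like, the sketch below prints tc commands that place outgoing traffic towards one client IP into a throttled HTB class. The interface name `eth0`, the rates `512kbit` and `100mbit`, and the helper name `print_tc_limit` are all assumptions made for the example; the IP is the one from the log above. The commands are printed rather than executed so they can be checked first.

```shell
#!/bin/sh
# Hypothetical helper: print (not execute) tc commands that cap outgoing
# bandwidth towards one client IP using an HTB qdisc.
# $1 = interface, $2 = client IP, $3 = rate for that client, $4 = link rate.
print_tc_limit() {
    dev=$1
    ip=$2
    rate=$3
    link=$4
    # Root HTB qdisc; traffic matching no filter falls into class 1:10.
    echo "tc qdisc add dev $dev root handle 1: htb default 10"
    # Default class: everything else may use the full link rate.
    echo "tc class add dev $dev parent 1: classid 1:10 htb rate $link"
    # Throttled class for the unwanted client.
    echo "tc class add dev $dev parent 1: classid 1:20 htb rate $rate ceil $rate"
    # Send packets destined for the client into the throttled class.
    echo "tc filter add dev $dev parent 1: protocol ip prio 1 u32 match ip dst $ip/32 flowid 1:20"
}

print_tc_limit eth0 220.225.127.41 512kbit 100mbit
```

Shaping works on the egress side here, which fits the problem described: most of the bandwidth goes into responses sent to the client, not into its requests.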