Bot Blocking

Posted by SAH62 
October 06, 2013 09:31PM
I'm new to nginx so please pardon my ignorance. I've read the appropriate documentation, but I'm still having some trouble with a location directive. I'm trying to block (return HTTP 403) web spiders from accessing everything under /forum. Here's my config:

I created a map in nginx.conf:

http {
    ...
    map $http_user_agent $is_bot {
        default 0;
        ~*(crawl|Google|Slurp|spider|bingbot|tracker|click|parser) 1;
    }
    ...
}

I've created a local config file that lives at local/block-search.conf that uses the map:

if ($is_bot) {
    return 403;
}

The config file for my server contains this:

location ^~ /forum/ {
    include local/block-search.conf;
    index index.php;
    try_files $uri $uri/ =404;
    location ~ ^/forum/(.+\.php)$ {
        try_files $uri =404;
        fastcgi_pass unix:/var/run/php5-fpm.sock;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include /etc/nginx/fastcgi_params;
    }
}

This seems to work properly (I get a 403) when I use curl to request /forum/, like this:

curl -A "AdsBot-Google (+http://www.google.com/adsbot.html)" http://www.mysite.org/forum/

but the request isn't blocked when I attempt to get a resource like "/forum/index.php" or "/forum/index.php?action=post;board=9.0". These requests produce an HTTP 200 and the GET succeeds. What am I doing wrong, and how can I catch everything under /forum/? Thanks...
Re: Bot Blocking
October 09, 2013 07:01PM
I think I have this figured out. The trick was in understanding how the "if" directive really works. This blog post was very helpful:

http://agentzh.blogspot.com/2011/03/how-nginx-location-if-works.html

I added the "if" (through the same include) to the nested PHP location inside my forum location:

location ^~ /forum/ {
    include local/block-search.conf;
    index index.php;
    try_files $uri $uri/ =404;
    location ~ ^/forum/(.+\.php)$ {
        include local/block-search.conf;
        try_files $uri =404;
        fastcgi_pass unix:/var/run/php5-fpm.sock;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include /etc/nginx/fastcgi_params;
    }
}

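...and now requests of the form "/forum/index.php" are rejected with a 403. Requests for .php files are handled by the nested location, and the "if" in the enclosing location is not applied there, so the check has to be repeated inside the nested location.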
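
For anyone who would rather not repeat the include in every nested location, here is an untested sketch of an alternative: make the decision once at server level, before a location is selected. It assumes nginx 1.11.0 or newer, because it combines two variables in the map source string. The variable name $block_forum_bot is my own invention, and the server block is abbreviated, so treat this as a starting point rather than a drop-in config.

```nginx
http {
    # Map from the first post: 1 when the User-Agent looks like a bot.
    map $http_user_agent $is_bot {
        default 0;
        ~*(crawl|Google|Slurp|spider|bingbot|tracker|click|parser) 1;
    }

    # Hypothetical second map: 1 only when the client is a bot AND the
    # normalized URI starts with /forum/. Combining variables in the
    # source string needs nginx 1.11.0+ (assumption: not available on
    # older releases).
    map "$is_bot:$uri" $block_forum_bot {
        default 0;
        ~^1:/forum/ 1;
    }

    server {
        # listen, server_name, root, etc. stay as in the existing config

        # Runs in the server-level rewrite phase, so it applies no matter
        # which location (outer or nested) would handle the request.
        if ($block_forum_bot) {
            return 403;
        }

        location ^~ /forum/ {
            index index.php;
            try_files $uri $uri/ =404;
            # the nested PHP location goes here, without the include
        }
    }
}
```

With this in place, the same curl test against both /forum/ and /forum/index.php should show whether a bot User-Agent gets a 403 in each case. The trade-off is that the /forum/ prefix is now written in two places (the map and the location), so they need to be kept in sync.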