Some time ago I wrote this module,
https://github.com/wandenberg/nginx-trusted-proxy-resolver-module, to
check whether a request really comes through the Google proxy. It does a
reverse DNS lookup on the client address, resolves the returned name
forward again, and compares the results to validate the access.
You can do something similar.
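
Outside of nginx, the same reverse + forward check can be sketched in a few
lines of Python. This is only an illustration of the idea, not code from the
module. The function name, the injectable lookup parameters and the list of
accepted hostname suffixes are my assumptions, and gethostbyname_ex only
covers IPv4.

```python
import socket

# Assumption: hostname suffixes treated as Google crawlers.
# Check Google's own documentation for the current list.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Return True only if the PTR name of `ip` is a Google crawler
    hostname and that hostname resolves back to `ip`."""
    try:
        # Step 1: reverse DNS (PTR) lookup of the client address.
        hostname = reverse_lookup(ip)[0]
    except OSError:
        # No PTR record (e.g. NXDOMAIN), so it cannot be verified.
        return False

    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(GOOGLE_SUFFIXES):
        return False

    try:
        # Step 2: forward lookup of the name the PTR record gave us.
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False

    # Step 3: the forward result must contain the original address,
    # otherwise the PTR record could simply be forged.
    return ip in addresses
```

The lookups are parameters only so the function can be exercised without
network access. In a real setup you would also want to cache the results,
since two DNS queries per request are expensive.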
On Sun, Sep 25, 2016 at 11:58 PM, lists@lazygranch.com <lists@lazygranch.com> wrote:
> I got a spoofed googlebot hit. It was easy to detect since there were
> probably a hundred requests that triggered my hacker detection map
> scheme. Only two requests received a 200 return and both were harmless.
>
> 200 118.193.176.53 - - [25/Sep/2016:17:45:23 +0000] "GET / HTTP/1.1" 847
> "-" "Mozilla/5.0 (compatible; Googlebot/2.1;
> +http://www.google.com/bot.html)" "-"
>
> For the fake googlebot:
> # host 118.193.176.53
> Host 53.176.193.118.in-addr.arpa not found: 3(NXDOMAIN)
>
> For a real googlebot:
> # host 66.249.69.184
> 184.69.249.66.in-addr.arpa domain name pointer
> crawl-66-249-69-184.googlebot.com.
>
> IP2location shows it is a Chinese ISP:
> http://www.ip2location.com/118.193.176.53
>
> Nginx has a reverse DNS module:
> https://github.com/flant/nginx-http-rdns
> I see it has a 10.1 issue:
> https://github.com/flant/nginx-http-rdns/issues/8
>
> Presuming this bug gets fixed, does anyone have code to verify
> googlebots? Or some other method?
>
> _______________________________________________
> nginx mailing list
> nginx@nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx
>