February 21, 2014 11:46AM
Maxim Dounin Wrote:
-------------------------------------------------------
> Hello!
>
> On Fri, Feb 21, 2014 at 10:25:58AM -0500, rge3 wrote:
>
> > I haven't found any ideas for this and thought I might ask here. We have a
> > fairly straightforward proxy_cache setup with a proxy_pass backend. We
> > cache documents for different lengths of time or go to the backend for
> > what's missing. My problem is we're getting overrun with bot and spider
> > requests. MSN in particular started hitting us exceptionally hard
> > yesterday and started bringing our backend servers down. Because they're
> > crawling the site from end to end, our cache is missing a lot of those
> > pages and nginx has to pass the request on through.
> >
> > I'm looking for a way to match on User-Agent and say that if it matches
> > certain bots to *only* serve out of proxy_cache. Ideally I'd like the
> > logic to be: if it's in the cache, serve it. If it's not, then return
> > some 4xx error. But in the case of those user-agents, *don't* go to the
> > backend. Only give them cache. My first thought was something like...
> >
> > if ($http_user_agent ~* msn-bot) {
> > proxy_pass http://devnull;
> > }
> >
> > by making a bogus backend. But in nginx 1.4.3 (that's what we're running)
> > I get
> > nginx: [emerg] "proxy_pass" directive is not allowed here
> >
> > Does anyone have another idea?
>
> The message suggests you are trying to write the snippet above at
> server{} level. Moving things into a location should do the
> trick.
>
> Please make sure to read http://wiki.nginx.org/IfIsEvil though.

That seems to have done it! With a location block I now have...

location / {
    proxy_cache_valid 200 301 302 30m;

    if ($http_user_agent ~* msn-bot) {
        proxy_pass http://devnull;
    }

    if ($http_user_agent !~* msn-bot) {
        proxy_pass http://productionrupal;
    }
}

That seems to work perfectly. But is it a safe use of "if"? Is there a safer way to do it without an if?
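
For comparison, here is a rough sketch of one if-free variant, using a map to choose the upstream name and a variable in proxy_pass. It assumes that devnull and productionrupal are both defined as upstream{} groups (the snippet above does not show them), and $cache_backend is a made-up variable name used only for illustration. I have not tested this against 1.4.3, so treat it as a starting point rather than a confirmed fix.

```nginx
# Goes in the http{} context: pick an upstream name from the User-Agent.
# $cache_backend is a hypothetical variable name, not from the original config.
map $http_user_agent $cache_backend {
    default      productionrupal;   # everyone else goes to the real backend
    ~*msn-bot    devnull;           # case-insensitive match, bogus backend
}

server {
    location / {
        proxy_cache_valid 200 301 302 30m;

        # With a variable here, nginx looks the name up among the defined
        # upstream{} groups first, so both names are assumed to exist as groups.
        proxy_pass http://$cache_backend;
    }
}
```

The map is only evaluated when the variable is used, and the location keeps a single proxy_pass, which avoids the two mirrored if blocks.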

Thanks for the help!
-R
Subject | Author | Posted
Serve *only* from cache for particular user-agents | rge3 | February 21, 2014 10:25AM
Re: Serve *only* from cache for particular user-agents | Maxim Dounin | February 21, 2014 10:48AM
Re: Serve *only* from cache for particular user-agents | rge3 | February 21, 2014 11:46AM
Re: Serve *only* from cache for particular user-agents | ajay | February 21, 2014 12:00PM
Re: Serve *only* from cache for particular user-agents | rge3 | February 21, 2014 01:13PM
Re: Serve *only* from cache for particular user-agents | Maxim Dounin | February 21, 2014 12:20PM
Re: Serve *only* from cache for particular user-agents | Darren Pilgrim | February 21, 2014 05:16PM


