Re: limit_req for spiders only

Toni Mueller
October 14, 2013 10:04AM
Hello,

On Mon, Oct 14, 2013 at 09:25:24AM -0400, Sylvia wrote:
> Doesn't robots.txt "Crawl-Delay" directive satisfy your needs?

I already have it there, but I don't know how long such a directive, or
any change to robots.txt for that matter, takes to have an effect.
Judging from the logs, I'd say the delay between changing robots.txt
and a change in robot behaviour would be several days, as I cannot see
any effect so far.

> Normal spiders should obey robots.txt, if they don't - they can be banned.

Banning Google is not a good idea, no matter how abusive they might be,
and they incidentally operate one of the robots that keep hammering
the site. I'd much prefer a technical solution that enforces such limits
over relying on convention.

I'd also like to limit the request frequency over an entire pool, so
that I can say "clients from this pool can make requests only at this
frequency, combined, not per client IP". It doesn't buy me anything if
I can limit each individual search robot to a decent frequency, but
then get hammered by 1000 search robots in parallel, each one observing
the request limit. Right?
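
One way a combined, pool-wide limit like this might be sketched in nginx is
to key the limit_req zone on a constant value instead of the client address,
so that every matching robot draws from the same bucket. The User-Agent
patterns, the variable name $spider_pool, the zone name "spiders", and the
rate and burst values below are all assumptions for illustration, not tested
settings for any particular site:

```nginx
# Sketch only; map and limit_req_zone belong in the http {} context.

# Map every recognised robot to the same key. Requests with an empty key
# are not accounted by limit_req_zone, so ordinary visitors are unaffected.
map $http_user_agent $spider_pool {
    default        "";
    ~*googlebot    "spider";   # hypothetical pattern list, extend as needed
    ~*bingbot      "spider";
}

# One shared state entry for the whole pool: 1 request per second, combined.
limit_req_zone $spider_pool zone=spiders:1m rate=1r/s;

server {
    listen 80;

    location / {
        # Queue up to 5 excess requests; anything beyond that is rejected
        # (with status 503 by default).
        limit_req zone=spiders burst=5;
    }
}
```

Because all matching requests share the key "spider", 1000 robots in
parallel would still be held to the single combined rate. Note that the
User-Agent header is trivially spoofed; matching the pool by address range
(for example with a geo block producing the same variable) would be an
alternative under the same idea.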


Kind regards,
--Toni++

_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx