Re: [PATCH 21 of 31] Fix cpu hog with all upstream servers marked "down"

lanshun zhou
August 15, 2011 01:52PM
Do you use the upstream hash module in any of your active upstreams?

Can you provide the full upstream configuration?

2011/8/15 Maxim Dounin <mdounin@mdounin.ru>

> Hello!
>
> On Mon, Aug 15, 2011 at 02:59:36PM -0000, Oded Arbel wrote:
>
> > Regarding the above mentioned patch (also quoted below), I
> > wanted to provide feedback on this:
> >
> > On my system, we have several reverse proxy servers running
> > Nginx and forwarding requests to upstream. Our configuration
> > looks like this:
> > upstream trc {
> > server prod2-f1:10213 max_fails=500 fail_timeout=30s;
> > server prod2-f2:10213 max_fails=500 fail_timeout=30s;
> > ...
> > server 127.0.0.1:10213 backup;
> > ip_hash;
>
> Ip hash balancer doesn't support "backup" servers (and it will
> complain loudly if you place "ip_hash" before servers). Could you
> please check if you still see the problem after removing backup
> server?
>
> > }
> >
> > We've noticed that every once in a while (about 5-10 times a
> > week) one of the servers gets into a state where an Nginx worker
> > starts eating 100% CPU and timing out on requests. I've applied
> > the aforementioned patch to our Nginx installation (release
> > 1.0.0 with the Nginx_Upstream_Hash patch) and deployed to our
>
> You mean the one from Evan Miller's upstream hash module, as
> available at http://wiki.nginx.org/HttpUpstreamRequestHashModule?
>
> > production servers. After a few hours, we started having the
> > Nginx workers on all the servers eat 100% CPU.
> >
> > Connecting with gdb to one of the problematic workers, I got this
> > backtrace:
> > #0 0x000000000044a650 in ngx_http_upstream_get_round_robin_peer ()
> > #1 0x00000000004253dc in ngx_event_connect_peer ()
> > #2 0x0000000000448618 in ngx_http_upstream_connect ()
> > #3 0x0000000000448e10 in ngx_http_upstream_process_header ()
> > #4 0x00000000004471fb in ngx_http_upstream_handler ()
> > #5 0x00000000004247fa in ngx_event_expire_timers ()
> > #6 0x00000000004246ed in ngx_process_events_and_timers ()
> > #7 0x000000000042a048 in ngx_worker_process_cycle ()
> > #8 0x00000000004287e0 in ngx_spawn_process ()
> > #9 0x000000000042963c in ngx_start_worker_processes ()
> > #10 0x000000000042a5d5 in ngx_master_process_cycle ()
> > #11 0x0000000000410adf in main ()
> >
> > I then tried tracing through the running worker using the GDB
> > command "next", which said:
> > Single stepping until exit from function
> > ngx_http_upstream_get_round_robin_peer
> >
> > And never returned until I got fed up and broke it.
> >
> > I finally reverted the patch and restarted the service, and
> > continue to get this behavior. So my conclusion is that for my
> > specific problem, this patch does not solve it.
>
> Your problem is different from the one the patch is intended to solve.
> The patch solves one (and only one) problem where all servers are
> marked "down" in config, clearly not the case you have.
>
> Maxim Dounin
>
> _______________________________________________
> nginx-devel mailing list
> nginx-devel@nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx-devel
>
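For reference, the test Maxim suggests above could look roughly like the sketch below. It is only an illustration built from the configuration quoted in the thread, not a confirmed fix: the "backup" server is dropped, and "ip_hash" is moved ahead of the "server" lines on the assumption that nginx will then reject unsupported server parameters when the configuration is loaded. The elided servers ("...") are left as they were in the original post.

```nginx
upstream trc {
    # Assumption: placing ip_hash first lets nginx validate the
    # server parameters that follow against what ip_hash supports.
    ip_hash;

    server prod2-f1:10213 max_fails=500 fail_timeout=30s;
    server prod2-f2:10213 max_fails=500 fail_timeout=30s;
    # ... (remaining servers elided in the original post)

    # Removed for the test, since ip_hash does not support "backup":
    # server 127.0.0.1:10213 backup;
}
```

Running "nginx -t" against the changed configuration before reloading would show whether nginx accepts it.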