Re: upstream - behavior on pool exhaustion

April 17, 2017 12:10PM
On Sat, Apr 15, 2017 at 03:55:20AM +0200, B.R. via nginx wrote:
> Let me be clear here:
> I got 6 active servers (not marked down), and the logs show 1 attempt on
> each. They all failed for a known reason, and there is no problem there.
> Subsequently, the whole pool was 'down' and the response was 502.
> Everything perfectly normal so far.
>
> What is unclear is the feature (as you classified it) of having a fake
> node named after the pool appearing in the list of tried upstream servers.
> It brings confusion more than anything else: having a 502 response + the
> list of all tried (and failed) nodes corresponding with the list of active
> nodes is more than enough to describe what happened.
> The name of the upstream group does not correspond to any real asset; it
> is a purely virtual classification. It thus makes no sense at all to me to
> have it appearing as a 7th 'node' in the list... and how do you interpret
> its response time (where you got also a 7th item in the list)?
> Moreover, it is confusing, since proxy_pass handles domain names and one
> could believe nginx treated the upstream group name as such.

Without the six attempts, if all of the servers are unreachable (either
"down" or "unavailable" because they have failed previously) at the time
the request starts, what do you expect to see in $upstream_*?
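
To make each entry in these lists easier to read, one option is to log $upstream_status next to $upstream_addr, so that every attempt carries its own status. A minimal sketch, assuming a hypothetical format name (upstream_debug) and log path that are not part of the original configuration:

```nginx
# Goes in the http context. The format name and the log path are
# illustrative assumptions, not taken from the original configuration.
log_format upstream_debug '$status $body_bytes_sent $request_time '
                          'addr="$upstream_addr" '
                          'status="$upstream_status" '
                          'time="$upstream_response_time"';

access_log /var/log/nginx/upstream_debug.log upstream_debug;
```

The three $upstream_* values are comma-separated lists with one element per attempt, so the Nth address lines up with the Nth status and the Nth response time.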

> On Fri, Apr 14, 2017 at 10:21 AM, Ruslan Ermilov <ru@nginx.com> wrote:
>
> > On Fri, Apr 14, 2017 at 09:41:36AM +0200, B.R. via nginx wrote:
> > > Hello,
> > >
> > > Reading from upstream
> > > <https://nginx.org/en/docs/http/ngx_http_upstream_module.html#upstream>
> > > docs, on upstream pool exhaustion, every backend should be tried once, and
> > > then if all fail the response should be crafted based on the one from the
> > > last server attempt.
> > > So far so good.
> > >
> > > I recently faced a server farm which implements a dull nightly restart of
> > > every node, not sequencing it, resulting in the possibility of having all
> > > nodes offline at the same time.
> > >
> > > However, I collected log entries which did not match what I expected.
> > > For 6 backend nodes, I got:
> > > - log format: $status $body_bytes_sent $request_time $upstream_addr
> > > $upstream_response_time
> > > - log entry: 502 568 0.001 <IP address 1>:<port>, <IP address 2>:<port>,
> > > <IP address 3>:<port>, <IP address 4>:<port>, <IP address 5>:<port>,
> > > <IP address 6>:<port>, php-fpm 0.000, 0.000, 0.000, 0.000, 0.001, 0.000, 0.000
> > > I got 7 entries for $upstream_addr & $upstream_response_time, instead of
> > > the expected 6.
> > >
> > > Here are the interesting parts of the configuration:
> > > upstream php-fpm {
> > >     server <machine 1>:<port> down;
> > >     server <machine 2>:<port> down;
> > >     [...]
> > >     server <machine N-5>:<port>;
> > >     server <machine N-4>:<port>;
> > >     server <machine N-3>:<port>;
> > >     server <machine N-2>:<port>;
> > >     server <machine N-1>:<port>;
> > >     server <machine N>:<port>;
> > >     keepalive 128;
> > > }
> > >
> > > server {
> > >     set $fpm_pool "php-fpm$fpm_pool_ID";
> > >     [...]
> > >     location ~ \.php$ {
> > >         [...]
> > >         fastcgi_read_timeout 600;
> > >         fastcgi_keep_conn on;
> > >         fastcgi_index index.php;
> > >
> > >         include fastcgi_params;
> > >         fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
> > >         [...]
> > >         fastcgi_pass $fpm_pool;
> > >     }
> > > }
> > >
> > > The question is:
> > > php-fpm being an upstream group name, how come it has been tried as a
> > > domain name in the end?
> > > Stated otherwise, is this because the upstream group is considered 'down',
> > > thus somehow removed from the possibilities, and nginx tries the name one
> > > last time as a domain name to see if something answers?
> > > This 7th request is definitely strange from my point of view. Is it a bug or
> > > a feature?
> >
> > A feature.
> >
> > Most $upstream_* variables are vectored ones, and the number of entries
> > in their values corresponds to the number of tries made to select a peer.
> > When a peer cannot be selected at all (as in your case), the status is
> > 502 and the name equals the upstream group name.
> >
> > There could be several reasons why none of the peers can be selected.
> > For example, some peers are marked "down", and other peers were failing
> > and are now in the "unavailable" state.
> >
> > The number of tries is limited by the number of servers in the group,
> > unless further restricted by proxy_next_upstream_tries. In your case,
> > since there are two "down" servers, and other servers are unavailable,
> > you reach the situation when a peer cannot be selected. If you comment
> > out the two "down" servers, and try a few requests in a row when all
> > servers are physically unavailable, the first log entry will list all
> > of the attempted servers, and then for the next 10 seconds (in the
> > default config) you'll see only the upstream group name and 502 in
> > $upstream_status, until the servers become available again (see
> > max_fails/fail_timeout).
> >
> > Hope this makes things a little bit clearer.
> > _______________________________________________
> > nginx mailing list
> > nginx@nginx.org
> > http://mailman.nginx.org/mailman/listinfo/nginx

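The "10 seconds (in the default config)" mentioned in the quoted reply comes from the default server parameters max_fails=1 and fail_timeout=10s. A minimal sketch that writes those defaults out explicitly and caps the number of tries for a FastCGI backend; the addresses and port are hypothetical placeholders, and fastcgi_next_upstream_tries is used here as the FastCGI counterpart of proxy_next_upstream_tries:

```nginx
upstream php-fpm {
    # max_fails=1 and fail_timeout=10s are the documented defaults,
    # spelled out here only to make the behavior visible.
    # Addresses and port are hypothetical placeholders.
    server 192.0.2.11:9000 max_fails=1 fail_timeout=10s;
    server 192.0.2.12:9000 max_fails=1 fail_timeout=10s;
    server 192.0.2.13:9000 max_fails=1 fail_timeout=10s;
    keepalive 128;
}

server {
    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_keep_conn on;

        # Try at most 2 servers per request; the default of 0 means
        # the only limit is the number of servers in the group.
        fastcgi_next_upstream_tries 2;

        fastcgi_pass php-fpm;
    }
}
```

With these values, a server that fails once is treated as unavailable for 10 seconds. If every server ends up in that state, requests during the window cannot select a peer, which is the case where only the group name and 502 are logged.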

--
Ruslan Ermilov
Assume stupidity not malice