Welcome! Log In Create A New Profile

Re: [PATCH] os/unix: don't stop old workers if reload failed

Forum List Message List New Topic Print View

Maxim Dounin

December 17, 2020 08:28AM

Hello!

On Thu, Dec 17, 2020 at 03:20:19PM +0800, kasei@kasei.im wrote:

> 在 2020-12-10星期四的 22:47 +0300，Maxim Dounin写道：
> > Hello!
> >
> > On Fri, Dec 04, 2020 at 12:29:51PM +0800, kasei@kasei.im wrote:
> >
> > > *Sorry for the format issue, resending it.
> > > Hello, We found sometimes nginx might failed to start new worker
> > > processes
> > > during reconfiguring, (the number of processes exceeds
> > > NGX_MAX_PROCESSES for
> > > example). In that case, the master process still send QUIT signal
> > > to old
> > > workers, leads to the situation that there is no worker processes
> > > accept socket
> > > from listening sockets.
> > > And we found actually there is a return code in os/win32's
> > > ngx_start_worker_processes funcation. It skip quiting workers if
> > > there is no
> > > worker processes be spawned. So I written this patch followed
> > > win32's
> > > implementation and tested it. Cloud you please check this patch to
> > > see if there
> > > is anything missed? Thanks very much.
> >
> > Thanks for the patch.
> >
> > The win32 implementation works this way because spawning new
> > processes on Windows might easily fail for multiple reasons as
> > there is no fork(). On the other hand, this is not expected to
> > ever happen on Unix systems, except may be hitting variouos
> > limits, either nginx own NGX_MAX_PROCESSES limit or system limits.
> > That is, on Unix systems starting worker processes more or less is
> > only expected to fail in case of severe misconfiguration.
> >
> > Further, I'm not sure that current win32 behaviour is a safe one,
> > as resulting state is inconsistent: worker processes does not
> > match master configuration. This probably doesn't mater for win32
> > as starting worker processes is anyway racy, but might make things
> > worse on Unix systems.
> >
> > And, as your own TODO mentions, the approach "do not stop old
> > worker processes if no new worker processes were started" won't
> > prevent degraded service if some worker processes were not able to
> > start and listening sockets with the reuseport option are used.
> >
> > Could you please clarify - are you trying to solve some problem
> > you are facing in practice? If yes, could you please provide some
> > more details?
> >
> Hello, thanks for your reply!
>
> Yes, I am tring slove a problem in production environment. In short,
> sometimes we reload nginx in a high frquency like 2-3 minutes, then
> some long connections may holds many shutting down workers. If the
> number of existing workers exceeds NGX_MAX_PROCESSES, nginx can't spawn
> new workers. Yes, I understand it's an abuse of nginx, and the
> frequency of reloading should be reduced. But I'm thinking it might
> better to let nginx works with old config (equals reloading failed)
> instead of stop accpeting new connections.

Ok, thanks for the details.

As previously suggested, it looks more like a misconfiguration to
me. And if we want to prevent it from happening by nginx, a
better solution might be to complain and fail earlier, before
applying the new configuration. For example, nginx might refuse
to reload if there are more worker processes configured than free
process slots available, similarly to how it currently refuses to
upgrade if previous binary upgrade isn't complete yet. I'm not
sure it worth the effort though.

In your particular case it may make sense to avoid reloading nginx
if some number of previous reloads weren't complete yet. E.g.,
you may want to modify your reload scripts (or procedures, if
reloading is manual) to avoid reloading nginx if there are, say,
more than 100 worker processes are still running.

Alternatively, you may want to configure worker_shutdown_timeout
(http://nginx.org/r/worker_shutdown_timeout) to a reasonable
value. It will make sure that old worker processes are terminated
in a reasonable time even if there are active connections.

--
Maxim Dounin
http://mdounin.ru/
_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx-devel

Reply Quote

RSS

Subject	Author	Views	Posted
[PATCH] os/unix: don't stop old workers if reload failed	Anonymous User	419	December 03, 2020 11:16PM
Re: [PATCH] os/unix: don't stop old workers if reload failed	Anonymous User	132	December 03, 2020 11:30PM
Re: [PATCH] os/unix: don't stop old workers if reload failed	Maxim Dounin	122	December 10, 2020 02:48PM
Re: [PATCH] os/unix: don't stop old workers if reload failed	Anonymous User	143	December 17, 2020 02:22AM
Re: [PATCH] os/unix: don't stop old workers if reload failed	Maxim Dounin	141	December 17, 2020 08:28AM

Sorry, you do not have permission to post/reply in this forum.

Online Users

Guests: 238

Record Number of Users: 8 on April 13, 2023

Record Number of Guests: 421 on December 02, 2018