Re: cache manager process exited with fatal code 2 and cannot be respawned

November 09, 2012 03:08PM
Hi,

On Nov 9, 2012, at 23:36, Peer Heinlein <p.heinlein@heinlein-support.de> wrote:

> On 09.11.2012 19:33, Isaac Hailperin wrote:
>
>
>
> I did several hours of testing today with Isaac and there are two problems.
>
> PROBLEM/BUG ONE:
>
> First of all: the customer has 1,000 SSL hosts on the nginx server, so
> he wants to have 1,000 listeners on TCP ports. But the cache_manager
> isn't able to open that many listeners. It crashes after 512 open
> listeners. It looks very much like the cache_manager doesn't read the
> worker_connections setting from nginx.conf.
>
> We configured:
>
> worker_connections 10000;
>
> there, but the cache_manager crashes with
>
> 2012/11/09 17:53:11 [alert] 9345#0: 512 worker_connections are not enough
> 2012/11/09 17:53:12 [alert] 9330#0: cache manager process 9344 exited
> with fatal code 2 and cannot be respawned
>
>
> I did some testing: Having 505 SSL-hosts on the Server (=505 listener
> sockets) everything's working fine, but 515 listener sockets aren't
> possible.
>
> It's easy to reproduce: Just define 515 ssl-domains having different
> TCP-ports for every domain. :-)
>
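A minimal sketch of that reproduction in Python. It only generates the server blocks. The base port 10000, the host names, and the certificate paths are placeholders made up for illustration, not values from the report. It also assumes the output is included inside an http block that defines a cache (for example via proxy_cache_path), since the cache manager only comes into play when a cache is configured.

```python
def make_server_blocks(count, base_port=10000,
                       cert="/etc/nginx/ssl/example.crt",
                       key="/etc/nginx/ssl/example.key"):
    """Return nginx server blocks, one SSL listener per TCP port.

    base_port, cert, key and the host names are hypothetical placeholders.
    """
    blocks = []
    for i in range(count):
        port = base_port + i  # a distinct TCP port for every domain
        blocks.append(
            "server {\n"
            f"    listen {port} ssl;\n"
            f"    server_name host{i}.example.com;\n"
            f"    ssl_certificate {cert};\n"
            f"    ssl_certificate_key {key};\n"
            "}\n"
        )
    return "\n".join(blocks)
```

Writing the result of make_server_blocks(505) and then make_server_blocks(515) to an included file should let you compare the two cases described above.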
> Looks like nobody had the idea before that "somebody" (TM) could run
> more than 2 times /24-network-IPs on one single host. In fact, this does
> not happen in normal life...
>
> But for historical reasons (TM) our customer uses ONE IP address and
> several TCP ports for that, so he doesn't have a problem running so many
> different SSL hosts on one system -- and this is the special situation
> where we can see the bug (?), that the cache_manager ignores the
> worker_connections setting (?) when it tries to open all the listeners
> and related cache files/sockets.
>
> So: Looks like a bug? Who can help? We need help...
>
>
> PROBLEM/BUG TWO:
>
> Having 16 workers for 1000 ssl-domains with 1000 listeners, we can see
> 16 * 1000 open TCP-listeners on that system, because every worker opens
> its own listeners (?). When we reach the magical barrier of 16386 open
> listeners (lsof -i | grep -c nginx), nginx runs into some kind of
> file limitation:
>
> 2012/11/09 20:32:05 [alert] 9933#0: socketpair() failed while spawning
> "worker process" (24: Too many open files)
> 2012/11/09 20:32:05 [alert] 9933#0: socketpair() failed while spawning
> "cache manager process" (24: Too many open files)
> 2012/11/09 20:32:05 [alert] 9933#0: socketpair() failed while spawning
> "cache loader process" (24: Too many open files)
>
> It's very easy to see that the limitation is based on 16,386 open files
> and sockets from nginx.
>
> But I can't find the place where this limitation comes from. "ulimit
> -n" is set to 100,000, everything looks fine and should work with
> many more open files than just 16K.
>
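One thing that may be worth checking, as a guess rather than a diagnosis: "ulimit -n" in a shell does not necessarily reflect the limit of an already running daemon. On Linux the effective limits of a process are listed in /proc/<pid>/limits. The sketch below parses the "Max open files" line from that file; the function names are made up for illustration. nginx also has a worker_rlimit_nofile directive for worker processes, but whether either of these explains the 16K barrier here is an assumption.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' line of
    /proc/<pid>/limits, as strings (a value may be 'unlimited').
    Returns None if the line is not present."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # fields: ['Max', 'open', 'files', soft, hard, 'files']
            return fields[3], fields[4]
    return None


def read_max_open_files(pid):
    """Read the limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_max_open_files(f.read())
```

Calling read_max_open_files() with the PID of the process that logs the socketpair() failure (9933 in the log above) would show the limit that process actually runs with.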
> Could it be that "nobody" (TM) expected that "somebody" (TM) runs more
> than 1000 ssl-hosts with different TCP-ports on 16 worker-instances and
> that there's some kind of SMALL-INT-problem in the nginx code? Could it
> be that this isn't a limitation from the linux system, but from some
> kind of too small address-space for that in nginx?
>
> So: Looks like a bug? Who can help? We need help...
> Peer
>
>
> --
> Heinlein Support GmbH

Are you looking for a commercial support option to back up your customer's contract with an underpinning contract and vendor support?

If that's the case, we've got our support options described here:

http://nginx.com/support.html

Hope this helps


> Schwedter Str. 8/9b, 10119 Berlin
>
> http://www.heinlein-support.de
>
> Tel: 030 / 405051-42
> Fax: 030 / 405051-19
>
> Mandatory disclosures per §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Managing director: Peer Heinlein -- Registered office: Berlin
>
> _______________________________________________
> nginx mailing list
> nginx@nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx
