I've seen this as well. If my DNS server becomes unavailable, even
temporarily, at least one nginx worker will hang, using 100% CPU
and never responding to requests again. It used to be that my DNS
server would be automatically restarted, and the problem would not
necessarily affect all workers per incident: if I have 8 workers,
maybe 1 or 2 would permanently lock up when the DNS server fails. The
rest would continue processing requests once DNS became available, and
each hung worker would spin one core at 100% CPU until I restarted
nginx.
It would be handy if this were solved. It would also make sense to be
able to specify more than one resolver for nginx to use, or for nginx
to default to the resolvers listed in /etc/resolv.conf.
Since we're on the subject, the same behavior seems to happen during
an upstream failure. The nginx workers are running fine, then an upstream
such as apache on the same machine temporarily fails or becomes unresponsive,
and nginx stops working properly even after apache rights itself. In this
case, the nginx workers will either not respond to requests at all (sending
a blank page) or will respond with 502 Bad Gateway, even though connecting
to the gateway directly on its own port works fine. Only restarting
nginx solves the issue.
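
A minimal sketch of that direct check, in Python, for confirming that the
backend is reachable while nginx is still returning errors. The function
name is made up for illustration, and the host and port in the usage note
are assumptions rather than values from this setup. It only tests that a
TCP connection can be opened, not that the backend answers HTTP correctly.

```python
import socket


def backend_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when the backend cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, backend_accepts_connections("127.0.0.1", 8080) would check a
backend assumed to listen on port 8080 of the same machine. If it returns
True while nginx keeps failing, the problem is more likely on the nginx side.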
Nginx has definitely been great for me, and if these two things didn't
tend to happen from time to time, I would consider it bulletproof.
Thanks,
Gabe
On Thu, Oct 22, 2009 at 11:22 AM, Maxim Dounin <mdounin@mdounin.ru> wrote:
> Hello!
>
> On Thu, Oct 22, 2009 at 02:02:14PM -0400, masom wrote:
>
>> Hi,
>>
>> I am currently planning to use nginx on several thousand devices as a reverse-proxy caching system.
>>
>> It currently works as expected (thanks Igor!), caching files as they are requested by the devices.
>>
>>
>> The only problem we hit is when nginx starts before the DNS system is available on the units. Nginx will crash, saying it is unable to connect to the remote host being proxied.
>>
>>
>> 1739#0: host not found in upstream "content.dev.local" in /usr/local/nginx/conf/nginx.conf 33
>
> Crashing and refusing to start are quite different things. As you
> have no DNS available during start, nginx just can't proceed any
> further, since it doesn't know what your config means. Once
> started it won't depend on DNS anymore.
>
> To avoid such issues on start there are two basic options:
>
> 1. Use ip addresses in config instead of host names.
>
> 2. Make sure your OS resolving subsystem always returns meaningful
> results to nginx - either by launching nginx once DNS is available
> or by adding relevant entries to /etc/hosts.
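
The "launch nginx once DNS is available" variant of option 2 could be
sketched as a small Python wrapper run at boot. This is only an
illustration: the function names, the timeout and interval values, and
the nginx binary path (guessed from the config path in the error message
above) are all assumptions.

```python
import socket
import subprocess
import time


def wait_for_dns(hostname, timeout=60.0, interval=2.0,
                 resolve=socket.getaddrinfo, sleep=time.sleep,
                 clock=time.monotonic):
    """Return True once hostname resolves, False if timeout seconds pass."""
    deadline = clock() + timeout
    while True:
        try:
            resolve(hostname, None)
            return True
        except socket.gaierror:
            # Name not resolvable yet: give up at the deadline, else retry.
            if clock() >= deadline:
                return False
            sleep(interval)


def start_nginx_when_resolvable(hostname,
                                nginx_cmd=("/usr/local/nginx/sbin/nginx",)):
    """Start nginx only after hostname resolves (binary path is assumed)."""
    if not wait_for_dns(hostname):
        raise RuntimeError("DNS still unavailable, not starting nginx")
    subprocess.check_call(list(nginx_cmd))
```

Calling start_nginx_when_resolvable("content.dev.local") from the boot
sequence instead of starting nginx directly would delay startup until the
upstream name resolves, so the "host not found in upstream" error at
config load should not occur.
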
>
> Maxim Dounin
>
>>
>>
>> Starting nginx again works (as the DNS is now responding properly).
>>
>>
>> Any idea on how to work around this, or should I file a bug report? (Nginx shouldn't crash when the remote is not available, but should try to access it on requests.)
>>
>> Nginx does not die when the remote drops and comes back (by pulling the network cable, for example). It only crashes when nginx is launched and the DNS system is not yet available.
>>
>> Posted at Nginx Forum: http://forum.nginx.org/read.php?2,15995,15995#msg-15995
>>
>>
>
>