Hello,
Using nginx-0.8.40 on Linux 2.6.32, I've encountered a problem with
nginx reloading, which leaves [nginx] processes in process state "H"
behind. As soon as the first of these processes is left behind, nginx
can no longer reload its workers or stop gracefully.
(This happens with 0.8.36 and 0.8.37 too.)
Sending other signals to the H-state processes has no effect, so the
only option is to SIGKILL them.
This behavior is very rare, so I tried to track it down with nginx
logging at log level debug. It finally happened on the 68th reload.
The system was completely idle during the tests.
Note that I first encountered this while using cron to reload the
nginx configuration every *5 minutes*. I'm not entirely sure whether
this is triggered by some kind of race condition, where the previous
nginx reload has not completely finished when the next one is issued
on top of it.
I doubt it, though: the debug log I've created shows the full process
list after each reload. AFAIK nginx changes some proctitles while a
reload is in progress, and I could no longer find any of those just
seconds after issuing the reload signal. In addition, the system was
idle (no requests...), so the old workers should exit almost
immediately.
The command used by cron to reload nginx is:
nginx -t && nginx -s reload
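To rule out two cron invocations overlapping, the command could be
wrapped in a lock. This is only a minimal sketch, assuming flock(1) from
util-linux is installed; the function name "guarded" and the lock path
are hypothetical and not part of my setup. Note that it only serializes
the reload commands themselves. It does not wait for old workers to
exit, because "nginx -s reload" returns as soon as the signal is sent.

```shell
#!/bin/bash
# Run a command only if nobody else holds the lock; otherwise skip it.
# "guarded" and the lock path below are hypothetical names for illustration.
guarded() {
    local lock=$1
    shift
    # -n: fail immediately instead of waiting when the lock is taken
    flock -n "$lock" "$@"
}

# Possible cron usage (lock path is an assumption):
#   guarded /var/lock/nginx-reload.lock sh -c 'nginx -t && nginx -s reload'
```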
Steps to reproduce it (after stopping Cron and nginx):
# Step 1: Start up nginx as usual, e.g. /etc/init.d/nginx start
# Step 2: Run the following script to issue the reload signal
# every 20 seconds:
------------------------------- snip ---------------------------------
#!/bin/bash
log=/var/log/nginx/error_log
echo "XXXXXXXXXX ------- SHOWING INITIAL NGINX PROCESSES ------- XXXXXXXXXX" >> "$log"
ps auxf 2>&1 | grep nginx >> "$log"
echo "XXXXXXXXXX ------- END OF INITIAL NGINX PROCESSES ------- XXXXXXXXXX" >> "$log"
i=1
while sleep 10
do
    echo "XXXXXXXXXX ------- RELOAD $i ------- XXXXXXXXXX" >> "$log"
    nginx -t && nginx -s reload
    sleep 10
    echo "XXXXXXXXXX ------- RELOAD $i complete - SHOWING NGINX PROCESSES ------- XXXXXXXXXX" >> "$log"
    ps auxf 2>&1 | grep nginx >> "$log"
    echo "XXXXXXXXXX ------- END OF NGINX PROCESSES ------- XXXXXXXXXX" >> "$log"
    ((i++))
done
------------------------------- snap --------------------------------
As far as I understand, the reload should be complete if the process
list appended to the error_log 10 seconds after reloading nginx does
not contain any proctitles like "nginx: reloading workers..".
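That check could also be automated. The following is a rough sketch,
assuming old workers carry the proctitle "nginx: worker process is
shutting down" while they drain; the function name is hypothetical. It
reads a process list on stdin and succeeds while such a title is still
present.

```shell
#!/bin/bash
# Exit status 0 while the process list on stdin still contains an nginx
# process that is shutting down, i.e. the reload has not finished yet.
# The [n] keeps the pattern from matching grep's own command line
# when the input comes from ps.
reload_in_progress() {
    grep -q 'nginx: .* is shutting dow[n]'
}

# Possible usage inside the loop of the script above:
#   if ps auxf | reload_in_progress; then
#       echo "reload $i still in progress" >> "$log"
#   fi
```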
# Step 3: watch -n1 ps auxf ... and wait until an nginx process in
# state "H" appears.
# Step 4: CTRL-C the debug script, stop nginx and SIGKILL all "H"-state
# processes.
# Step 5: Examine the error_log.
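For steps 3 and 4, the affected processes can be picked out of the
process list with a small filter. This is a sketch only: the function
name is hypothetical, and it assumes the input has the three columns
pid, stat and comm, as printed by "ps -eo pid=,stat=,comm=".

```shell
#!/bin/bash
# Print the PIDs of nginx processes whose STAT column starts with "H".
# Expects "pid stat comm" lines on stdin.
h_state_nginx() {
    awk '$2 ~ /^H/ && $3 ~ /nginx/ { print $1 }'
}

# Possible usage:
#   ps -eo pid=,stat=,comm= | h_state_nginx                      # step 3
#   ps -eo pid=,stat=,comm= | h_state_nginx | xargs -r kill -KILL  # step 4
```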
This debug log is about 21M in size, gzipped down to 1M. I've uploaded
it here to not pollute the mailing list:
http://biz.baze.de/files/nginx-debug.log.gz
The first appearance of an "H"-state process is after reload 68 on line
number 300630. From there on, the nginx reload is broken and more of
these processes accumulate.
To find the next marker, just search for "XXXXXXXXXX". Search for
" H " (leading and trailing space) to find the next process list with
such a process.
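Instead of searching by hand, the log can be scanned for these entries.
A minimal sketch, assuming each marker sits on a single line in the log
and the process lists have the "ps aux" column layout (STAT in field 8);
the function name is hypothetical.

```shell
#!/bin/bash
# Read the debug log on stdin and print "reload N pid P" for every
# process-list entry whose STAT column (field 8) starts with "H".
# Lines outside a process list (nginx debug output) are ignored.
find_h_entries() {
    awk '
        /^XXXXXXXXXX ------- RELOAD [0-9]+ complete/ { reload = $4; next }
        /^XXXXXXXXXX ------- END OF/                 { reload = "";  next }
        reload != "" && $8 ~ /^H/ { print "reload " reload " pid " $2 }
    '
}

# Possible usage with the uploaded file:
#   zcat nginx-debug.log.gz | find_h_entries | head
```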
For the record, here is the process list after reload 68, with the
broken process (pid 11488):
XXXXXXXXXX ------- RELOAD 68 complete - SHOWING NGINX PROCESSES ------- XXXXXXXXXX
root  30864 0.0 0.0  12780  1344 pts/9 S+ 13:27 0:00 \_ /bin/bash ./nginx-debug.sh
root  11894 0.0 0.0   9800   860 pts/9 S+ 13:50 0:00 \_ grep nginx
root  11895 0.0 0.0  12780   496 pts/9 S+ 13:50 0:00 \_ /bin/bash ./nginx-debug.sh
root  30776 0.1 0.0 101668 11332 ?     Ss 13:27 0:02 nginx: master process /usr/sbin/nginx -c /etc/nginx/n…
nginx 11488 0.0 0.0      0     0 ?     H  13:49 0:00 \_ [nginx]
nginx 11835 0.0 0.0 101668 11876 ?     S  13:50 0:00 \_ nginx: worker process
nginx 11836 0.0 0.0 101668 12096 ?     S  13:50 0:00 \_ nginx: worker process
nginx 11837 0.0 0.0 101668 11008 ?     S  13:50 0:00 \_ nginx: cache manager process
nginx 11838 0.0 0.0 101668 10988 ?     S  13:50 0:00 \_ nginx: cache loader process
XXXXXXXXXX ------- END OF NGINX PROCESSES ------- XXXXXXXXXX
Here is the configuration:
------------------------------- snip --------------------------------
user nginx nginx;
daemon on;
worker_processes 2;
events {
    worker_connections 5000;
    use epoll;
}
# [ debug | info | notice | warn | error | crit ]
error_log /var/log/nginx/error_log debug;
http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    # Server config
    server_tokens on;
    ignore_invalid_headers on;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    index index.html index.htm;
    # Autoindex
    autoindex off;
    autoindex_exact_size off;
    autoindex_localtime on;
    # GZIP configuration
    gzip on;
    gzip_types text/plain application/x-javascript text/xml text/css;
    gzip_disable "MSIE [1-6]\.(?!.*SV1)";
    # Rate Limiting
    limit_rate_after 20;
    limit_rate 2m;
    # File Cache
    open_file_cache max=2000 inactive=60s;
    open_file_cache_valid 60s;
    open_file_cache_errors on;
    # Proxy Setup
    proxy_redirect off;
    server_name_in_redirect off;
    proxy_max_temp_file_size 16m;
    proxy_intercept_errors off;
    client_max_body_size 10m;
    client_body_buffer_size 128k;
    # Proxy Cache
    proxy_buffering on;
    proxy_buffers 32 4k;
    proxy_cache_path /var/cache/nginx
                     levels=2:2:2
                     keys_zone=cache:64m
                     inactive=600
                     max_size=4096m;
    proxy_cache_key "$host$request_uri";
    proxy_cache_valid 200 301 302 404 1m;
    # Proxy Headers
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    # TIMEOUTS
    client_header_timeout 30;
    client_body_timeout 30;
    send_timeout 30;
    proxy_connect_timeout 10;
    proxy_send_timeout 60;
    proxy_read_timeout 60;
    keepalive_timeout 0;
    # DEFAULT SERVER
    server {
        listen 80 default;
        server_name _;
        index index.html;
        root /var/www;
        open_file_cache off;
    }
    include /etc/nginx/upstreams-php.d/*;
    include /etc/nginx/upstreams-wsgi.d/*;
    include /etc/nginx/vhosts.d/*;
}
------------------------------- snap --------------------------------
The included files are upstream{} and server{} blocks with valid
syntax. They do not enable/configure any additional module.
$ nginx -V
nginx version: nginx/0.8.40
configure arguments: --prefix=/usr --sbin-path=/usr/sbin/nginx
--conf-path=/etc/nginx/nginx.conf
--error-log-path=/var/log/nginx/error_log --pid-path=/var/run/nginx.pid
--lock-path=/var/lock/nginx.lock --user=nginx --group=nginx
--with-cc-opt=-I/usr/include --with-ld-opt=-L/usr/lib
--http-log-path=/var/log/nginx/access_log
--http-client-body-temp-path=/var/tmp/nginx/client
--http-proxy-temp-path=/var/tmp/nginx/proxy
--http-fastcgi-temp-path=/var/tmp/nginx/fastcgi
--with-http_realip_module --without-http_charset_module
--without-http_ssi_module --without-http_userid_module
--without-http_geo_module --without-http_map_module
--without-http_split_clients_module --without-http_referer_module
--without-http_fastcgi_module --without-http_memcached_module
--without-http_empty_gif_module --without-http_browser_module
--without-http_uwsgi_module --with-debug
Let me know if you need anything else.
Best regards,
John
_______________________________________________
nginx mailing list
nginx@nginx.org
http://nginx.org/mailman/listinfo/nginx