(note - this is reposting of - http://forum.nginx.org/read.php?11,127694)
Hi All,
After debugging for 6-hours - I am giving this up :|
We have a nginx+php-fpm+mysql in LAN with almost 100 wordpress (created and used by different designers/developers all working on test wordpres setup)
We are using nginx without any issues from long.
Today, all of a sudden - nginx started returning [b]"504 Gateway Time-out"[/b] out of the blue...
I checked nginx error log for a virtual host...
[code]
2010/09/06 21:24:24 [error] 12909#0: *349 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 21:25:11 [error] 12909#0: *349 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 21:25:11 [error] 12909#0: *443 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /info.php HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 21:25:12 [error] 12909#0: *443 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:08:32 [error] 12909#0: *1025 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:09:33 [error] 12909#0: *1025 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:09:40 [error] 12909#0: *1064 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /info.php HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:09:40 [error] 12909#0: *1064 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:24:44 [error] 12909#0: *1313 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
2010/09/06 22:24:53 [error] 12909#0: *1313 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.0.1, server: rahul286.rtcamp.info, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "rahul286.rtcamp.info"
[/code]
As I run php-fpm on port 9000 via TCP mode, I ran "netstat | grep 9000" and noticed something unusual...
[code]
tcp 1321 0 localhost:9000 localhost:46643 CLOSE_WAIT -
tcp 9 0 localhost:9000 localhost:36087 CLOSE_WAIT 14272/php5-fpm
tcp 1257 0 localhost:9000 localhost:46664 CLOSE_WAIT -
tcp 0 0 localhost:46666 localhost:9000 ESTABLISHED 12909/nginx: worker
tcp 9 0 localhost:9000 localhost:36100 CLOSE_WAIT 14276/php5-fpm
tcp 1321 0 localhost:9000 localhost:46629 CLOSE_WAIT -
tcp 9 0 localhost:9000 localhost:36106 CLOSE_WAIT 14275/php5-fpm
tcp 1257 0 localhost:9000 localhost:46656 CLOSE_WAIT -
tcp 9 0 localhost:9000 localhost:36091 CLOSE_WAIT 14277/php5-fpm
tcp 1321 0 localhost:9000 localhost:46625 CLOSE_WAIT -
tcp 1257 0 localhost:9000 localhost:46658 CLOSE_WAIT -
tcp 1321 0 localhost:9000 localhost:46641 CLOSE_WAIT -
tcp 0 0 localhost:46682 localhost:9000 ESTABLISHED 12909/nginx: worker
tcp 1321 0 localhost:9000 localhost:46633 CLOSE_WAIT -
tcp 1257 0 localhost:9000 localhost:36133 CLOSE_WAIT -
tcp 1336 0 localhost:9000 localhost:46676 ESTABLISHED -
tcp 1257 0 localhost:9000 localhost:36131 CLOSE_WAIT -
tcp 1257 0 localhost:9000 localhost:46660 CLOSE_WAIT -
tcp 1249 0 localhost:9000 localhost:36117 CLOSE_WAIT -
tcp 9 0 localhost:9000 localhost:36094 CLOSE_WAIT 14269/php5-fpm
tcp 0 0 localhost:46664 localhost:9000 FIN_WAIT2 -
tcp 1257 0 localhost:9000 localhost:36135 CLOSE_WAIT -
tcp 1257 0 localhost:9000 localhost:36125 CLOSE_WAIT -
tcp 9 0 localhost:9000 localhost:36102 CLOSE_WAIT 14268/php5-fpm
tcp 0 0 localhost:46662 localhost:9000 FIN_WAIT2 -
tcp 745 0 localhost:9000 localhost:46644 CLOSE_WAIT -
tcp 0 0 localhost:46658 localhost:9000 FIN_WAIT2 -
tcp 1265 0 localhost:9000 localhost:46607 CLOSE_WAIT -
tcp 0 0 localhost:46672 localhost:9000 ESTABLISHED 12909/nginx: worker
tcp 1257 0 localhost:9000 localhost:36119 CLOSE_WAIT -
tcp 1265 0 localhost:9000 localhost:46613 CLOSE_WAIT -
tcp 0 0 localhost:46646 localhost:9000 FIN_WAIT2 -
tcp 1257 0 localhost:9000 localhost:36137 CLOSE_WAIT -
tcp 0 0 localhost:46670 localhost:9000 ESTABLISHED 12909/nginx: worker
tcp 1265 0 localhost:9000 localhost:46619 CLOSE_WAIT -
tcp 1336 0 localhost:9000 localhost:46668 ESTABLISHED -
tcp 0 0 localhost:46648 localhost:9000 FIN_WAIT2 -
tcp 1336 0 localhost:9000 localhost:46670 ESTABLISHED -
tcp 9 0 localhost:9000 localhost:36108 CLOSE_WAIT 14274/php5-fpm
tcp 1336 0 localhost:9000 localhost:46684 ESTABLISHED -
tcp 0 0 localhost:46674 localhost:9000 ESTABLISHED 12909/nginx: worker
tcp 1336 0 localhost:9000 localhost:46666 ESTABLISHED -
tcp 1257 0 localhost:9000 localhost:46648 CLOSE_WAIT -
tcp 1336 0 localhost:9000 localhost:46678 ESTABLISHED -
tcp 0 0 localhost:46668 localhost:9000 ESTABLISHED 12909/nginx: worker
tcp 1321 0 localhost:9000 localhost:46635 CLOSE_WAIT -
tcp 0 0 localhost:46676 localhost:9000 ESTABLISHED 12909/nginx: worker
tcp 1336 0 localhost:9000 localhost:46682 ESTABLISHED -
tcp 1257 0 localhost:9000 localhost:36127 CLOSE_WAIT -
tcp 1265 0 localhost:9000 localhost:46617 CLOSE_WAIT -
tcp 0 0 localhost:46684 localhost:9000 ESTABLISHED -
tcp 1265 0 localhost:9000 localhost:46621 CLOSE_WAIT -
tcp 1265 0 localhost:9000 localhost:46623 CLOSE_WAIT -
tcp 1257 0 localhost:9000 localhost:46652 CLOSE_WAIT -
tcp 1321 0 localhost:9000 localhost:46627 CLOSE_WAIT -
tcp 1257 0 localhost:9000 localhost:46654 CLOSE_WAIT -
tcp 1265 0 localhost:9000 localhost:46611 CLOSE_WAIT -
tcp 0 0 localhost:46654 localhost:9000 FIN_WAIT2 -
tcp 0 0 localhost:46644 localhost:9000 FIN_WAIT2 -
tcp 0 0 localhost:46678 localhost:9000 ESTABLISHED 12909/nginx: worker
tcp 1321 0 localhost:9000 localhost:46637 CLOSE_WAIT -
tcp 1257 0 localhost:9000 localhost:36121 CLOSE_WAIT -
tcp 1265 0 localhost:9000 localhost:46605 CLOSE_WAIT -
tcp 1265 0 localhost:9000 localhost:46609 CLOSE_WAIT -
tcp 1336 0 localhost:9000 localhost:46680 ESTABLISHED -
tcp 1257 0 localhost:9000 localhost:46662 CLOSE_WAIT -
tcp 1321 0 localhost:9000 localhost:46631 CLOSE_WAIT -
tcp 0 0 localhost:46652 localhost:9000 FIN_WAIT2 -
tcp 0 0 localhost:46660 localhost:9000 FIN_WAIT2 -
tcp 1249 0 localhost:9000 localhost:36114 CLOSE_WAIT -
tcp 1336 0 localhost:9000 localhost:46672 ESTABLISHED -
tcp 1257 0 localhost:9000 localhost:36123 CLOSE_WAIT -
tcp 9 0 localhost:9000 localhost:36113 CLOSE_WAIT 14273/php5-fpm
tcp 1321 0 localhost:9000 localhost:46639 CLOSE_WAIT -
tcp 1265 0 localhost:9000 localhost:46615 CLOSE_WAIT -
tcp 1336 0 localhost:9000 localhost:46674 ESTABLISHED -
tcp 9 0 localhost:9000 localhost:36085 CLOSE_WAIT 14270/php5-fpm
tcp 0 0 localhost:46656 localhost:9000 FIN_WAIT2 -
tcp 0 0 localhost:46650 localhost:9000 FIN_WAIT2 -
tcp 9 0 localhost:9000 localhost:36097 CLOSE_WAIT 14271/php5-fpm
tcp 1257 0 localhost:9000 localhost:36129 CLOSE_WAIT -
tcp 1257 0 localhost:9000 localhost:46646 CLOSE_WAIT -
tcp 1257 0 localhost:9000 localhost:46650 CLOSE_WAIT -
tcp 0 0 localhost:46680 localhost:9000 ESTABLISHED 12909/nginx: worker
[/code]
There are plenty of "CLOSE_WAIT" and kind of pairs as like below:
[code]
tcp 1337 0 localhost:9000 localhost:46680 CLOSE_WAIT -
tcp 0 0 localhost:46680 localhost:9000 FIN_WAIT2 -
[/code]
Please note port [b]46680[/b] in above.
I enabled mysql slow queries error log, but it didn't work.
As of now restarting php5-fpm every minute via a cronjob (see command below) keeping everything running "smoothly" but I hate patchwork and want to solve this...
[code]
1 * * * * service php5-fpm restart > /dev/null
[/code]
I searched extensively on Google - got no help.
As mentioned, this a test-server in LAN, CPU load is never crossed 0.10 and memory usage is also below 25% (System has 2GB RAM and ubuntu-server installed)
So if you find its time-confusing to help me out, please atleast drop a hint.
Thanks in advance for help.
-Rahul
--
Rahul Bansal
EasyEngine - https://github.com/rtCamp/easyengine