I'm having what appears to be a fairly common problem with PHP-FPM. Every now and again, I get a string of errors in my log like this...
recv() failed (104: Connection reset by peer) while reading response header from upstream, client:
recv() failed (104: Connection reset by peer) while reading response header from upstream, client:
recv() failed (104: Connection reset by peer) while reading response header from upstream, client:
etc
I found this on the forums from a few months back:
[i]2. The typical problem we have encountered when PHP pages suddenly stop processing is that either all the forked children are busy with long-running (unintended) scripts, since the built-in max_execution_time doesn't always work as expected (if at all), or they have simply hung, so the master process has no free children to assign to the incoming request.
That's why you:
- spawn more than just a few children. While the typical approach is to go by CPU core count, we have found that a multiplier of 3-4x works better, as PHP code usually spends more time waiting on external resources (DBs etc.) than processing code
- use the great features of php-fpm to monitor which scripts take too long to execute and kill those that are taking too long.
Like we use:
<value name="request_slowlog_timeout">30s</value>
<value name="request_terminate_timeout">60s</value>
Which means that requests taking more than 30 seconds to compute will be logged (backtraced) and those taking longer than a minute will be killed by force.[/i]
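To put a number on the sizing rule from that quote, here is a rough sketch of the arithmetic in Python. The function name suggested_children and the default multiplier of 4 are my own placeholders, not anything from php-fpm, and it ignores memory limits, which would also cap how many children the box can hold.

```python
import os


def suggested_children(cpu_cores: int, multiplier: int = 4) -> int:
    """Worker count from the quoted rule of thumb: CPU cores times a multiplier.

    Hypothetical helper for illustration only; it is not part of php-fpm.
    """
    if cpu_cores < 1 or multiplier < 1:
        raise ValueError("cpu_cores and multiplier must be positive")
    return cpu_cores * multiplier


if __name__ == "__main__":
    # os.cpu_count() can return None, so fall back to a single core.
    cores = os.cpu_count() or 1
    low = suggested_children(cores, 3)
    high = suggested_children(cores, 4)
    print(f"{cores} cores: try somewhere between {low} and {high} children")
```

So on a 4-core machine the quoted 3-4x rule would suggest trying 12 to 16 children as a starting point.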
I figured this would be the first place to start. How do I spawn more children for PHP?
I would like to track down the offending script as well. How do I use the <value name="request_slowlog_timeout"> settings mentioned above?
Thanks
Flash