FPM - fails high / low loads

JustaNginx

FPM - fails high / low loads
June 21, 2013 12:41PM


I've got a server:
CentOS 6.4 32-bit
16 GB RAM
300 GB HD
CPU: 8 cores, 2.67 GHz

The problem shows up when I load test with ab. I've tried two runs. The first was

ab -v -k -r -n 5000 -c 200 http://172.16.33.46/?ip

which results in this:

Server Software: nginx
Server Hostname: 172.16.33.46
Server Port: 80

Document Path: /?ip
Document Length: 873 bytes

Concurrency Level: 200
Time taken for tests: 13.386 seconds
Complete requests: 5000
Failed requests: 466
(Connect: 0, Receive: 0, Length: 466, Exceptions: 0)
Write errors: 0
Total transferred: 5069485 bytes
HTML transferred: 4364485 bytes
Requests per second: 373.54 [#/sec] (mean)
Time per request: 535.423 [ms] (mean)
Time per request: 2.677 [ms] (mean, across all concurrent requests)
Transfer rate: 369.85 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   44   26.9     55     112
Processing:     1  135  421.3     63   10374
Waiting:        1  119  423.2     41   10374
Total:          2  179  420.6    122   10384

Percentage of the requests served within a certain time (ms)
50% 122
66% 128
75% 133
80% 139
90% 167
95% 1011
98% 1127
99% 1324
100% 10384 (longest request)

and this in the nginx error log:

2013/06/21 12:26:18 [error] 2181#0: *1093083 upstream prematurely closed connection while reading response header from upstream, client: 172.16.33.235, server: 127.0.0.1, request: "GET /?ip HTTP/1.0", upstream: "fastcgi://127.0.0.1:9000", host: "172.16.33.46"
2013/06/21 12:26:18 [error] 2181#0: *1092993 upstream prematurely closed connection while reading response header from upstream, client: 172.16.33.235, server: 127.0.0.1, request: "GET /?ip HTTP/1.0", upstream: "fastcgi://127.0.0.1:9000", host: "172.16.33.46"
2013/06/21 12:26:19 [error] 2185#0: *1093429 upstream prematurely closed connection while reading response header from upstream, client: 172.16.33.235, server: 127.0.0.1, request: "GET /?ip HTTP/1.0", upstream: "fastcgi://127.0.0.1:9000", host: "172.16.33.46"

and FPM doesn't really go above 4 or 5 active processes with my config (below).

I've also tried
ab -v -k -r -n 50000 -c 2000 http://172.16.33.46/?ip

During that run FPM ranges around 26-35 active processes with 55 or 15 or so idle ones.

Server Software: nginx
Server Hostname: 172.16.33.46
Server Port: 80

Document Path: /?ip
Document Length: 873 bytes

Concurrency Level: 2000
Time taken for tests: 111.666 seconds
Complete requests: 50000
Failed requests: 5512
(Connect: 0, Receive: 0, Length: 5512, Exceptions: 0)
Write errors: 0
Non-2xx responses: 1243
Total transferred: 50288584 bytes
HTML transferred: 43209227 bytes
Requests per second: 447.77 [#/sec] (mean)
Time per request: 4466.626 [ms] (mean)
Time per request: 2.233 [ms] (mean, across all concurrent requests)
Transfer rate: 439.79 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   40   91.3     12    1216
Processing:     1 1221 4763.5     27   95397
Waiting:        1 1186 4610.1     20   95397
Total:          2 1261 4770.0     45   95411

Percentage of the requests served within a certain time (ms)
50% 45
66% 116
75% 315
80% 1018
90% 1838
95% 6716
98% 17289
99% 25539
100% 95411 (longest request)
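To compare the two runs above, the headline figures can be recomputed from the raw counters ab prints. This is only a sketch: `ab_summary` is a hypothetical helper, not part of ab, and because ab rounds the elapsed time it displays, the recomputed throughput differs slightly from the reported one. Note also that every failure in both runs is a Length failure, which ab records whenever a response body differs in size from the first one. With a page that dumps $_SERVER, some of those may be harmless size differences rather than broken responses (an assumption; the thread doesn't settle it).

```python
def ab_summary(complete, failed, seconds, concurrency):
    """Recompute ab's headline figures from its raw counters."""
    rps = complete / seconds
    return {
        "failure_rate_pct": 100.0 * failed / complete,
        "requests_per_second": rps,
        # ab's first "Time per request" line: concurrency / throughput, in ms
        "ms_per_request": 1000.0 * concurrency / rps,
    }


# Counters copied from the two ab reports in this post.
run_200 = ab_summary(complete=5000, failed=466, seconds=13.386, concurrency=200)
run_2000 = ab_summary(complete=50000, failed=5512, seconds=111.666, concurrency=2000)

for name, run in (("c=200", run_200), ("c=2000", run_2000)):
    print(name, {key: round(value, 2) for key, value in run.items()})
```

The failure rate comes out at roughly 9.3% for the first run and 11.0% for the second, so raising concurrency tenfold barely changed the proportion of failed requests.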

and the error log fills up with these:
2013/06/21 12:30:20 [error] 2240#0: *1214504 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 172.16.33.235, server: 127.0.0.1, request: "GET /?ip HTTP/1.0", upstream: "fastcgi://127.0.0.1:9000", host: "172.16.33.46"
2013/06/21 12:30:20 [info] 2241#0: *1220114 client prematurely closed connection, so upstream connection is closed too while sending request to upstream, client: 172.16.33.235, server: 127.0.0.1, request: "GET /?ip HTTP/1.0", upstream: "fastcgi://127.0.0.1:9000", host: "172.16.33.46"
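When the log fills up like this, it can help to count how often each kind of message appears rather than reading it line by line. The following is a rough sketch, assuming the log lines follow the layout quoted above; `tally` and the regular expression are illustrative names, not an nginx tool, and the sample lines are shortened copies of the ones in this post.

```python
import re
from collections import Counter

# Shortened copies of the error-log lines quoted in this post.
SAMPLE = """\
2013/06/21 12:26:18 [error] 2181#0: *1093083 upstream prematurely closed connection while reading response header from upstream, client: 172.16.33.235
2013/06/21 12:30:20 [error] 2240#0: *1214504 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 172.16.33.235
2013/06/21 12:30:20 [info] 2241#0: *1220114 client prematurely closed connection, so upstream connection is closed too while sending request to upstream, client: 172.16.33.235
"""

# date, time, [level], pid#tid:, *connection-id, then the message up to ", client: "
LINE = re.compile(
    r"^\S+ \S+ \[(?P<level>\w+)\] \d+#\d+: \*\d+ (?P<msg>.*?)(?:, client: |$)"
)


def tally(text):
    """Count log messages by (level, message), ignoring the 'while ...' phase."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            message = match.group("msg").split(" while ")[0]
            counts[(match.group("level"), message)] += 1
    return counts


for (level, message), count in tally(SAMPLE).most_common():
    print(count, level, message)
```

On the real file this would be called as `tally(open("/var/log/nginx/error.log").read())`, using the error_log path from the config below.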

I've spent the past few hours googling and reading these forums trying to help myself, but with no success I figured I'd ask. I've tried switching between a Unix socket and TCP (IP and port), and the socket produces far more errors than TCP, which makes no sense to me: everywhere I've read says sockets are better.

Below are all my configs. (Oh, and the page I'm requesting is just PHP doing a <?print_r($_SERVER);?>)

## nginx config

user nobody nobody;
worker_processes 8;
worker_rlimit_nofile 131072;
pid /var/run/nginx.pid;

events {
    worker_connections 30000;
}
http {
    include /etc/nginx/conf/mime.types;
    default_type application/octet-stream;
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log off;
    error_log /var/log/nginx/error.log debug;

    #gzip on;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 60;
    types_hash_max_size 2048;
    server_tokens off;

    server {
        listen 80;
        server_name 127.0.0.1;

        location / {
            real_ip_header X-Forwarded-For;
            real_ip_recursive on;
            try_files $uri $uri/ /index.php;
            root /www;
            index index.html index.htm index.php;
        }
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root /etc/nginx/html;
        }
        location = /clear.gif {
            empty_gif;
        }
        location ~ [^/]\.php(/|$) {
            root /www;
            fastcgi_split_path_info ^(.+?\.php)(/.*)$;
            if (!-f $document_root$fastcgi_script_name) {
                return 404;
            }
            fastcgi_pass 127.0.0.1:9000;
            fastcgi_index index.php;
            include /etc/nginx/conf/fastcgi_params;
            fastcgi_connect_timeout 60;
            fastcgi_send_timeout 180;
            fastcgi_read_timeout 180;

            fastcgi_buffer_size 128k;
            fastcgi_buffers 256 16k;
            fastcgi_busy_buffers_size 256k;
            fastcgi_temp_file_write_size 256k;
            fastcgi_intercept_errors on;
            fastcgi_max_temp_file_size 0;
        }

        location /nginx_status {
            stub_status on;
            access_log off;
            allow 172.16.33.235;
            allow 172.16.5.230;
            deny all;
        }
        location ~ ^/(status|ping)$ {
            fastcgi_pass 127.0.0.1:9000;
            include /etc/nginx/conf/fastcgi_params;
            access_log off;
            allow 172.16.33.235;
            allow 172.16.5.230;
        }
    }
}

## PHP-FPM config

[global]
pid = run/php-fpm.pid
error_log = /var/log/nginx/fpm.log

[www]
listen = 127.0.0.1:9000
listen.allowed_clients = 127.0.0.1
listen.backlog = -1

user = nobody
group = nobody

pm.status_path = /status
ping.path = /ping

pm = dynamic
pm.max_children = 75
pm.start_servers = 25
pm.min_spare_servers = 15
pm.max_spare_servers = 75
pm.max_requests = 30000
request_terminate_timeout = 30
rlimit_files = 131072
rlimit_core = unlimited
catch_workers_output = yes
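With pm = dynamic the four pm.* numbers have to be consistent with each other, and it is easy to lose track of that while tuning. Below is a small sanity check, written as a sketch: `check_dynamic_pool` is a hypothetical helper, and the rules it encodes are my reading of the relationships PHP-FPM expects for a dynamic pool (spare limits within max_children, start_servers between the two spare limits), not code taken from PHP-FPM.

```python
def check_dynamic_pool(max_children, start_servers, min_spare, max_spare):
    """Return a list of problems with a pm = dynamic pool; empty means consistent."""
    problems = []
    if min_spare <= 0:
        problems.append("pm.min_spare_servers must be positive")
    if max_spare < min_spare:
        problems.append("pm.max_spare_servers is below pm.min_spare_servers")
    if max_spare > max_children:
        problems.append("pm.max_spare_servers exceeds pm.max_children")
    if not (min_spare <= start_servers <= max_spare):
        problems.append("pm.start_servers is outside the spare-server range")
    return problems


# The pool posted above: max_children 75, start 25, min spare 15, max spare 75.
print(check_dynamic_pool(75, 25, 15, 75))
# A deliberately inconsistent pool, for contrast.
print(check_dynamic_pool(50, 40, 5, 30))
```

The posted values pass this check, so the pool sizing itself does not look like the source of the failures.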

## sysctl.conf

net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 4294967295
kernel.shmall = 268435456

fs.file-max = 262144
kernel.pid_max = 262144
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 87380 8388608
net.ipv4.netfilter.ip_conntrack_max = 65536
net.core.rmem_max = 25165824
net.core.rmem_default = 25165824
net.core.wmem_max = 25165824
net.core.wmem_default = 131072
net.core.netdev_max_backlog = 8192
net.ipv4.tcp_window_scaling = 1
net.core.optmem_max = 25165824
net.core.somaxconn = 65536
net.ipv4.ip_local_port_range = 1024 65535
kernel.shmmax = 4294967296
vm.max_map_count = 262144

## limits.conf

* soft nofile 131072
* hard nofile 131072
* soft nproc 32000
* hard nproc 32000

* soft core unlimited

## ulimit -a

core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 120689
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 32000
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Any help would be wonderful. I'm still searching online, so if I find anything I'll post it here in case anyone else has the same problem. I've found numerous questions about this same problem, but never a solution.

Thanks.


JustaNginx

Re: FPM - fails high / low loads
June 21, 2013 01:33PM


Update:

After changing to
pm.max_requests = 500

I was able to get the socket to work the same as the IP. I thought this setting meant something else, but it means each worker respawns after serving 500 requests.

I tried 0 as well, but it gives the same result as 500.


JustaNginx

Re: FPM - fails high / low loads
June 21, 2013 01:45PM


Scratch this, I was editing the wrong server. The change quoted below doesn't work: I'm still getting the same problem, with sockets being worse and IP being less bad. The failures start about 10 seconds after ab starts.

JustaNginx Wrote:
-------------------------------------------------------
> update..
>
> changing to
> pm.max_requests = 500
>
> I was able to get the socket to work the same as the IP.. I thought
> this meant something else.. but it means after 500 requests it'll
> respawn..
>
> Tried doing 0 but gives same result as 500.


JustaNginx

Re: FPM - fails high / low loads
June 24, 2013 12:55PM


After changing somaxconn to 10240 I was able to get the socket to work. I changed a few other things as well; see below for reference. Hope this helps others out there.

Nginx config:
fastcgi_pass 127.0.0.1:9000;
is now
fastcgi_pass unix:/var/run/php5-fpm.sock;

I also changed all the timeouts to 20:
fastcgi_connect_timeout 20;
fastcgi_send_timeout 20;
fastcgi_read_timeout 20;

FPM config:
listen = 127.0.0.1:9000
listen.allowed_clients = 127.0.0.1
listen.backlog = -1
is now
listen = /var/run/php5-fpm.sock
listen.allowed_clients = 127.0.0.1
listen.backlog = 10240

and then I changed one item in sysctl.conf:
net.core.somaxconn = 10240
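Why would lowering somaxconn from 65536 to 10240 help? The thread doesn't say, so what follows is a possible explanation, not a confirmed diagnosis. The kernel caps the backlog passed to listen() at net.core.somaxconn, and a negative backlog such as listen.backlog = -1 is treated as "as large as allowed". Older Linux kernels are reported to store the resulting limit in a 16-bit field, in which case 65536 would wrap around to 0. The sketch below models that arithmetic; `effective_backlog` and its `sixteen_bit_field` flag are hypothetical names for illustration, not a kernel or PHP-FPM interface.

```python
USHRT_MAX = 0xFFFF


def effective_backlog(requested, somaxconn, sixteen_bit_field=False):
    """Model the accept-queue limit the kernel would apply for listen(fd, requested)."""
    # The backlog is compared as an unsigned value, so -1 behaves like "maximum",
    # and anything above somaxconn is cut down to somaxconn.
    if requested < 0 or requested > somaxconn:
        requested = somaxconn
    if sixteen_bit_field:
        # ASSUMPTION: the limit is stored in an unsigned short, as reported
        # for older kernels; values above 65535 wrap around.
        requested &= USHRT_MAX
    return requested


# Original settings: listen.backlog = -1, net.core.somaxconn = 65536
print(effective_backlog(-1, 65536, sixteen_bit_field=True))
# New settings: listen.backlog = 10240, net.core.somaxconn = 10240
print(effective_backlog(10240, 10240, sixteen_bit_field=True))
```

If that assumption holds for this kernel, the original settings would have left FPM with almost no accept queue, which would fit the resets seen under load. The value the running kernel actually uses can be read from /proc/sys/net/core/somaxconn.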

My full PHP-FPM config is now:
[global]
pid = run/php-fpm.pid
error_log = /var/log/nginx/fpm.log

[www]
listen = /var/run/php5-fpm.sock
listen.allowed_clients = 127.0.0.1
listen.backlog = 10240

user = nobody
group = nobody

pm.status_path = /status
ping.path = /ping

pm = dynamic
pm.max_children = 150
pm.start_servers = 20
pm.min_spare_servers = 10
pm.max_spare_servers = 30
pm.max_requests = 15000
request_terminate_timeout = 20s
rlimit_files = 131072
rlimit_core = unlimited
catch_workers_output = no

request_slowlog_timeout = 5s
slowlog = /var/log/nginx/fpm_slow.log

Running
ab -r -k -n 100000 -c 5000 http://172.16.33.46/?ip

I can get a 100% success rate. Now on to figuring out how to get a 100% success rate on the connections to Mongo, as I'm getting:

[24-Jun-2013 12:53:14] [pool www] pid 9269
script_filename = /www/index.php
[0xb778e8d4] __construct() /www/index.php:9
[0xb778dd30] mdb() /www/index.php:52
[0xb778d8a0] +++ dump failed

[24-Jun-2013 12:49:56] [pool www] pid 8836
script_filename = /www/index.php
[0xb778dd30] update() /www/index.php:57
[0xb778d8a0] +++ dump failed

On 100K requests I max out all 150 processes, and 350-400 requests fail with the above errors, which mostly come from the connection to Mongo in PHP.


JustaNginx

Re: FPM - fails high / low loads
June 24, 2013 02:53PM


Changing the PHP-FPM config to the following lets me handle 4000-8000 concurrent connections with a 0% failure rate across numerous ab tests. It is now in production, running on 2 boxes side by side and accepting requests perfectly. It's only about 30% ramped up due to proxy servers, but under ab it was very stable. If anything changes I'll report back in this thread, but all seems good thus far.

[global]
pid = run/php-fpm.pid
error_log = /var/log/nginx/fpm.log

[www]
listen = /var/run/php5-fpm.sock
listen.allowed_clients = 127.0.0.1
listen.backlog = 10240

user = nobody
group = nobody

pm.status_path = /status
ping.path = /ping

pm = dynamic
pm.max_children = 50
pm.start_servers = 10
pm.min_spare_servers = 5
pm.max_spare_servers = 30
pm.max_requests = 0
request_terminate_timeout = 15s
rlimit_files = 10240
rlimit_core = unlimited
catch_workers_output = no

request_slowlog_timeout = 10s
slowlog = /var/log/nginx/fpm_slow.log
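Since pm.status_path = /status is enabled in this config, the FPM status page can be used to watch how close the pool gets to pm.max_children during a test. The sketch below parses the JSON form of that page (requested by appending ?json to the status URL). The field names are the ones the status page reports as far as I know; `pool_pressure` is a hypothetical helper and the sample values are made up for illustration, not taken from this server.

```python
import json

# Made-up sample in the shape of the FPM status page's JSON output.
SAMPLE_STATUS = json.dumps({
    "pool": "www",
    "listen queue": 0,
    "max listen queue": 37,
    "listen queue len": 10240,
    "idle processes": 8,
    "active processes": 42,
    "total processes": 50,
    "max children reached": 3,
})


def pool_pressure(status_json, max_children):
    """Summarise how close a pool is to its pm.max_children limit."""
    status = json.loads(status_json)
    return {
        "utilisation": status["active processes"] / max_children,
        "queued_now": status["listen queue"],
        "queued_peak": status["max listen queue"],
        "hit_max_children": status["max children reached"] > 0,
    }


print(pool_pressure(SAMPLE_STATUS, max_children=50))
```

A non-zero queue or a positive "max children reached" during an ab run would suggest that requests are waiting on workers rather than on nginx.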

