Re: high "Load Average"

March 16, 2010 11:53AM
Stefan Parvu Wrote:
-------------------------------------------------------
> I meant nicstat. Works on Linux, Solaris.
> Probable worth of considering porting this to FreeBSD too.

Here are the latest results from the netstat and nicstat tools:
[code]
netstat -ant | grep ESTABLISHED | egrep ':80\>' | wc -l
1157
[/code]

So 1157 HTTP sessions are established (some of them may just be kept alive).

nicstat output:
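The same count can be reproduced in a script, which makes it easier to log over time. This is a minimal sketch, assuming the usual Linux `netstat -ant` column layout (proto, recv-q, send-q, local address, foreign address, state). The function name `count_established` and the sample addresses are made up for illustration. Note that it matches only the local port, while the grep pattern above also matches a remote port 80.

```python
def count_established(netstat_lines, port=80):
    """Count ESTABLISHED TCP connections whose local port equals `port`."""
    count = 0
    for line in netstat_lines:
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, state = fields[3], fields[5]
        # rsplit copes with IPv6-style local addresses such as :::80
        local_port = local.rsplit(":", 1)[-1]
        if state == "ESTABLISHED" and local_port == str(port):
            count += 1
    return count


# Hypothetical sample lines (documentation addresses, not real traffic)
SAMPLE = """\
tcp        0      0 192.0.2.10:80      198.51.100.7:51234   ESTABLISHED
tcp        0      0 192.0.2.10:80      198.51.100.8:40022   TIME_WAIT
tcp        0      0 192.0.2.10:8080    198.51.100.9:40100   ESTABLISHED
tcp        0      0 192.0.2.10:22      198.51.100.9:40101   ESTABLISHED
""".splitlines()

if __name__ == "__main__":
    print(count_established(SAMPLE))
```

On a live system the lines could come from `subprocess.run(["netstat", "-ant"], capture_output=True, text=True).stdout.splitlines()`.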
[code]
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:02 lo 0.33 0.33 290.8 290.8 1.16 1.16 0.00 0.00
10:24:02 eth0 0.29 0.79 1183.8 1448.1 0.25 0.56 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:03 lo 153.7 153.7 338.6 338.6 464.7 464.7 0.00 0.00
10:24:03 eth0 1070.9 768.7 3599.4 3987.9 304.7 197.4 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:04 lo 65.22 65.22 191.9 191.9 348.1 348.1 0.00 0.00
10:24:04 eth0 932.5 691.3 3250.8 3437.7 293.7 205.9 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:05 lo 522.5 522.5 687.5 687.5 778.2 778.2 0.00 0.00
10:24:05 eth0 1075.6 988.8 3758.0 4308.4 293.1 235.0 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:06 lo 235.0 235.0 551.9 551.9 436.0 436.0 0.00 0.00
10:24:06 eth0 855.4 857.6 3476.3 3761.2 252.0 233.5 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:07 lo 135.3 135.3 351.9 351.9 393.6 393.6 0.00 0.00
10:24:07 eth0 1021.0 1025.4 3970.6 4591.4 263.3 228.7 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:08 lo 168.1 168.1 492.3 492.3 349.6 349.6 0.00 0.00
10:24:08 eth0 793.6 864.8 3340.2 3875.5 243.3 228.5 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:09 lo 32.88 32.88 78.97 78.97 426.4 426.4 0.00 0.00
10:24:09 eth0 551.1 462.1 2416.2 2461.1 233.6 192.3 0.00 0.00
[/code]
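To compare samples more easily, the per-second rows can be averaged per interface. This is only a sketch, assuming the column order shown above (Time, Int, rKB/s, wKB/s, ...); the function name `average_throughput` is hypothetical and the sample rows are copied from the 10:24:03 and 10:24:04 samples.

```python
from collections import defaultdict


def average_throughput(nicstat_lines):
    """Return {interface: (mean rKB/s, mean wKB/s)} from nicstat text rows."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for line in nicstat_lines:
        fields = line.split()
        # Skip the repeated header rows and anything too short to be data
        if len(fields) < 4 or fields[0] == "Time":
            continue
        acc = sums[fields[1]]
        acc[0] += float(fields[2])  # rKB/s
        acc[1] += float(fields[3])  # wKB/s
        acc[2] += 1                 # number of samples seen
    return {iface: (r / n, w / n) for iface, (r, w, n) in sums.items()}


SAMPLE = """\
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:03 lo 153.7 153.7 338.6 338.6 464.7 464.7 0.00 0.00
10:24:03 eth0 1070.9 768.7 3599.4 3987.9 304.7 197.4 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:04 lo 65.22 65.22 191.9 191.9 348.1 348.1 0.00 0.00
10:24:04 eth0 932.5 691.3 3250.8 3437.7 293.7 205.9 0.00 0.00
""".splitlines()

if __name__ == "__main__":
    for iface, (rx, tx) in average_throughput(SAMPLE).items():
        print(f"{iface}: read {rx:.1f} KB/s, write {tx:.1f} KB/s")
```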

> Good is to have latest patches applied etc. To me
> this is a good candidate for DTrace (Solaris) or SystemTap (linux).
I'm getting acquainted with SystemTap; it doesn't look like a "5 minutes to learn" system.

Cliff Wells Wrote:
-------------------------------------------------------
I've grouped my answers to your questions by theme.

> Well, all it means is that the bottleneck lies outside PHP.
> I wouldn't read too much else into it.
> It just means that 99% of your slowness is elsewhere.

> I cannot possibly fathom how the length of T doesn't matter.
> To me that is probably the single most important data point to investigate.
> The entire response time directly depends on T and you say it "doesn't matter"?

> Why do you think machine A is slow?
> Machine A waits for machine B and you blame machine A.
> I don't see how you arrive at this conclusion.
> To me it seems it could be either machine (or the connection between the machines)
> and more testing needs to be done to arrive at such a conclusion.

Here is a combined answer to all of these questions. It looks like we are not talking about the same thing. You are talking about the response time of a network request issued by a user, which consists of several processing stages:
1. HTTP request handled by nginx
2. data processing by PHP (served over the FastCGI interface by php-fpm in my case)
3. external DB request processing (aka T in our discussion)
4. response to the user
In this case I totally agree with you: time T does matter and directly influences response time.

What I am talking about is a little different. In peak hours the response time degrades significantly but is still more or less acceptable. What is unacceptable is that machine A itself slows down and responds very slowly to external actions such as an SSH login or a VPN connection. For example, sometimes I can't even establish a VPN connection to it because of timeouts (there is an OpenVPN server running on it). That's why I talk about a "slow machine A" and blame it, and that's why I am worried about processes in "uninterruptible sleep" and am thinking about scheduling lag.
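Since the Linux load average counts tasks in uninterruptible sleep (state D) as well as runnable ones (state R), a quick tally of process states can show which kind dominates during peak hours. This is a minimal sketch, assuming a Linux system with /proc mounted; the helper names `parse_state` and `count_states` are made up for illustration, and it counts processes rather than individual threads.

```python
import os


def parse_state(stat_text):
    """Extract the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the last ')'.
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_states(proc_root="/proc"):
    """Return {state letter: number of processes} for all PIDs under proc_root."""
    counts = {}
    if not os.path.isdir(proc_root):
        return counts  # not a Linux-style /proc, nothing to count
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                state = parse_state(fh.read())
        except (OSError, IndexError):
            continue  # process exited while we were reading it
        counts[state] = counts.get(state, 0) + 1
    return counts


if __name__ == "__main__":
    counts = count_states()
    print("running (R):", counts.get("R", 0))
    print("uninterruptible sleep (D):", counts.get("D", 0))
```

Running it once a second during a slow period and comparing the R and D counts with the reported load average gives a rough picture of whether the load comes from CPU contention or from tasks blocked in the kernel.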

> I'd be curious to see the output of iotop on your
> MySQL server as well.

Here it is:
[code]
Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
23144 be/4 mysql 0.00 B/s 19.10 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
880 be/4 mysql 0.00 B/s 11.46 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
1247 be/4 mysql 0.00 B/s 11.46 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
1287 be/4 mysql 0.00 B/s 53.49 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
1305 be/4 mysql 0.00 B/s 22.92 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
7929 be/4 mysql 0.00 B/s 15.28 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
30493 be/4 mysql 0.00 B/s 15.28 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
10222 be/4 mysql 0.00 B/s 7.64 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init [3]
2 be/3 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd]
3 rt/3 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/0]
[/code]

And another sample:

[code]
Total DISK READ: 0.00 B/s | Total DISK WRITE: 107.53 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
709 be/3 root 0.00 B/s 55.62 K/s 0.00 % 45.85 % [kjournald]
4129 be/4 mysql 0.00 B/s 3.71 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
4147 be/4 mysql 0.00 B/s 3.71 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
4233 be/4 mysql 0.00 B/s 7.42 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
4273 be/4 mysql 0.00 B/s 7.42 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
6587 be/4 mysql 0.00 B/s 14.83 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
15008 be/4 mysql 0.00 B/s 3.71 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
7449 be/4 mysql 0.00 B/s 11.12 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
[/code]

Nothing special as far as I can tell: MySQL is running, with not much I/O.

>> May be it is something related to task scheduling? Is big number of sleeping process impacts performance and/or slows down scheduler?

> Not since kernel 2.6, AFAIK. How many processes are we talking about here?
nginx: worker_processes 4
php-fpm: max_children 1000 in total

plus the usual system processes, of course, but not many.

> Maybe. Or maybe you've simply disguised the
> problem by throwing more processes at it. How many PHP processes are you
> running? Can you provide your php-fpm parameters? Also, what's an
> approximation of your requests per second during these peak times?

2 pools with 500 php-fpm children each, and something like 1000-2000 concurrent HTTP sessions.

> As an aside, one thing that you mentioned earlier that I was wondering
> about: what is writing to the local disk at 3.27MB/s (from iotop output)?

[code]
Total DISK READ: 0.00 B/s | Total DISK WRITE: 3.56 M/s
TID PRIO USER DISK READ DISK WRITE> SWAPIN IO COMMAND
32722 be/4 nginx 0.00 B/s 207.89 K/s 0.00 % 0.00 % nginx: worker process
22705 be/4 nobody 0.00 B/s 107.30 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
22235 be/4 nobody 0.00 B/s 83.83 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
21203 be/4 nobody 0.00 B/s 57.00 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
32719 be/4 nginx 0.00 B/s 50.30 K/s 0.00 % 55.19 % nginx: worker process
32718 be/4 nginx 0.00 B/s 10.06 K/s 0.00 % 0.00 % nginx: worker process
21195 be/4 nobody 0.00 B/s 3.35 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
21327 be/4 nobody 0.00 B/s 3.35 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
22189 be/4 nobody 0.00 B/s 3.35 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
[/code]
php-cgi and nginx are writing. PHP writes some processed data and logs. Anyway, 3.5 M/s is not a big deal for a SCSI disk, is it?

By the way, there are a few words about systems slowing down under active network I/O here: http://www.mysqlperformanceblog.com/2006/12/04/using-loadavg-for-performance-optimization/

Best regards,
Sessna


