[KLUG Members] What is wrong....

Adam Bultman members@kalamazoolinux.org
Fri, 9 Jan 2004 18:03:29 -0500 (EST)


> ----
> 15:03:23  up 78 days,  2:08, 10 users,  load average: 6.11, 6.05, 6.01
> 98 processes: 97 sleeping, 1 running, 0 zombie, 0 stopped
> CPU0 states:   0.3% user   0.1% system    0.0% nice   0.0% iowait  99.1% idle
> CPU1 states:   0.0% user   0.1% system    0.0% nice   0.0% iowait  99.4% idle
> Mem:   254684k av,  251300k used,    3384k free,       0k shrd,   32152k buff
>                     180528k actv,       0k in_d,    5016k in_c
> Swap:  522104k av,   29060k used,  493044k free                   94396k cached
> 
>   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
> 26510 root      15   0  1212 1212   864 R     0.7  0.4   0:00   0 top
> ...
> The tasks are sorted in descending order by CPU use.
> ----
> 
> 
> OK... so why is the load so high, when CPU use is so low?
> 

That's probably going to be disk I/O.  I have a mail server:

  3:59pm  up 14 days, 22:01,  1 user,  load average: 3.62, 3.87, 3.68
197 processes: 196 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states: 13.0% user,  8.0% system,  0.0% nice, 77.0% idle
CPU1 states:  4.0% user,  5.1% system,  0.0% nice, 89.0% idle
Mem:   917080K av,  892792K used,   24288K free,       0K shrd,  492224K buff
Swap:  273088K av,       0K used,  273088K free                  126688K cached


The CPUs are certainly not doing 3.62's worth of work, yes?  The load is 
normally around 4.5, and will shoot up to 6 during heavy times.

[adam@luke adam]$ vmstat
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  4  0      0  23500 492224 126360   0   0     9    14   15     8  13   5  15

Note the 4 in the b column: those are processes blocked waiting on I/O.  
I've got some I/O goin' on on the mail server, and it's been like this for 
over 25 days - all I/O.  It has software mirrored 72 GB SCSI drives.  
Loopback filesystems are even worse for I/O load, too.  
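
On Linux the load average counts processes that are runnable plus those 
in uninterruptible sleep (usually waiting on disk or NFS), which is why it 
can be high while the CPUs sit idle.  A rough sketch in Python of reading 
the vmstat row that way, using the header and row above as hard-coded 
sample data; the names parse_vmstat and load_contributors are made up for 
illustration:

```python
# Sample data copied from the vmstat output in this message.
HEADER = "r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id"
ROW = "0  4  0      0  23500 492224 126360   0   0     9    14   15     8  13   5  15"


def parse_vmstat(header, row):
    """Pair each vmstat column name with the value from one sample row."""
    names = header.split()
    values = [int(v) for v in row.split()]
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


def load_contributors(sample):
    """Runnable (r) plus blocked (b) processes: what feeds the load average."""
    return sample["r"] + sample["b"]


if __name__ == "__main__":
    sample = parse_vmstat(HEADER, ROW)
    print("runnable:", sample["r"], "blocked:", sample["b"])
    print("contributing to load:", load_contributors(sample))  # prints 4
```

With nothing runnable and 4 blocked, the instantaneous count of 4 is in 
the same neighborhood as the 3.62 load average, even though the CPUs are 
mostly idle.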

Check your iostat, see what your disks are doing, etc.  I'm betting it's 
disk or network I/O (a stale NFS handle will give you a load of 1 if it is 
mounted hard and a program is trying to access it).  If any of you know 
how to check disk I/O more accurately, chime in.  sar will tell you too, 
I think, but sar isn't running on this box for me.
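
One way to see which processes are stuck is to look for state D 
(uninterruptible sleep) in /proc/<pid>/stat.  A minimal sketch in Python, 
assuming a Linux /proc filesystem; state_from_stat and blocked_pids are 
hypothetical helper names, not part of any existing tool:

```python
import glob


def state_from_stat(stat_text):
    """Return the one-letter process state from /proc/<pid>/stat contents."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split on the LAST ')' and take the first field after it.
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]


def blocked_pids(proc_root="/proc"):
    """List PIDs currently in uninterruptible sleep (state D)."""
    pids = []
    for path in glob.glob(proc_root + "/[0-9]*/stat"):
        try:
            with open(path) as f:
                text = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        if state_from_stat(text) == "D":
            pids.append(int(path.split("/")[-2]))
    return sorted(pids)


if __name__ == "__main__":
    print("PIDs in state D:", blocked_pids())
```

Run it a few times in a row; PIDs that keep showing up are the ones 
holding the load up, and what they have open should point at the disk or 
mount that is slow.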

Adam