--On Tuesday, June 25, 2013 03:10:17 PM -0700 Howard Chu hyc@symas.com wrote:
Bill MacAllister wrote:
--On Tuesday, June 25, 2013 12:58:54 PM -0700 Quanah Gibson-Mount quanah@zimbra.com wrote:
--On Tuesday, June 25, 2013 12:38 PM -0700 Bill MacAllister whm@stanford.edu wrote:
The load starts out at a rate of about 2 M/s. In the past I remember that dropping to something like 900 k/s and staying there. Now the load starts in the same place, but after 30 seconds it alternates between stalling outright and running at a rate under 100 k/s. It dips to under 10 k/s and sometimes climbs as high as 700 k/s. (My undergraduate degree was in watching water boil.)
What is the partition type? ext4?
What options are set for the partition in fstab?
This is what I am currently using. The UUIDs are obviously shortened for readability.
UUID=blah1 /                  ext4 defaults,acl,noatime,errors=remount-ro 0 1
UUID=blah2 /var/cache/openafs ext4 defaults,noatime                       0 2
UUID=blah3 /var/lib/ldap      ext4 defaults,noatime                       0 2
UUID=blah4 none               swap sw                                     0 0
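For anyone wanting to compare mount options between two hosts without eyeballing them, here is a rough sketch in Python. The function names parse_fstab and diff_options are made up for illustration, and the only input shown is the fstab quoted above; the old system's fstab is not in this thread, so it would have to be supplied separately. Note that this only covers mount options. Settings chosen at mkfs time do not appear in fstab and would need something like tune2fs -l to inspect.

```python
# Sketch only: parse fstab text and compare mount options between two hosts.
# parse_fstab and diff_options are illustrative names, not an existing tool.

FSTAB = """\
UUID=blah1 / ext4 defaults,acl,noatime,errors=remount-ro 0 1
UUID=blah2 /var/cache/openafs ext4 defaults,noatime 0 2
UUID=blah3 /var/lib/ldap ext4 defaults,noatime 0 2
UUID=blah4 none swap sw 0 0
"""


def parse_fstab(text):
    """Return {mountpoint: {"device", "fstype", "options"}} from fstab text."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        device, mountpoint, fstype, options = fields[:4]
        entries[mountpoint] = {
            "device": device,
            "fstype": fstype,
            "options": set(options.split(",")),
        }
    return entries


def diff_options(old, new):
    """For mount points present in both, return {mountpoint: (only_old, only_new)}.

    Mount points whose option sets are identical are left out.
    """
    result = {}
    for mountpoint in old.keys() & new.keys():
        only_old = old[mountpoint]["options"] - new[mountpoint]["options"]
        only_new = new[mountpoint]["options"] - old[mountpoint]["options"]
        if only_old or only_new:
            result[mountpoint] = (only_old, only_new)
    return result


if __name__ == "__main__":
    ldap = parse_fstab(FSTAB)["/var/lib/ldap"]
    print(ldap["fstype"], sorted(ldap["options"]))
```

Feeding diff_options the parsed fstab from each host would list only the mount points whose options differ.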
I also tried ext3 with the same results. This is on a raid-1. I have also tried splitting the two disks and putting the OS on one and the LDAP database on the other. None of this moved the problem.
It really has the feel of a resource exhaustion. The load is now stalled in that the progress display is not updating. top does not show slapd as doing anything.
Probably bad default FS settings, and changed from your previous OS revision.
Also, you should watch vmstat while it runs to get a better idea of how much time the system is spending in I/O wait.
I have just re-mkfs'ed the new, slow system to make it look like the old, fast system. Just to make sure nothing else changed I have started a load on the older system. Things look fine.
Now, comparing vmstat output, the new system is clearly badness incarnate.
Fast
====
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd     free   buff   cache   si   so    bi    bo   in   cs us sy  id wa
 0  0      0  6358656 303468 9301620    0    0     0     0   45   45  0  0 100  0
 0  0      0  6358780 303468 9301620    0    0     0     0   47   41  0  0 100  0
 0  0      0  6358780 303468 9301620    0    0     0     0   47   41  0  0 100  0
 0  0      0  6358780 303468 9301620    0    0     0     0   93   43  0  1  99  0
 0  0      0  6358532 303468 9301620    0    0     0     0  141   71  0  1  99  0
 1  0      0  6358488 303468 9301620    0    0     0    14  116   48  0  1  99  0
Slow
====
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd     free   buff   cache   si   so    bi    bo   in   cs us sy  id wa
 1  4      0 13318088  36128 2759600    0    0     0  2134  379   83  0  0  88 12
 0  4      0 13318308  36128 2759600    0    0     0  1044  277   70  0  0  88 12
 0  4      0 13318508  36132 2759600    0    0     0   765  267   69  0  0  88 12
 0  2      0 13318240  36152 2759604    0    0     0   818  593  104  0  0  88 12
 0  2      0 13318332  36168 2759604    0    0     0  2611 1489  138  0  0  89 11
Lots of waiting, lots of blocking. What's the deal with all that free memory on the slow system?
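To put numbers on "lots of waiting, lots of blocking", a small Python sketch like the following could average the b (blocked processes), wa (I/O wait %) and bo (blocks written out) columns from a captured vmstat run. The names parse_vmstat and summarize are made up for illustration; the sample rows are the slow-system output above.

```python
# Sketch only: summarize captured vmstat output.
# parse_vmstat and summarize are illustrative names, not part of any tool.

SLOW_SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 4 0 13318088 36128 2759600 0 0 0 2134 379 83 0 0 88 12
0 4 0 13318308 36128 2759600 0 0 0 1044 277 70 0 0 88 12
0 4 0 13318508 36132 2759600 0 0 0 765 267 69 0 0 88 12
0 2 0 13318240 36152 2759604 0 0 0 818 593 104 0 0 88 12
0 2 0 13318332 36168 2759604 0 0 0 2611 1489 138 0 0 89 11
"""


def parse_vmstat(text):
    """Return a list of dicts, one per data row, keyed by vmstat column name."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":  # the column-name line
            header = fields
            continue
        if header is None or len(fields) != len(header):
            continue  # banner line or truncated row
        try:
            rows.append(dict(zip(header, (int(f) for f in fields))))
        except ValueError:
            continue  # non-numeric row, skip it
    return rows


def summarize(rows):
    """Average the columns most relevant to an I/O-bound load."""
    if not rows:
        raise ValueError("no vmstat data rows found")
    n = len(rows)
    return {
        "avg_blocked": sum(r["b"] for r in rows) / n,
        "avg_wa": sum(r["wa"] for r in rows) / n,
        "avg_bo": sum(r["bo"] for r in rows) / n,
    }


if __name__ == "__main__":
    for name, value in summarize(parse_vmstat(SLOW_SAMPLE)).items():
        print(f"{name}: {value:.1f}")
```

Running the same summary over a capture from each host during a load gives a side-by-side comparison instead of a scroll of raw rows.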
I will iterate on mkfs for a bit, but I thought I would send this off in case something jumps out.
Bill