Did you get to the bottom of this?
On Thu, Apr 23, 2015 at 08:29:48PM +1000, Geoff Swan wrote:
On 2015-04-23 5:56 PM, Howard Chu wrote:
In normal (safe) operation, every transaction commit performs 2 fsyncs. Your 140MB/s throughput spec isn't relevant here; your disk's IOPS rate is what matters. You can use NOMETASYNC to do only 1 fsync per commit.
Decent SAS disks spin at 10,000 or 15,000 RPM so unless there is a non-volatile memory cache in there I would expect at most 15000/60 = 250 fsyncs per second per drive, giving 125 transaction commits per second per drive.
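The arithmetic above can be sketched as follows. This is only an illustration, assuming at most one fsync completes per platter rotation and that no write cache absorbs the flush; the function name and parameters are hypothetical.

```python
def max_commits_per_second(rpm: int, fsyncs_per_commit: int = 2) -> float:
    """Upper bound on commits/s, assuming one fsync per platter rotation."""
    rotations_per_second = rpm / 60
    return rotations_per_second / fsyncs_per_commit


for rpm in (10_000, 15_000):
    safe = max_commits_per_second(rpm)         # 2 fsyncs per commit (safe mode)
    one_sync = max_commits_per_second(rpm, 1)  # 1 fsync per commit (NOMETASYNC)
    print(f"{rpm} RPM: {safe:.0f} commits/s safe, {one_sync:.0f} commits/s with one fsync")
```

For a 15,000 RPM drive this reproduces the 125 commits per second figure, and 250 if only one fsync is done per commit.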
OK. I ran a reduced version of the test script (20 processes each performing 40 read/write operations) with normal (safe) mode of operation on a test server that has 32GB RAM, and everything else identical to the server with 128GB.
So that is just 800 operations taking 60s?
A quick test using vmstat at 1s intervals gave the following output whilst it was running.
procs ---------------memory-------------- ---swap-- -----io---- -system-- ------cpu-----
 r  b    swpd      free     buff    cache   si   so    bi    bo   in   cs us sy  id wa st
20  0       0  32011144   167764   330416    0    0     1    15   40   56  0  0  99  1  0
 0  0       0  31914848   167764   330424    0    0     0  1560 2594 2130  2  1  97  0  0
 0  0       0  31914336   167764   330424    0    0     0  1708  754 1277  0  0 100  0  0
 0  0       0  31914508   167772   330420    0    0     0  2028  779 1300  0  0  99  1  0
The script took about 60s to complete, which is a lot longer than expected. It appears to be almost entirely I/O bound, at a fairly slow rate (1500 blocks in a second is 6MB/s).
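The figures in this report can be checked with a few lines of arithmetic. This is a rough sketch: the 4096-byte block size is an assumption implied by the 6MB/s figure (with 1024-byte blocks the rate would be lower still), and the names are made up for illustration.

```python
PROCESSES = 20
OPS_PER_PROCESS = 40
ELAPSED_SECONDS = 60  # approximate run time reported above

total_ops = PROCESSES * OPS_PER_PROCESS
ops_per_second = total_ops / ELAPSED_SECONDS


def write_rate_mb_per_s(blocks_per_second: float, block_bytes: int = 4096) -> float:
    """Convert a vmstat 'bo' reading to MB/s for an assumed block size."""
    return blocks_per_second * block_bytes / 1_000_000


print(f"{total_ops} operations in {ELAPSED_SECONDS}s = {ops_per_second:.1f} ops/s")
print(f"1500 blocks/s at 4096 bytes = {write_rate_mb_per_s(1500):.1f} MB/s")
```

Either way, the observed operation rate is far below what the drive's sequential throughput spec would suggest.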
As you say, it is IO bound (wa ~= 100%). Stop worrying about MB/s: the data rate is irrelevant; what matters is synchronous small-block writes, and those are limited by rotation speed.
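One way to measure the synchronous small-block write rate directly is a small fsync loop like the one below. This is a minimal sketch rather than a proper benchmark: the function name, the 4096-byte block size and the write count are arbitrary choices, and the directory passed in should sit on the disk under test.

```python
import os
import tempfile
import time


def fsync_rate(directory: str, writes: int = 200, block_size: int = 4096) -> float:
    """Return synchronous small-block writes per second achieved in `directory`."""
    payload = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to flush this write to the device before continuing
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return writes / elapsed


if __name__ == "__main__":
    # The temp directory is a placeholder; point this at the database volume.
    print(f"{fsync_rate(tempfile.gettempdir()):.0f} fsyncs/s")
```

A result in the low hundreds per second would be consistent with a rotation-limited disk; a much higher figure would suggest a write cache is acknowledging the flushes.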
Are you absolutely certain that the disks are SAS? Does your disk controller believe it? I had big problems with an HP controller once that refused to run SATA drives at anything like their full speed as it waited for each transaction to finish and report back before queuing the next one...
Andrew