On Mon, Aug 08, 2005 at 06:11:05PM +0800, James Andrewartha wrote:
> Mike Horwath wrote:
> >On Fri, Aug 05, 2005 at 12:20:03PM +0800, James Andrewartha wrote:
>
> >How is your system doing for disk I/O? have you measured that yet?
>
> Bonnie++ (done while postfix was turned off, hence no dspam/pgsql load):
That doesn't help for real-world, production measurements.
Bonnie++ is a statistics-generating program that measures what *it* can
do with your disk subsystem.
What I am asking is: have you, yourself, measured your disk I/O while
DSPAM is doing its work?
iostat, vmstat, or FreeBSD's systat can give you an idea of what is
happening.
> I can rebuild the fs, but I doubt that will help fix the root cause.
> Michael Anthon also noticed my iowait is high, but I think that's because I
> have so many transactions, and not vice versa.
High I/O wait times also point to disk thrashing.
> Running purge.sql has dropped it down to 1.1GB, and I've stuck it in
> to run daily before the nightly vacuum. This does seem to have
> decreased load a bit, but I'm still seeing periods with hundreds of
> transactions/second and 50%+ iowait.
And my idea to add RAM so that more data is cached isn't an option?
Perhaps striping the data across more spindles?
> >>For reference, the load this server was under was handled adequately
> >>by a dual p3 733 with 512MB ram running the same software, except
> >>with spamassassin instead of dspam/postgresql.
> >
> >CPU bound instead of disk bound. You are having a lot of I/O...
> >
> >I run UseNet News servers - if I were doing 200 ops per second, I
> >would normally see more like 30MB/sec or higher.
> >
> >Have you checked your indexes?
>
> What do you mean by this?
I run systems generating hundreds of megabits per second of data
transfer with 20+TB of disk space.
200 operations per second would be, for me, far higher throughput.
The backend disks on each of these RAID systems you see below are SATA
disks connected via an external enclosure and a hardware RAID
controller, connected to the host either via FC or SCSI.
An example (poor teal...):
[10:51am] 19 [~]:teal% iostat -c 10
tty amrd0 da0 acd0 cpu
tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id
1 7 0.00 0 0.00 55.61 1573 85.40 0.00 0 0.00 9 0 46 13 32
0 38 19.68 30 0.59 51.61 1265 63.76 0.00 0 0.00 9 0 73 16 2
0 38 36.00 8 0.28 51.09 1179 58.81 0.00 0 0.00 12 0 72 15 2
0 39 36.92 13 0.47 52.21 1192 60.78 0.00 0 0.00 8 0 78 12 2
0 38 26.29 14 0.35 52.91 1163 60.08 0.00 0 0.00 6 0 76 15 3
0 38 30.00 8 0.23 53.06 1160 60.13 0.00 0 0.00 8 0 78 14 0
0 39 56.00 6 0.32 51.77 1088 54.98 0.00 0 0.00 9 0 74 14 3
0 38 23.15 109 2.47 52.26 1059 54.04 0.00 0 0.00 9 0 68 18 5
0 39 96.00 1 0.09 55.28 1074 57.96 0.00 0 0.00 11 0 76 11 3
0 38 32.00 4 0.12 53.23 1130 58.76 0.00 0 0.00 8 0 76 15 1
This system is both CPU and I/O bound, but the partner system that
runs with it crashed and I need to bring it back online. Once it is
back, the CPU load will drop significantly, but the disk I/O won't
change much at all.
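
As a sanity check on the teal numbers, the MB/s column in iostat is roughly KB/t multiplied by tps, divided by 1024. Here is a minimal Python sketch of that arithmetic, using the first three da0 rows from the output above. The function name is made up for illustration and is not part of iostat.

```python
def mb_per_sec(kb_per_transfer: float, transfers_per_sec: float) -> float:
    """Approximate throughput in MB/s from iostat's KB/t and tps columns."""
    return kb_per_transfer * transfers_per_sec / 1024.0


# (KB/t, tps, MB/s as reported) for da0 on teal, copied from the output above.
teal_da0 = [
    (55.61, 1573, 85.40),
    (51.61, 1265, 63.76),
    (51.09, 1179, 58.81),
]

for kb_t, tps, reported in teal_da0:
    estimate = mb_per_sec(kb_t, tps)
    print(f"{tps:5d} tps x {kb_t:5.2f} KB/t = {estimate:6.2f} MB/s "
          f"(iostat reported {reported:.2f})")
```

The small differences come from iostat rounding each column separately. The same arithmetic shows why a transaction count on its own says little: throughput depends on how large each transfer is.
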
Another system (you'll need to stretch your window to review):
[10:53am] 11 [~]:aqua% iostat -c 10 da0 da1 da2 da3 da4 da5
tty da0 da1 da2 da3 da4 da5 cpu
tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id
0 2 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 0.00 0 0.00 60.00 424 24.86 4 0 18 6 72
0 64 45.19 47 2.06 46.00 58 2.59 48.77 47 2.23 60.68 308 18.27 48.50 48 2.26 44.34 70 3.01 5 0 31 6 59
0 64 43.73 44 1.86 47.94 63 2.97 48.90 40 1.89 61.06 338 20.14 47.44 36 1.65 46.43 28 1.26 5 0 24 7 64
0 64 47.85 27 1.25 45.05 38 1.66 46.27 59 2.68 60.94 317 18.85 53.10 40 2.05 41.00 8 0.32 3 0 19 7 70
0 63 46.72 68 3.12 51.13 77 3.86 48.56 63 3.01 61.31 287 17.20 44.67 42 1.81 27.20 5 0.13 1 0 21 7 71
0 64 44.80 15 0.65 45.29 55 2.45 50.00 51 2.51 60.33 274 16.16 54.00 8 0.42 0.00 0 0.00 3 0 21 7 69
0 64 45.52 42 1.85 43.00 51 2.16 54.67 33 1.74 61.82 270 16.32 33.54 13 0.42 52.62 13 0.66 4 0 20 6 71
0 64 46.00 20 0.89 48.87 23 1.09 46.98 58 2.68 62.06 348 21.07 44.71 28 1.21 8.00 1 0.01 5 0 23 6 65
0 63 45.68 87 3.89 48.20 119 5.59 42.46 151 6.28 60.48 291 17.19 46.84 116 5.30 46.79 85 3.89 3 0 35 11 51
0 64 45.27 62 2.76 47.34 72 3.34 49.70 72 3.51 60.70 319 18.90 46.67 12 0.54 48.32 37 1.73 3 0 25 7 66
Or individually:
tty da0 cpu
tin tout KB/t tps MB/s us ni sy in id
0 2 0.00 0 0.00 4 0 18 6 72
0 31 46.00 32 1.43 2 0 12 5 80
0 0 41.04 27 1.07 1 0 11 3 85
0 0 57.88 17 0.95 2 0 10 4 84
0 0 47.76 49 2.26 1 0 10 3 86
0 0 51.78 18 0.90 1 0 18 4 77
0 0 48.18 45 2.10 2 0 11 7 81
0 0 44.00 14 0.60 1 0 15 3 81
0 16 44.11 35 1.49 0 0 10 5 84
0 0 45.88 17 0.75 2 0 10 5 83
tty da1 cpu
tin tout KB/t tps MB/s us ni sy in id
0 2 0.00 0 0.00 4 0 18 6 72
0 32 45.49 51 2.26 2 0 11 5 81
0 0 48.17 70 3.31 1 0 11 3 85
0 0 48.83 48 2.27 2 0 10 4 84
0 0 46.51 51 2.30 1 0 10 3 86
0 0 56.80 30 1.65 1 0 18 4 77
0 0 45.66 57 2.56 2 0 11 7 81
0 0 43.10 57 2.42 1 0 15 3 81
0 16 49.48 53 2.58 0 0 10 5 84
0 0 50.58 72 3.57 2 0 10 5 83
tty da2 cpu
tin tout KB/t tps MB/s us ni sy in id
0 2 0.00 0 0.00 4 0 18 6 72
0 15 39.20 20 0.76 2 0 12 5 81
0 0 40.10 39 1.51 1 0 10 3 86
0 0 47.26 38 1.74 2 0 10 4 84
0 0 46.48 29 1.30 1 0 10 3 86
0 0 43.85 26 1.10 1 0 18 4 77
0 0 42.43 28 1.15 2 0 11 7 80
0 0 47.08 39 1.78 1 0 14 3 81
0 16 43.78 37 1.57 1 0 10 5 84
0 0 50.06 31 1.50 2 0 10 5 83
tty da3 cpu
tin tout KB/t tps MB/s us ni sy in id
0 2 0.00 0 0.00 4 0 18 6 72
0 0 60.17 164 9.61 2 0 12 6 81
0 0 57.80 150 8.45 1 0 10 3 86
0 0 57.81 159 9.00 2 0 10 4 84
0 0 57.10 176 9.83 1 0 10 3 86
0 0 58.56 234 13.36 1 0 18 4 77
0 0 58.42 170 9.72 2 0 11 7 80
0 0 59.50 175 10.18 1 0 14 3 81
0 16 57.58 201 11.30 1 0 10 5 84
0 0 57.48 123 6.89 2 0 10 5 83
tty da4 cpu
tin tout KB/t tps MB/s us ni sy in id
0 2 0.00 0 0.00 4 0 18 6 72
0 15 49.21 33 1.58 2 0 12 6 81
0 0 45.80 20 0.89 1 0 10 3 86
0 0 44.12 32 1.37 2 0 10 4 84
0 0 40.15 51 2.02 1 0 10 3 86
0 0 46.33 12 0.54 1 0 18 4 77
0 0 45.20 10 0.44 2 0 11 7 80
0 0 46.00 14 0.62 1 0 14 3 81
0 16 45.89 19 0.84 1 0 10 5 84
0 0 40.50 8 0.31 2 0 10 5 83
tty da5 cpu
tin tout KB/t tps MB/s us ni sy in id
0 2 0.00 0 0.00 4 0 18 6 72
0 0 46.70 37 1.69 2 0 12 5 81
0 0 44.62 57 2.50 1 0 10 3 86
0 0 46.81 73 3.35 2 0 10 4 84
0 0 53.03 31 1.59 1 0 10 3 86
0 0 45.09 80 3.53 1 0 18 4 77
0 0 44.54 80 3.49 2 0 12 7 80
0 0 42.88 63 2.65 1 0 14 3 81
0 16 41.29 28 1.12 1 0 10 5 84
0 0 47.02 60 2.77 2 0 10 5 83
Each of those filesystems is 1TB in size and data is written and
requested very randomly.
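
To see how the load is spread across the six disks on aqua, one row of the combined output can be split per device and totalled. The Python below is only a rough sketch. It assumes the row layout shown above (two tty columns, three columns per device, five cpu columns), and the helper name is hypothetical.

```python
from typing import Dict, List, Tuple


def parse_iostat_row(line: str, devices: List[str]) -> Dict[str, Tuple[float, int, float]]:
    """Split one iostat data row into {device: (KB/t, tps, MB/s)}.

    Assumed layout: 2 tty columns, 3 columns per device, 5 cpu columns.
    """
    fields = line.split()
    expected = 2 + 3 * len(devices) + 5
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    stats = {}
    for i, dev in enumerate(devices):
        kb_t, tps, mb_s = fields[2 + 3 * i : 5 + 3 * i]
        stats[dev] = (float(kb_t), int(tps), float(mb_s))
    return stats


devices = ["da0", "da1", "da2", "da3", "da4", "da5"]

# Second row of the combined aqua output above, copied as-is.
row = ("0 64 45.19 47 2.06 46.00 58 2.59 48.77 47 2.23 60.68 308 18.27 "
       "48.50 48 2.26 44.34 70 3.01 5 0 31 6 59")

stats = parse_iostat_row(row, devices)
total_tps = sum(tps for _, tps, _ in stats.values())
total_mb = sum(mb_s for _, _, mb_s in stats.values())
busiest = max(stats, key=lambda dev: stats[dev][1])

print(f"total: {total_tps} tps, {total_mb:.2f} MB/s; busiest disk: {busiest}")
```

For that one sample the six disks together handle several hundred transactions per second, with da3 carrying the largest share.
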
Add some RAM and stripe some disks. IDE disks suck for high I/O, and
striping them would do wonders, I am sure. (SATA is IDE...)
-- Mike Horwath, reachable via drechsau@Geeks.ORG
Received on Mon Aug 8 12:00:42 2005