Back on the subject of striping, here's a thought that I've harboured for a little while but never actually got around to testing formally. Perhaps sometime soon I'll find the time to give it a thorough test and analysis and report back.
Common knowledge tells us to take the average size of a read or write operation (depending on whether the app is read-mostly or write-mostly) and divide it by the number of spindles that we are striping across in order to calculate the correct column width.
I'm not so sure.
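The only way that this mythical average operation will utilise every spindle exactly once is if it starts at block zero of the stripe.
Another quick calculation here. A typical stripe width is 1MB (2048 blocks of 512 bytes), so if operations start at a uniformly random block, our chance of the operation starting at block zero is 1 in 2048, or about 0.049%. Strictly speaking, a start on any column boundary also touches each spindle once, but with, say, eight columns that only raises the odds to 8 in 2048, or about 0.39%.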
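The arithmetic behind that figure can be sketched in a few lines of Python. The 512-byte block size is an assumption (it is what makes 1MB come out at 2048 blocks), as are the eight columns, which are borrowed from the example later in the post, and the idea that operations start at a uniformly random block.

```python
# Assumptions, not measurements: 512-byte blocks, 8 columns,
# and operations that start at a uniformly random block in the stripe.
BLOCK_SIZE = 512            # bytes
STRIPE_WIDTH = 1024 * 1024  # bytes (1MB)
COLUMNS = 8

blocks_per_stripe = STRIPE_WIDTH // BLOCK_SIZE

# Chance of starting exactly at block zero of the stripe.
p_block_zero = 1 / blocks_per_stripe

# Chance of starting on any column boundary (one such block per column).
p_column_aligned = COLUMNS / blocks_per_stripe

print(f"blocks per stripe:       {blocks_per_stripe}")
print(f"P(start at block zero):  {p_block_zero:.3%}")
print(f"P(start on column edge): {p_column_aligned:.3%}")
```

Either way the aligned case is rare, which is the point: under these assumptions almost every "average" operation starts somewhere in the middle of a column.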
Let's also think about what happens when an "average" operation starts at block n (where n > 0).
The operation will clearly "wrap around" the stripe and partially write onto the very next column, which happens to be on the same disk that we started on. That disk now has to service two separate requests for a single operation.
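To make the wrap-around concrete, here is a minimal sketch that maps one operation onto disks and counts how many separate column chunks each disk ends up serving. The function name `disk_hits` and the simple round-robin layout (column c lives on disk c mod n) are assumptions for illustration, not how any particular volume manager is implemented. Sizes are in 512-byte blocks.

```python
from collections import Counter

def disk_hits(start_block, length_blocks, column_blocks, num_disks):
    """Count the column chunks each disk serves for one operation.

    Assumes a plain round-robin stripe: column c sits on disk c % num_disks.
    """
    hits = Counter()
    first_col = start_block // column_blocks
    last_col = (start_block + length_blocks - 1) // column_blocks
    for col in range(first_col, last_col + 1):
        hits[col % num_disks] += 1
    return hits

# 1MB operation (2048 blocks) on 8 disks with 128KB (256-block) columns.
aligned = disk_hits(0, 2048, 256, 8)    # starts at block zero
shifted = disk_hits(100, 2048, 256, 8)  # starts at block n = 100

print("aligned, busiest disk:", max(aligned.values()))  # 1 chunk per disk
print("shifted, starting disk:", shifted[0])            # 2 chunks on disk 0
```

In the shifted case the operation covers nine column chunks, and the ninth lands back on disk 0, where it started.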
There is a relatively simple solution, and it is how I believe we ought to be calculating column widths: take the size of the "average" operation and divide by the number of spindles/LUNs (and here's the important bit) minus one.
So for example, we have an average operation of 1MB, and eight spindles. We can calculate our desired column width by:
1024KB / (8-1) ~= 146KB
Conventional wisdom would have placed this at:
1024KB / 8 = 128KB
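The two calculations side by side, as a small sketch. The helper name `column_width_kb` and its `reserve_one` flag are made up for this example.

```python
def column_width_kb(op_size_kb, spindles, reserve_one=False):
    """Column width in KB for a given average operation size.

    reserve_one=False: conventional rule, size / n.
    reserve_one=True:  the rule proposed here, size / (n - 1).
    """
    divisor = spindles - 1 if reserve_one else spindles
    if divisor < 1:
        raise ValueError("not enough spindles for this calculation")
    return op_size_kb / divisor

conventional = column_width_kb(1024, 8)
proposed = column_width_kb(1024, 8, reserve_one=True)

print(f"conventional: {conventional:.0f}KB")  # 128KB
print(f"proposed:     {proposed:.1f}KB")      # 146.3KB
```

One thing to watch when rounding to a whole number: seven columns of 146KB add up to 1022KB, just short of the 1MB operation, whereas seven columns of 147KB add up to 1029KB. Whether that 2KB matters is something a proper test would have to show.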
The next question is likely to be "will this have a major impact?", and therefore "do we care?". Not sure, to be honest. I'd like to find the time to fire up DTrace for a proper test. It seems likely that using the larger column width ought to reduce the number of physical I/Os issued for each operation, and possibly then reduce contention.
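Short of a real DTrace run, a back-of-the-envelope model gives a feel for the difference. The sketch below sweeps every possible starting block within one stripe and counts how many column chunks a 1MB operation touches for each layout. It is purely a counting exercise under stated assumptions: 512-byte blocks, uniformly distributed start offsets, round-robin columns, and one physical I/O per chunk with no coalescing or caching. The function names are made up, and none of this is a measurement.

```python
def chunks_touched(start, length, column):
    """Number of column chunks an operation spans (sizes in blocks)."""
    return (start + length - 1) // column - start // column + 1

def sweep(length, column, disks):
    """Try every start block in one stripe.

    Returns (mean chunks per operation,
             fraction of start offsets where some disk is visited twice).
    """
    stripe = column * disks
    counts = [chunks_touched(s, length, column) for s in range(stripe)]
    revisits = sum(1 for c in counts if c > disks)
    return sum(counts) / stripe, revisits / stripe

OP_BLOCKS = 2048  # 1MB operation in 512-byte blocks
DISKS = 8

# 256 blocks = 128KB (size / n); 292 blocks = 146KB (size / (n - 1)).
for label, column in (("size / n", 256), ("size / (n - 1)", 292)):
    mean_chunks, revisit_rate = sweep(OP_BLOCKS, column, DISKS)
    print(f"{label:>15}: {mean_chunks:.2f} chunks/op, "
          f"{revisit_rate:.1%} of offsets revisit a disk")
```

In this model the wider column cuts the average from roughly nine chunks per operation to roughly eight, and the revisit case goes from nearly every offset to about one in a hundred. Real hardware will not behave this neatly, so treat it as a reason to run the test rather than a substitute for it.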
