How Much Storage Do You Need?
If you’re not familiar with storage performance, you probably think about space requirements first and simply expect everything else to run smoothly.
Unfortunately, it does not work that way.
With some storage solutions you’ll need 30-50% more space than your data consumes for things to work well. It may look ridiculous, but that’s how ZFS works with its copy-on-write technology.
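As a rough sketch of that sizing rule, the snippet below adds the 30-50% headroom on top of the data size. The function name and the 1000 GB figure are made up for illustration; the right headroom for your pool is an assumption you should check against your own storage vendor’s guidance.

```python
def required_capacity_gb(data_gb, headroom_fraction):
    """Capacity to provision when you want extra headroom on top of the data."""
    return data_gb * (1.0 + headroom_fraction)


# Hypothetical example: 1000 GB of data with 30% and 50% headroom.
for headroom in (0.30, 0.50):
    capacity = required_capacity_gb(1000, headroom)
    print(f"{headroom:.0%} headroom: provision about {capacity:.0f} GB")
```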
With most traditional storage solutions you’ll need to plan for your read/write ratio, how many IOPS you need, and what average latency you can live with.
If you don’t have this information and don’t know where or how to get it, this is the right article for you!
We can start with the ‘iostat’ command in a bash console (you’ll probably need the ‘sysstat’ package installed).
Let’s look at the output (I’ve reformatted it a bit):
root@testserver:~# iostat -xcd
Linux 3.2.0-4-amd64 (testserver) 04/21/2016 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.50 0.00 0.23 10.50 0.02 88.76
Device: vda
rrqm/s: 20.70
wrqm/s: 47.44
r/s: 0.91
w/s: 12.01
rkB/s: 26.18
wkB/s: 237.91
avgrq-sz: 40.89
avgqu-sz: 2.35
await: 181.49
svctm: 7.76
%util: 10.03
The command above gives you averages since system boot. If you add an interval in seconds (for example ‘iostat -xcdt 5’; the ‘-t’ key adds a timestamp to each report), you can watch storage performance metrics live. For the moment, though, we’ll deal with the averages.
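You’ll need at least these metrics:
- rrqm/s - read requests per second that were merged with other requests in the queue before reaching the device
- wrqm/s - write requests per second that were merged in the same way
- r/s - read requests per second actually issued to the storage device
- w/s - write requests per second actually issued to the storage device; r/s plus w/s gives you your IOPS
- await - average time in milliseconds a request takes to complete, including the time it spends waiting in the queue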
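If you want to work with these fields in a script, here is a minimal sketch that turns one device row of ‘iostat -x’ output into a dictionary. It assumes the older column layout that matches the fields in this article; the header string is reconstructed from those field names, and newer sysstat versions print different columns, so treat the header as an assumption. The function name is hypothetical.

```python
def parse_device_line(header, row):
    """Map iostat -x column names to values for a single device row."""
    names = header.split()
    values = row.split()
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    # The first column is the device name; every other column is numeric.
    metrics = {name: float(value) for name, value in zip(names[1:], values[1:])}
    metrics["device"] = values[0]
    return metrics


# Assumed header layout; the row holds the numbers from the listing in this article.
header = "Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util"
row = "vda 20.70 47.44 0.91 12.01 26.18 237.91 40.89 2.35 181.49 7.76 10.03"

m = parse_device_line(header, row)
print(f"{m['device']}: {m['r/s'] + m['w/s']:.2f} IOPS, await {m['await']:.2f} ms")
```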
Looking at the numbers, we can draw some conclusions:
- Read requests were largely sequential and were merged effectively: 20.7 read requests per second were merged, and only about 1 read request per second was actually issued to the device.
- Write requests were either random or could not be merged as well: 47 write requests per second were merged, and 12 per second were actually issued to the device.
- Average throughput was not high, so there is probably random IO and some low-end storage device underneath.
- Average latency is far from good, so it looks like we have a problem here.
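To put numbers on those conclusions, the sketch below computes the share of requests that were merged and the average request size from the values in the listing. The helper names are hypothetical, and the formulas are my own reading of the counters rather than something iostat prints for this version.

```python
def merged_share(merged_per_s, issued_per_s):
    """Fraction of incoming requests that were merged into another request."""
    total = merged_per_s + issued_per_s
    return merged_per_s / total if total else 0.0


def avg_request_kb(read_kb_s, write_kb_s, reads_s, writes_s):
    """Average size in kB of the requests actually issued to the device."""
    iops = reads_s + writes_s
    return (read_kb_s + write_kb_s) / iops if iops else 0.0


# Values taken from the iostat listing in this article.
read_share = merged_share(20.70, 0.91)
write_share = merged_share(47.44, 12.01)
request_kb = avg_request_kb(26.18, 237.91, 0.91, 12.01)

print(f"reads merged:  {read_share:.1%}")
print(f"writes merged: {write_share:.1%}")
print(f"average request size: {request_kb:.1f} kB")
```

The average request size of roughly 20 kB agrees with the avgrq-sz value of 40.89 if you take it as a count of 512-byte sectors, which is a useful sanity check on the listing.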
The problem is that high latency like in the example above effectively limits your IOPS, so you can’t tell how many IOPS you actually need!
To understand it better, imagine it this way: if each request takes 181 ms to serve, you can’t serve more than 5 or 6 requests per second! You can queue requests so that they are served concurrently where possible, and that will help a bit. But then we have another problem: while most requests can be handled in a reasonable 10-20 ms, some unlucky requests will take 100-200 ms or even longer to complete!
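The arithmetic behind that limit can be sketched in a few lines. This is a simplified model that treats the average queue size (avgqu-sz) as the number of requests in flight and applies Little’s law; the function name is hypothetical and real devices will not behave this neatly.

```python
def max_iops(latency_ms, requests_in_flight=1.0):
    """Upper bound on requests per second for a given latency and concurrency."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return requests_in_flight * 1000.0 / latency_ms


# One request at a time at 181.49 ms each (await from the listing).
serial_limit = max_iops(181.49)

# With the average queue size from the listing (avgqu-sz = 2.35).
queued_limit = max_iops(181.49, requests_in_flight=2.35)

print(f"one at a time: {serial_limit:.2f} IOPS")
print(f"with queueing: {queued_limit:.2f} IOPS")
```

With queueing taken into account the estimate lands close to the r/s plus w/s total in the listing, which suggests the device was already running at what its latency allows.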
I hope I haven’t completely confused you by now!
You’ll need to run ‘iostat’ with the ‘-t’ key and an interval for some time to collect real-time data about your storage load, so you can see IO peaks too, not only the averages.
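Once you have a series of samples, you can compare the average with the peaks. The sketch below is one way to do it, assuming you have already collected one value per interval (for example await in ms) into a list. The function name is hypothetical and the sample numbers are made up purely for illustration.

```python
import math


def summarize(values, percentile=95):
    """Return the mean, a nearest-rank percentile and the maximum of the samples."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    # Nearest-rank method: the smallest value with at least `percentile` % of samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return {
        "mean": sum(ordered) / len(ordered),
        "percentile": ordered[rank - 1],
        "max": ordered[-1],
    }


# Made-up await values in ms, for illustration only (not measurements).
example_await_ms = [10, 12, 11, 200, 13, 9, 10, 12, 11, 10]
stats = summarize(example_await_ms)
print(stats)
```

A single slow interval barely shows up in a long-term average but dominates the percentile and the maximum, which is why the average alone is not enough.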
Once you have all the data you need, you can assess your current storage activity:
- How many IOPS do you have?
- How good is your latency?
- What latency do you need?
You probably need to look into a hybrid storage solution if you need more than a few hundred IOPS, want the lowest possible latency, and also need plenty of storage space.
Current hybrid storage solutions can offer tremendous performance for both read-heavy and write-heavy applications while remaining very cost-effective, unlike all-flash solutions.
David Kovacs is passionate about entrepreneurship and triathlon. Presently, he is spending most of his time with his friends trying to kick off an awesome project.