At Open vStorage we now have various large clusters which can easily deliver multiple millions of IOPS. For some customers it is even a prestige project to produce the highest amount of IOPS on their Open vStorage dashboard. Out of the box Open vStorage will already give you very decent performance, but there are a few nuts and bolts you can tweak to increase the performance of your environment. There is no golden rule for increasing performance, but below we share some tips and pointers:
The most obvious way to influence the IO performance of a vDisk is to select the appropriate settings on the vDisk detail page. The impact of the DTL setting was already covered in a previous blog post, so we will skip it here.
Deduplication also has an impact on the write IO performance of the vDisk. In case you know the data isn't suited for deduplication, don't turn it on. As we have large read caches, we only enable the dedupe feature for OS disks.
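To see why data needs to be suited for deduplication, consider a minimal hash-based dedup sketch. This is purely illustrative and not the Open vStorage implementation: it just counts how many 4 KiB blocks are duplicates of an earlier block.

```python
import hashlib
import os

def dedupe_ratio(blocks):
    """Fraction of blocks that are byte-identical to an earlier block."""
    seen = set()
    dupes = 0
    for block in blocks:
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            dupes += 1
        else:
            seen.add(digest)
    return dupes / len(blocks)

# OS-image-like data: many identical 4 KiB blocks, dedupes well.
os_like = [b"\x00" * 4096] * 90 + [os.urandom(4096) for _ in range(10)]
# Random data: virtually no duplicate blocks, so dedup only adds
# hashing and lookup overhead on every write.
random_blocks = [os.urandom(4096) for _ in range(100)]

print(dedupe_ratio(os_like))       # 0.89: the repeated blocks dedupe
print(dedupe_ratio(random_blocks))  # 0.0
```

For random data every write pays the fingerprinting cost without ever finding a duplicate, which is why turning dedupe on for such vDisks only hurts.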
Another setting we typically tune at the vPool level is the SCO size. To increase write performance you typically want a large Storage Container Object (SCO) size, as this minimizes the overhead of creating and closing SCOs. Backends are also typically very good at writing large chunks of sequential data, so a big SCO size makes sense. But as usual there is a trade-off. With traditional backends like Swift, Ceph or any other object store for that matter, you need to retrieve the whole SCO from the backend in case of a cache miss. A bigger SCO then means more read latency on a cache miss. This is one of the reasons why we designed our own backend, ALBA. With ALBA you can retrieve part of a SCO from the backend: instead of getting a 64 MiB SCO, we can get the exact 4 KiB we need from it. ALBA is currently the only object storage that supports this functionality. In large clusters with ALBA as the backend we typically set the SCO size to 64 MiB. In case you don't use ALBA, use a smaller SCO size.
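The trade-off boils down to simple arithmetic. The sketch below (illustrative only, not Open vStorage's actual read path) compares how much a backend must transfer to serve a single 4 KiB cache miss:

```python
SCO_SIZE = 64 * 1024 * 1024   # 64 MiB SCO, as used in our large ALBA clusters
READ = 4 * 1024               # a single 4 KiB read that misses the cache

# A whole-object store (Swift, Ceph, ...) must fetch the entire SCO.
whole_object_bytes = SCO_SIZE
# A range-capable backend like ALBA fetches only the slice that is needed.
partial_bytes = READ

amplification = whole_object_bytes // partial_bytes
print(amplification)  # 16384: each 4 KiB miss costs 64 MiB from a whole-object store
```

With a whole-object store, halving the SCO size halves this read amplification, which is exactly why a smaller SCO is the better choice there.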
Optimize the Backends
One of the less obvious items that can make a huge difference in performance is the right preset. A preset consists of a set of policies, an optional compression method and whether encryption should be activated.
You might ask why tuning the backend influences the performance on the front-end towards the VM. For one, the performance of the backend determines the read performance in case of a cache miss. On writes, the backend can also become the bottleneck for incoming data. All writes go into the write buffer, which is typically sized to contain a couple of SCOs. This is fine as long as your backend is fast enough: once a SCO is full, it is saved to the backend and removed from the write buffer, making room for newly written data. But if the backend is too slow to keep up with what comes out of the write buffer, Open vStorage will start throttling the ingest of data on the frontend. So if write performance is lacking, it is essential to check whether the backend is the bottleneck.
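The buffering and throttling behaviour can be modelled in a few lines. This is a hypothetical toy model, not the Open vStorage implementation: writes fill an open SCO, full SCOs queue for backend upload, and once the buffer holds its maximum number of SCOs, new frontend writes are throttled until the backend drains one.

```python
from collections import deque

class WriteBuffer:
    """Toy write buffer: sized to hold `max_scos` full SCOs (hypothetical sketch)."""

    def __init__(self, sco_size, max_scos):
        self.sco_size = sco_size
        self.max_scos = max_scos
        self.current = 0      # bytes written into the currently open SCO
        self.full = deque()   # closed SCOs waiting to be saved on the backend

    def write(self, nbytes):
        """Return False when the frontend write must be throttled."""
        if len(self.full) >= self.max_scos:
            return False              # backend too slow, ingest is throttled
        self.current += nbytes
        if self.current >= self.sco_size:
            self.full.append(self.current)  # SCO is full: queue it for the backend
            self.current = 0
        return True

    def backend_drain(self):
        """Backend finished storing one SCO, freeing buffer space."""
        if self.full:
            self.full.popleft()

buf = WriteBuffer(sco_size=4, max_scos=2)
print(buf.write(4), buf.write(4), buf.write(4))  # True True False: third write throttled
buf.backend_drain()
print(buf.write(4))  # True: backend freed space, ingest resumes
```

A fast backend keeps the queue short and writes always succeed; a slow one lets full SCOs pile up until the frontend stalls, which is exactly the throttling described above.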
Since we typically set the SCO size to 64 MiB and consider 4 MiB a good fragment size, we change the policy to use 16 data fragments. The other parameters depend on the required reliability and the number of hosts used for storage.
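The 16-data-fragment figure follows directly from the two sizes. The parity count below is a hypothetical example value, not a recommendation:

```python
SCO_MIB = 64       # SCO size from the vPool settings
FRAGMENT_MIB = 4   # target fragment size

k = SCO_MIB // FRAGMENT_MIB   # data fragments per SCO
m = 5                         # parity fragments (hypothetical choice)
overhead = (k + m) / k        # backend bytes stored per user byte

print(k, overhead)  # 16 data fragments, 1.3125x raw storage overhead
```

Picking more parity fragments raises reliability at the cost of raw capacity, which is the trade-off the remaining policy parameters control.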
Compression is typically turned on, but it gives distorted results when you run a typical random-IO benchmark, as random data is hard to compress. Data that is hard to compress will even grow in size and hence take more time to store. Basically, if you are running benchmarks with random IO, it is best to turn compression off.
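You can verify this effect with any general-purpose compressor; the sketch below uses Python's zlib, not whatever compression method your preset selects:

```python
import os
import zlib

payload = 1024 * 1024                  # 1 MiB test payloads

compressible = b"A" * payload          # repetitive data, as in logs or zeroed blocks
random_data = os.urandom(payload)      # what a random-IO benchmark writes

small = zlib.compress(compressible)
big = zlib.compress(random_data)

print(len(small))  # a few KiB: real savings on compressible data
print(len(big))    # slightly LARGER than 1 MiB: pure overhead on random data
```

So with compression enabled, a random-IO benchmark pays CPU time to make every SCO marginally bigger, which is why the benchmark numbers come out distorted.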
In case you need help in tweaking the performance of your environment, feel free to contact us.