Best practice for VMware LUN size

I was asked this question today.  It's one of my favorite questions to answer, but I've never written it down.  Today I did, so here it is.  Let me know if you agree or if you have other thoughts.

For a long time VMware's maximum LUN size was 2TB.  This restriction was not a big issue for most, but some wanted larger LUNs because of an application requirement, and in those cases it was typically one or only a few VMs accessing the large datastore/LUN.  vSphere 5 took the LUN size limit from 2TB to 64TB, quite a dramatic improvement and hopefully big enough to satisfy those applications.

For general-purpose VMs, prior to vSphere 4.1 the best practice was to keep LUN sizes well under 2TB (i.e. even though ESX supported 2TB LUNs, don't make them that big).  500GB was often recommended; 1TB was OK too.  But it really depended on a few factors.  In general, the larger the LUN, the more VMs it can support, and the reason for keeping LUN sizes small in the past was to limit the number of VMs per datastore/LUN, because putting too many VMs on a single datastore/LUN hurt performance.  The first reason is that vSphere's native multipathing only uses one path at a time per datastore/LUN, so having multiple datastores/LUNs lets you drive multiple paths at the same time.  Alternatively, you could go with EMC's PowerPath/VE to better load balance the I/O workload.  The second reason is that with block storage on vSphere 4.0 and earlier there was a SCSI locking issue: if a VM was powered on, powered off, suspended, cloned, and so on, the entire datastore/LUN was locked until the operation completed, freezing out the other VMs using that same datastore/LUN.  This was resolved in vSphere 4.1 with the VAAI hardware-assisted locking (ATS) primitive, assuming the underlying storage array supported the APIs.  But before VAAI, keeping LUN sizes small helped administrators limit the number of VMs on a single datastore/LUN, reducing the effects of both the locking and the pathing issues.
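If you want to see where a host actually stands on both of those points, the path selection policy and the VAAI (vStorage) status of each device are exposed through the vSphere API.  Here's a minimal pyVmomi sketch, assuming a reachable vCenter and placeholder credentials (vcenter.example.com and the user/password below are stand-ins), that prints the PSP, path count, and reported VAAI status per SAN device:

```python
# Hedged sketch: list each SAN device's path selection policy (PSP), path count,
# and reported VAAI (vStorage) support.  Hostname and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only; verify certs in production
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        storage = host.configManager.storageSystem.storageDeviceInfo
        # Map ScsiLun keys to LUN objects so VAAI status can be looked up per device.
        luns_by_key = {lun.key: lun for lun in (storage.scsiLun or [])}
        print(host.name)
        for mp in (storage.multipathInfo.lun if storage.multipathInfo else []):
            lun = luns_by_key.get(mp.lun)
            if lun is None:
                continue
            psp = mp.policy.policy if mp.policy else "n/a"       # e.g. VMW_PSP_FIXED, VMW_PSP_RR
            vaai = getattr(lun, "vStorageSupport", "unknown")    # vStorageSupported / Unsupported / Unknown
            print("  %-36s PSP=%-14s paths=%d VAAI=%s"
                  % (lun.canonicalName, psp, len(mp.path), vaai))
finally:
    Disconnect(si)
```

With native NMP you'll typically see policies like VMW_PSP_FIXED or VMW_PSP_RR per device; with PowerPath/VE installed, devices are claimed by EMC's multipathing plug-in instead and the load balancing happens there.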

OK, that was the history, now for the future.  The general direction for VMware is larger and larger pools of compute, network, and storage; it makes the whole cloud thing simpler.  Hence the increase in supported LUN size from 2TB to 64TB.  I wouldn't recommend going out and creating 64TB LUNs all the time, though.  Because of VAAI the locking issue goes away.  The pathing issue is still there with native multipathing, but if you go with EMC's PowerPath/VE then that goes away too.  So then it comes down to how big the customer wants to make their failure domains.  The thinking is that the smaller the LUN, the fewer VMs placed on it, and thus the less impact if a datastore/LUN were to go away.  Of course we go to great lengths to prevent that with five-nines arrays, redundant storage networks, etc.  So the guidance I've been seeing lately is that 2TB datastores/LUNs are a happy medium, not too big and not too small, for general-purpose VMs.  If the customer has specific requirements to go bigger, that's fine; it's supported.
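To gauge how big your failure domains actually are today, a per-datastore VM count and capacity report is easy to pull.  A small pyVmomi sketch along the same lines as the one above, again with placeholder connection details:

```python
# Hedged sketch: report each datastore's capacity and how many VMs live on it,
# i.e. the rough size of the failure domain.  Connection details are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.Datastore], True)
    for ds in view.view:
        cap_tb = ds.summary.capacity / (1024.0 ** 4)   # bytes -> TiB
        print("%-30s %6.2f TB  %3d VMs" % (ds.summary.name, cap_tb, len(ds.vm)))
finally:
    Disconnect(si)
```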

So, in the end, it depends!!!

Oh, and the storage array's behavior does have an impact on the decision.  In the case of an EMC VNX, assuming a FAST VP pool, the blocks will be distributed across the various tiers of drives.  If more drives are added to the pool, the VNX will rebalance the blocks to take advantage of all the drives.  So whether it's a 500GB LUN or a 50TB LUN, the VNX will balance the overall performance of the pool.  Lots of good info here about the latest Inyo release for VNX:

http://virtualgeek.typepad.com/virtual_geek/2012/05/vnx-inyo-is-going-to-blow-some-minds.html

VMware vSphere 5 VAAI support for EMC CX4

In vSphere 4 the VAAI test harness only covered functionality.  So if a storage array supported the VAAI primitives and passed VMware's functionality tests, VAAI was listed as a feature of that array in the Storage/SAN Compatibility Guide.  vSphere 5 added a VAAI performance test, due in part to issues VMware discovered when it released the Thin Provision Reclaim VAAI feature in vSphere 5.

EMC's CX4 did not pass VMware's performance test harness for XCOPY/Block Zero.  Atomic Test and Set (ATS) hardware offloading did pass the performance testing, but because XCOPY/Block Zero did not, VMware considers all VAAI features unsupported on the CX4.

Chad Sakac (Virtual Geek) lays it all out at the end of the PPT and recording in his post "VNX engineering update, and CX4/VAAI vSphere scoop."  In it he proposes the following EMC support model:

  • Running with VAAI block acceleration on a CX4 with ESX5 is considered an unsupported configuration by VMware.
  • EMC will support the use of VAAI block under ESX5 with CX4 arrays.
  • If customers running this configuration have an issue, EMC recommends they turn off the VAAI features.  If the problem persists with VAAI off, VMware will accept the case; if it no longer occurs, contact EMC for support.  (The host settings that control these VAAI primitives are shown in the sketch below.)
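For reference, the VAAI block primitives are controlled by three host advanced settings: DataMover.HardwareAcceleratedMove (XCOPY/Full Copy), DataMover.HardwareAcceleratedInit (Block Zero), and VMFS3.HardwareAcceleratedLocking (ATS).  Setting a value to 0 disables that primitive.  Here's a hedged pyVmomi sketch, again with placeholder connection details, that simply reports the current values per host so you know what you're running before you change anything:

```python
# Hedged sketch: report the current values of the host advanced settings that
# control the VAAI block primitives (1 = enabled, 0 = disabled).  Placeholder creds.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

VAAI_KEYS = [
    "DataMover.HardwareAcceleratedMove",   # XCOPY (Full Copy) offload
    "DataMover.HardwareAcceleratedInit",   # Block Zero (Write Same) offload
    "VMFS3.HardwareAcceleratedLocking",    # ATS hardware-assisted locking
]

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        adv = host.configManager.advancedOption
        print(host.name)
        for key in VAAI_KEYS:
            try:
                value = adv.QueryOptions(key)[0].value
            except vim.fault.InvalidName:
                value = "n/a"   # option not present on this host/version
            print("  %-38s %s" % (key, value))
finally:
    Disconnect(si)
```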

vSphere 4 did not have a performance test harness.  So customers who are happily running vSphere 4 with VAAI enabled on a CX4 can upgrade to vSphere 5, leave VAAI enabled, and likely enjoy the same experience they have today on vSphere 4.