Configuring HP Smart Array P420i I/O Controller for VSAN

I’ve been working with many customers over the last several months and found that many are very familiar with HP hardware and just know how to set things up.  Others are looking for guidance from VMware on how to configure for VSAN.  There are things I’ve discovered that might not be obvious but can help in the VSAN setup.  Bear in mind, I am not an HP server hardware expert, so your comments are greatly appreciated.

Before I go too far, there is a bug in the HP async controller driver for the HP P420i that is included in the HP ESXi image.  The bug reduces the queue depth to 28 instead of 1020, causing poor performance in VSAN.

Here’s how to check your host’s I/O Controller (storage adapter) queue depth:

  • Run the esxtop command on the ESXi shell / SSH session
  • Press d
  • Press f and select Queue Stats (d)
  • The value listed under AQLEN is the queue depth of the storage adapter
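
Once you have read AQLEN on each host, a quick comparison against the expected value helps spot hosts still running the affected driver. The Python sketch below is only an illustration: the host names and readings are hypothetical and assumed to be collected by hand from esxtop, while the values 28 and 1020 come from the bug description above.

```python
# Queue depths taken from the bug description above:
# 1020 with the corrected driver, 28 with the affected one.
EXPECTED_AQLEN = 1020
BUGGY_AQLEN = 28


def classify_queue_depth(aqlen):
    """Return a short status string for an AQLEN value read from esxtop."""
    if aqlen == BUGGY_AQLEN:
        return "affected"   # matches the reduced depth caused by the driver bug
    if aqlen >= EXPECTED_AQLEN:
        return "ok"         # at or above the expected adapter queue depth
    return "check"          # neither value; worth a closer look


# Hypothetical readings, noted manually from esxtop on each host.
readings = {"esx01": 1020, "esx02": 28, "esx03": 600}

report = {host: classify_queue_depth(aqlen) for host, aqlen in readings.items()}
for host, status in sorted(report.items()):
    print(f"{host}: AQLEN={readings[host]} -> {status}")
```

Any host reported as "affected" is a candidate for the driver replacement described next.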

To resolve, follow these directions to implement the correct driver:

HP ProLiant Smart Array Controller Driver for VMware vSphere 5.5 (VIB file)

OK, a little background/overview on I/O Controller guidance for VSAN.  In general, VSAN recommends disabling Read and Write cache for any I/O Controller.  Since VSAN handles Read and Write caching at the software layer, there’s no need to do it at the hardware level.  Also, when destaging write cache, we want to ensure that the writes are committed to disk and not in I/O Controller cache.

In the case of the HP P420i, you cannot disable the I/O Controller cache so VSAN recommends setting it to 100% Read which essentially disables Write cache.  I recently discovered that you can also selectively pick and choose which disks to enable cache for.


Virtual SAN Disaster Recovery – vSphere Replication (available now) or Virtual RecoverPoint (coming soon), choose your protection!

I’m often asked how to protect Virtual SAN (VSAN). It’s simple: any product focused on protecting a virtual machine (VM) will work for protecting VMs sitting on a VSAN-enabled vSphere cluster. VMware offers VDP/VDPA for backup & recovery, and there are many other VMware partners with backup & recovery solutions focused on protecting VMs. Backup & recovery is a great way to protect data, but some customers like the benefit of more granular recovery points that comes from data replication, either locally or to a disaster recovery site.

To protect VSAN data in a primary site to a remote disaster recovery site, VMware offers vSphere Replication (VR) to replicate the VM data sitting on a VSAN Datastore over to the DR site. Of course, Site Recovery Manager (SRM) is supported to automate failover, failback, and testing. The combined VR/SRM solution can also be used for planned data center migrations. Here are a few great write-ups on the topics:

VMware Virtual SAN Interoperability: vSphere Replication and vCenter Site Recovery Manager

Virtual SAN Interoperability – Planned migration with vSphere Replication and SRM

VSAN and vSphere Replication Interop

One of the main benefits of VR is that it will work to replicate VM data on any storage to another site with hosts connected to any other storage. So, VSAN can be the source, the target, or both.

VSAN & VR

 

vSphere Replication can be set to replicate asynchronously every day, every hour, or as often as every 15 minutes, providing a Recovery Point Objective (RPO) as low as 15 minutes. For many customers, this is “good enough”. For some customer workloads, asynchronous replication is not “good enough”: they need synchronous replication protection, and there are several solutions on the market. One that I’ve been a big fan of for a long time is EMC’s RecoverPoint, which has a great reputation for protecting enterprise mission-critical data and applications.  Essentially, it splits every write transaction, journals it, and synchronously makes a copy of it either locally or at a remote DR site without impacting application performance. Of course there are more details, but this is essentially what it does, and the result is the ability to recover back to any point in time. It’s often labeled “TiVo or DVR for the data center”. One other benefit of RecoverPoint is that it can replicate data from any storage to any storage, as long as there is a splitter for the storage. EMC VNX and VMAX storage arrays have splitters built in.

The big news that came out last week and piqued my interest is that EMC is now offering a Beta of a completely software-based RecoverPoint solution that embeds the splitter into vSphere. This brings the RecoverPoint benefits to any VMware customer running VMs on any storage: block, file, or of course even VSAN. The EMC initiative is called Project Mercury; for more information check out:

Summer Gift Part 1 – Project Mercury Beta Open!

I’m excited that VSAN customers will have a choice for data protection: asynchronous replication with a 15-minute RPO using vSphere Replication, or continuous, synchronous, and asynchronous replication with EMC’s Virtual RecoverPoint.

Montreal Loves VSAN!

Last week I had the good fortune to support the Montreal VMware vForum.  There were over 418 participants and 21 partner booths.  A packed house at the Hilton Montreal Bonaventure which was a great facility.

MontrealForum1

There were multiple keynote presentations throughout the day as well as breakout sessions on a wide variety of topics.  In the morning session I was able to share the benefits of VSAN with the entire crowd and let everyone know about the Hands-on Lab we set up for attendees to try out VSAN.

 

We set up 10 Chromebook workstations that were occupied the whole day.  A total of 86 customers took the VSAN lab and the feedback was overwhelmingly positive, both about VSAN and about the fact that we made the labs available during the day.

MontrealForumVSANlab

At the end of the day there was an after party during which we gave away the Chromebooks to lucky winners while everyone was enjoying their favorite beverage.

A special thanks to our VMware friends, partners, and especially customers for helping make this a great day!  Montreal is a great city and now we know Montreal Loves VSAN!

I look forward to the next big event: Boston VMUG User Conference.

What is the RAW to Usable capacity in Virtual SAN (VSAN)?

I get asked this question a lot so in the spirit of this blog it was about time to write it up.

The only correct answer is “it depends”. Typically, the RAW to usable ratio is 2:1 (i.e. 50%). By default, 1TB RAW capacity equates to approximately 500GB usable capacity. Read on for more details.

In VSAN there are two choices that impact RAW to usable capacity. One is the protection level and the other is the Object Space Reservation (%). Let’s start with protection.

Virtual SAN (VSAN) does not use hardware RAID (Disclaimer at the end). Thus, it does not suffer the capacity, performance, or management overhead penalty of hardware RAID. The raw capacity of the local disks on a host is presented to the ESXi hypervisor, and when VSAN is enabled in the cluster the local disks are put into a shared pool that is presented to the cluster as a VSAN Datastore. To protect VMs, VSAN implements software distributed RAID leveraging the disks in the VSAN Datastore. This is defined by setting policy. You can have different protection levels for different policies (Gold, Silver, Bronze), all satisfied by the same VSAN Datastore.

The VSAN protection policy setting is “Number of Failures to Tolerate” (#FTT) and can be set to 0, 1, 2, or 3. The default is #FTT=1, which means that, using distributed software RAID, there will be 2 (#FTT+1) copies of the data on two different hosts in the cluster. So if the VM is 100GB, it takes 200GB of VSAN capacity to satisfy the protection. This is analogous to RAID 1 on a storage array, but rather than writing to a disk and then to another disk in the same host, we write to another disk on another host in the cluster. With #FTT=1, VSAN can tolerate a single SSD failure, a single HDD failure, or a single host failure and maintain access to data. If #FTT is set to 3, there will be 4 copies of the VM data, so RAW to usable would be 25%. In addition, there is a small formatting overhead (a couple of MB) on each disk, but it is negligible in the grand scheme of things.

#FTT    # Copies (#FTT+1)    RAW-to-usable Capacity %
0       1                    100%
1       2                    50%
2       3                    33%
3       4                    25%
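
The relationship between #FTT, copies, and capacity can be written as a short calculation. The Python sketch below is a minimal illustration that assumes only the #FTT+1 copy rule described above; the function names are my own, and it ignores the small per-disk formatting overhead and any Object Space Reservation.

```python
def copies(ftt):
    """Number of data copies kept for a given #FTT setting (#FTT + 1)."""
    if ftt not in (0, 1, 2, 3):
        raise ValueError("#FTT must be 0, 1, 2, or 3")
    return ftt + 1


def raw_for_vm(vm_gb, ftt=1):
    """RAW capacity consumed by a VM's data at the given #FTT."""
    return vm_gb * copies(ftt)


def usable_from_raw(raw_gb, ftt=1):
    """Usable capacity available from a RAW pool if everything uses one #FTT."""
    return raw_gb / copies(ftt)


# Reproduce the table: copies and RAW-to-usable percentage per #FTT.
for ftt in range(4):
    pct = 100 / copies(ftt)
    print(f"#FTT={ftt}: {copies(ftt)} copies, {pct:.0f}% usable")

print(raw_for_vm(100, ftt=1))        # a 100GB VM at the default #FTT=1
print(usable_from_raw(1000, ftt=1))  # 1TB (1000GB) RAW at the default #FTT=1
```

With the default #FTT=1, the 100GB VM consumes 200GB of RAW capacity and 1000GB of RAW yields 500GB usable, matching the 2:1 rule of thumb.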

Perhaps you create the following policies with the specified #FTT:

  • Bronze policy with #FTT=0 (thus no failure protection)
  • Silver policy with #FTT=1 (default software RAID 1 protection)
  • Gold policy with #FTT=2 (able to maintain availability in the event of a double disk drive failure, double SSD failure, or double host failure)
  • Platinum policy with #FTT=3 (4 copies of the data).

Your RAW to usable capacity will depend on how many VMs you place in the different policies and how much capacity each VM is allocated and consumes. Which brings us to the Object Space Reservation (%) discussion.
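
To see how a mix of policies changes the overall ratio, here is a small worked example in Python. It is only a sketch: the policy names follow the example list above, the VM sizes are hypothetical, and it assumes each VM's written data is simply multiplied by #FTT+1.

```python
# Example policy names mapped to their #FTT setting (from the list above).
POLICY_FTT = {"Bronze": 0, "Silver": 1, "Gold": 2, "Platinum": 3}


def raw_consumed_gb(vms):
    """Total RAW capacity consumed by (policy, written_gb) pairs."""
    return sum(gb * (POLICY_FTT[policy] + 1) for policy, gb in vms)


# Hypothetical VMs: (policy name, GB of data actually written).
vms = [("Bronze", 100), ("Silver", 300), ("Gold", 100)]

vm_data_gb = sum(gb for _, gb in vms)  # 100 + 300 + 100 = 500
raw_gb = raw_consumed_gb(vms)          # 100*1 + 300*2 + 100*3 = 1000
ratio = vm_data_gb / raw_gb            # blended RAW-to-usable ratio

print(f"VM data: {vm_data_gb}GB, RAW consumed: {raw_gb}GB, ratio: {ratio:.0%}")
```

In this particular mix the Bronze and Gold VMs offset each other, so the blended ratio still lands at 50%; a different mix would move it up or down.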

In VSAN, different policies can have different Object Space Reservation (%) values (fully provisioned percentages) associated with them. By default, all VMs are thin provisioned, thus 0% reservation. You can choose to fully provision any percentage up to 100%. If you create a VM that is put into a policy with Object Space Reservation equal to 50% and give it 500GB, then initially it will consume 250GB out of the VSAN Datastore. If you leave the default of 0% reservation, then it will not initially consume any capacity out of the VSAN Datastore, but as data is written it will consume capacity per the protection level policy defined and described above.
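
The reservation arithmetic can be sketched the same way. This minimal Python example follows the 500GB / 50% figures from the paragraph above; the function name is my own, and how the reservation interacts with the number of protection copies is not modeled here.

```python
def initial_reservation_gb(provisioned_gb, reservation_pct=0):
    """Capacity reserved up front for a VM, per its Object Space Reservation (%)."""
    if not 0 <= reservation_pct <= 100:
        raise ValueError("Object Space Reservation must be between 0 and 100")
    return provisioned_gb * reservation_pct / 100


print(initial_reservation_gb(500, 50))  # 500GB VM with a 50% reservation
print(initial_reservation_gb(500))      # default 0% reservation (thin provisioned)
```

The first call gives the 250GB from the example above; the second gives 0, since a thin-provisioned VM only consumes capacity as data is written.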

That ended up being a longer write-up than I anticipated, but as you can see, it truly does depend. I suggest sticking to the rule of thumb of 50% RAW to usable. But if you are looking for exact RAW to usable capacity calculations, you can refer to the VMware Virtual SAN Design and Sizing Guide found here: https://blogs.vmware.com/vsphere/2014/03/vmware-virtual-san-design-sizing-guide.html
Also, you can check out Duncan Epping’s Virtual SAN Datastore Calculator: http://vmwa.re/vsancalc

Disclaimer at the end: ESXi hosts require IO Controllers to present local disk for use in VSAN. The compatible controllers are found on the VSAN HCL here: http://www.vmware.com/resources/compatibility/search.php?deviceCategory=vsan

These controllers work in one of two modes: passthrough or RAID 0. In passthrough mode the RAW disks are presented to the ESXi hypervisor. In RAID 0 mode each disk needs to be placed in its own RAID 0 disk group and made available as a local disk to the hypervisor. The exact RAID 0 configuration steps are dependent on the server and I/O Controller vendor. Once each disk is placed in its own RAID 0 disk group, you will then need to log in via SSH to each of your ESXi hosts and run commands to ensure that the HDDs are seen as “local” disks by Virtual SAN and that the SSDs are seen as “local” and “SSD”.

I hope this is helpful. Of course, questions and feedback are welcome.

What does a 32 host Virtual SAN (VSAN) Cluster Look Like?

The big VMware Virtual SAN (VSAN) launch was today. Here are a couple of good summaries:

Cormac Hogan – Virtual SAN (VSAN) Announcement Review

Duncan Epping – VMware Virtual SAN launch and book pre-announcement!

The big news is that VSAN will support a full 32-host vSphere cluster. So what does that look like fully scaled up and out?

VSAN - 32 Hosts

By the way, for details on how VSAN scales up and out check: Is Virtual SAN (VSAN) Scale Up or Scale Out Storage…, Yes!.

2 Great Bootcamps coming up at VMware Partner Exchange – PEX 2014

SDDC3522-BC – Software-Defined Storage Technical Boot Camp

This session will be all day on Saturday 2/8/2014 starting at 8:30AM.  I will be presenting the SDDC and VSAN overview as well as the vSphere Flash Read Cache Technical Presentation.  The technical deep dive on VSAN will be presented by Wade Holmes and a few other guest speakers.  Wade has authored a couple of the Hardware Configuration guidance blogs on SSD and IO Controllers and has been feverishly testing VSAN in all sorts of configurations in preparation for product launch.  I’m excited about the technical depth that Wade will go into and know our partners will get a ton of good information out of this session.  In addition, one of our engineers, Joe Cook, has been working with a bunch of customers to implement VSAN.  He will share the processes he’s been using for Proof of Concepts as well as present how to monitor and troubleshoot VSAN.  As a bonus, he’ll share a new tool that we’ve developed to help our partners analyze customer environments in preparation for VSAN and other VMware technologies. 

3579-SPO – EMC’s Game Changing Solution Roadmaps, Resources & Partner Programs

This session will be all day Monday 2/10/2014 starting at 8:30AM.  Prior to my current position as a Software Defined Storage SE for VMware I was an EMC vSpecialist for almost 4 years, so this session is near and dear to my heart.  I just sat through the EMC Elect PEX planning concall and saw the full agenda for this one.  Lots of great presenters including, of course, Chad Sakac, Jason Nash, Aaron Chaisson, Rob Peglar, Brian Whitman, and others.  Chad will kick things off, but you’ll need to download the NDA form from the PEX schedule builder and bring the signed copy in order to get into the session.
