A VDI boot storm will eventually end up costing you time, money, or aggravation. Boot storms are more prevalent in this brave new world of desktop virtualization than one would like to think. With tens to hundreds of virtual desktops running on a single virtualized bare-metal server, your storage is going to stay very busy keeping all of those virtual machines happy.
The Problem
First, a little background on why boot storms even exist. When storage is sized for performance and capacity in a VDI (Virtual Desktop Infrastructure) environment, the sizing requirements are usually dictated by steady-state virtual machine operation. But what happens when all of your users come in on a Monday morning and power on their VDI terminals?
It certainly won’t look like “steady-state” operation to your storage. Instead, you potentially end up with hundreds or thousands of virtual desktops running on a farm of hypervisor servers (e.g. VMware ESXi), all trying to read their operating systems from their Virtual Machine Disk (.vmdk) files at once. Most modest NAS storage devices will leverage their RAM cache as much as possible and then rely on the number of storage spindles to serve the cache-miss IOPS. With enough cache misses, the disks will start to run incredibly hot, leading to queuing, which leads to longer boot times, which leads to user complaints that their virtualized desktops take forever to boot.
One way to think about this problem is to consider the pre-virtualization landscape: Every user had their own desktop hardware with dedicated CPU, RAM, and storage. A desktop computer would generally take 2-5 minutes to boot, depending on the operating system environment and machine capabilities. In this scenario, most of the desktop boot time was spent by the CPU waiting for I/O from the hard-disk to provide the required data to load the kernel, device drivers, graphical user interface environment, etc. With a single SATA hard-disk capable of cranking out 250 IOPS while still delivering a satisfactory boot time, the machine ends up wasting a lot of CPU time “waiting.” After the desktop is booted and required applications are running, there are only occasional requests for disk I/O. This can be considered steady-state operation. The same principle behind virtualization applies here: why waste a physical resource by leaving it idle or unused?
As server virtualization technology has evolved, the CPU time that was once spent waiting for disk I/O can now be used by another virtual machine hosted on the same physical hardware. The beauty of this is that the end-user’s boot-time experience is no different, except now multiple virtual machines share the host, maximizing utilization of the available hardware resources by virtualizing RAM, storage, and networking on an ESXi host. Sharing is wonderful and great, up until the point that everyone needs their fair share of CPU/RAM/disk resources, all at the same time.
This shouldn’t be news to anyone, as this is the foundation upon which the virtualization craze has so successfully been built: maximizing resource utilization for efficiency. So all you need to do now is bring up an ESXi server that has 16 CPU cores, 128GB of memory, and 10 Gigabit Ethernet, and you can host fifty or one hundred virtual machines depending on their requirements. With 100 virtual machines running on an ESXi server, you’re going to need some storage for all these guys.
Shared SAN or NAS storage is the way to go here, so you can leverage the cool features and tools that management platforms like VMware vCenter offer to ease the pain of managing thousands of virtual machines. Thin provisioning and sparse datastores allow you to over-provision your storage (another big win), so you end up buying a 20TB NAS with 20 spindles of 7200RPM SATA drives. At 250 IOPS per spindle, that’ll give you a solid 5,000 IOPS to handle all of your I/O. Should be fine, right?
If you’re running 100 VMs against that, sure, you get 50 IOPS per VM, which should be more than enough for shared steady-state operation. When you grow to 400 VMs, you’re down to 12.5 IOPS for each VM, which some may still consider acceptable for steady-state operation. But when one of those ESXi servers hosting 100 VMs needs to be rebooted, each of its guest VMs is going to be hungry for its own 250 IOPS to achieve boot times in the 2-minute range.
Let’s do the napkin math: 100 VMs × 250 IOPS/VM = 25,000 IOPS to achieve the status-quo boot time of 2 minutes. With a storage array that can handle 5,000 IOPS, you’re looking at boot times that are 5x longer, i.e. 10 minutes. The other 300 users on other ESXi servers sharing the same storage can’t get their fair share of I/O through either. So everyone ends up suffering.
Welcome to a boot storm. Those VMs that must reboot end up waiting 5 times longer, and those steady-state VMs that share the storage are going to be pretty much dead in the water for the next 10 minutes. Time? Aggravation? You bet!
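For anyone who wants to rerun the napkin math with their own numbers, here is a minimal Python sketch. It assumes the same simplification the article uses: each booting VM wants a fixed IOPS budget, and boot time stretches in proportion to how oversubscribed the array is. The function names are hypothetical, and the model deliberately ignores cache hits and queuing effects.

```python
def per_vm_iops(array_iops: float, vm_count: int) -> float:
    """Even split of the array's IOPS across all VMs (steady-state view)."""
    return array_iops / vm_count


def boot_storm_slowdown(booting_vms: int, iops_per_boot: float,
                        array_iops: float) -> float:
    """Factor by which boot time stretches when demand exceeds supply.

    Assumes boot time scales linearly with oversubscription and never
    drops below the uncontended baseline (factor 1.0).
    """
    demand = booting_vms * iops_per_boot
    return max(1.0, demand / array_iops)


# Figures taken from the article: 20 spindles at 250 IOPS each.
ARRAY_IOPS = 20 * 250
BASELINE_BOOT_MINUTES = 2.0

print(per_vm_iops(ARRAY_IOPS, 100))   # steady state with 100 VMs
print(per_vm_iops(ARRAY_IOPS, 400))   # steady state with 400 VMs

slowdown = boot_storm_slowdown(100, 250, ARRAY_IOPS)
print(slowdown, BASELINE_BOOT_MINUTES * slowdown)  # factor, minutes
```

Plugging in the article's figures gives 50 and 12.5 IOPS per VM at steady state, and a 5x slowdown (10 minutes) when 100 VMs boot at once.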
The Test
Below is a chart showing the results of up to 48 virtual machines booting up simultaneously. The test environment was pretty straightforward: measure the amount of time that Windows 7 virtual machines, each with 2GB of memory and 16GB of virtual disk space, take to reach a point where they’re ready to launch an application. We gathered numbers as progressively greater numbers of virtual machines were booting up in parallel.
We tested against two different NetApp filers: an entry-level NetApp filer (FAS2050) with 12 spindles of SATA 7200rpm disk and a mid-range NetApp filer (FAS3240) with 12 spindles of SATA 7200rpm disk. The entry-level NetApp is typically the filer that a customer would choose when starting out small. However, as one’s virtualized environment starts to grow, this platform will often prove to be inadequate. The FAS2000 series is not upgradeable with a FlashCache card, so the next step up would be to go with a bigger filer that has more CPU and RAM, like the FAS3240. Lastly, we tested the option of adding an Avere FXT 2550 to an entry-level NAS environment. The numbers speak for themselves:

The Results
A single virtual machine took about 38 seconds to boot across all storage platforms that were tested. The NetApp FAS2050 started to show signs of degrading boot times once 16 or more VMs were booting up simultaneously. At this load point of 16 VMs, boot times on the FAS2050 had increased by about 113% over the 38-second baseline. With 48 virtual machines booting simultaneously, boot times grew to 246 seconds, or about 550% higher than the baseline.
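The percentages above are increases relative to the 38-second single-VM baseline. As a quick sanity check, this short Python sketch converts between boot times and percent increases. The helper names are hypothetical, and the 16-VM boot time it prints is derived from the stated 113% figure, not a separately reported measurement.

```python
BASELINE_SECONDS = 38.0  # single-VM boot time reported in the results


def percent_increase(baseline_s: float, observed_s: float) -> float:
    """Percent by which observed_s exceeds baseline_s."""
    return (observed_s - baseline_s) / baseline_s * 100.0


def time_from_increase(baseline_s: float, pct: float) -> float:
    """Boot time implied by a given percent increase over the baseline."""
    return baseline_s * (1.0 + pct / 100.0)


# 48 VMs on the FAS2050: 246 seconds, which the article rounds to ~550%.
print(round(percent_increase(BASELINE_SECONDS, 246.0)))      # 547

# 16 VMs on the FAS2050: ~113% over baseline (derived, not measured).
print(round(time_from_increase(BASELINE_SECONDS, 113.0), 1))  # 80.9
```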
With no option to add a NetApp FlashCache to the FAS2000 series, the next step is a forklift upgrade to a FAS3240, which is a mid-range platform that has more room for expansion. However, upgrading the filer head alone will not yield a noticeable improvement: booting 16 VMs on the FAS3240 still took about 107% more time than the baseline single-VM boot, compared with about 113% more on the FAS2050. So now you’ve spent a good chunk of change on a brand-new filer that hasn’t really improved things significantly from the boot-storm mitigation perspective.
Instead, take that forklift-upgrade budget and roll in an Avere FXT 2550 in front of the FAS2050, and solve the boot storm problem. With an FXT 2550 in place for the 48 VM boot storm, testing revealed that boot times stay below 70 seconds. This is almost four times faster than the boot time for 48 VMs using the FAS3240 filer. Additional testing has shown that the Avere FXT 2550 is capable of handling 145 VMs booting simultaneously before reaching a saturation point where boot times exceed 90 seconds. Even then, once you’ve saturated a single FXT node, you can simply add more nodes to the FXT cluster and scale out your boot storm coverage with FXT nodes as incremental building blocks. The Avere architecture assures that the performance for this workload can scale out as you add more FXT nodes to the cluster, while maintaining the simplicity of managing a single filer.
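For rough planning, the single-node saturation point can be turned into a node-count estimate. The Python sketch below assumes boot-storm capacity scales linearly with the number of FXT nodes; the article describes scale-out only qualitatively, so treat the linear model and the hypothetical `fxt_nodes_needed` helper as an illustration and not a sizing guarantee.

```python
import math

# Simultaneous boots one FXT 2550 handled before boot times passed 90 s,
# per the testing described above.
VMS_PER_NODE = 145


def fxt_nodes_needed(simultaneous_boots: int,
                     vms_per_node: int = VMS_PER_NODE) -> int:
    """Estimate nodes for a boot storm, ASSUMING linear scale-out."""
    if simultaneous_boots <= 0:
        return 0
    return math.ceil(simultaneous_boots / vms_per_node)


for vms in (48, 145, 146, 400):
    print(vms, fxt_nodes_needed(vms))
```

Under that assumption, the 400-VM environment from the earlier example would need three nodes to cover the worst case of every desktop booting at once.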
