Avere Brings Scalable NAS to StorNext (a.k.a. XSAN) Infrastructures

It’s been several weeks since NAB 2012 and I’ve had time to collect my thoughts. One of the big takeaways from the show for me is how Avere can help users of the StorNext SAN file system from Quantum. (Note: Apple resells StorNext under the name XSAN so my comments apply equally to this product as well.)

StorNext specializes in large sequential reads and writes and works well for media applications such as video editing and play-out. Rendering, on the other hand, requires very high IO/sec rates to many small files and is not a good fit for StorNext. That’s where Avere comes in. Many of the big rendering shops (e.g. Sony Imageworks, Digital Domain) and smaller ones (e.g. Whiskytree) are using Avere to scale NFS performance for their render jobs.

What makes the combined Avere and StorNext solution so compelling is that you get the best of NAS and SAN combined in a solution where there is only one copy of the data.

StorNext is typically run in a SAN environment, including SAN storage, FC switches, and workstations with FC HBAs. StorNext supports a “NAS gateway” which provides NAS access to the SAN. Typically this is implemented as a Linux server that has both an FC HBA and the StorNext software client to access the SAN along with an NFS stack and a GigE NIC for NAS access.

Whiskytree was using this config and running their renders directly on the Linux NFS gateway. However, once they got to 40 render nodes, they maxed out the performance of the Linux NFS gateway. Whiskytree purchased a two-node FXT cluster from Avere to accelerate the NFS performance and now they are up to 150 nodes in their render farm. See the below picture and the case study on our website for more details.

At NAB, I was telling the Whiskytree story to the CIO of a large animation studio and he got a big smile on his face. He explained to me that today his environment has large NAS and SAN infrastructures that are completely disconnected. This is a big source of inefficiency for him because there are windows of time during the day that must be reserved to copy the data between the NAS-only and SAN-only sides, not to mention all the duplicated disks to store the same data on both sides.

So, this is a shout-out to all users of StorNext and XSAN. If you are looking to add scalable NAS to your environment and extend the life of your existing SAN, take a look at Avere.

Introducing FlashMove™ and FlashMirror™

Today Avere introduced FlashMove™ and FlashMirror™ software, enabling better data management and protection in heterogeneous NAS environments. Prior to today, the best software I’ve been involved with in my professional career has been SpinMove from Spinnaker Networks and SnapMirror from NetApp. I’m excited since FlashMove and FlashMirror provide the functionality of these great products and more.

Many readers won’t remember SpinMove since it was only on the market for about a year before NetApp bought Spinnaker. SpinMove enabled admins to non-disruptively move volumes between storage nodes in a Spinnaker cluster. Of all the innovation that Spinnaker brought to market, SpinMove generated the most customer interest. Many customers bought Spinnaker storage for this feature alone.

Avere’s FlashMove is a lot like SpinMove in that it allows data to be non-disruptively moved between NAS systems, but it has two significant advantages. First, FlashMove is integrated with Avere’s native tiering, which means you get transparent data migrations *and* performance acceleration in the same solution. Second, FlashMove works with heterogeneous NAS solutions so there is no lock-in to a single brand of storage.

Check out the FlashMove data sheet for more info.

NetApp’s SnapMirror, on the other hand, has become the gold standard for asynchronous mirroring to provide a disaster recovery (DR) solution. NetApp has a lot of great products and this is probably their best. Unfortunately, not everyone gets to use SnapMirror since it only works on NetApp storage.

Avere’s FlashMirror has several significant advantages over SnapMirror. First, FlashMirror is more efficient and keeps replicated data more closely in sync by sending updates directly and in parallel to both the primary and secondary NAS filers. Second, FlashMirror offloads the replication-processing load from the storage and supports clustering to scale replication performance to any level required. Third, FlashMirror is simple to install in existing environments and works with heterogeneous NAS solutions.

Check out the FlashMirror data sheet for more info.

FXT Series Boot Storm Test Results

A VDI boot storm will eventually end up costing you time, money, or aggravation. Boot storms are more prevalent in this brave new world of desktop virtualization than one would like to think. With the compounding effect of running tens to hundreds of virtual machine desktops on a single piece of bare-metal hardware that has been virtualized, your storage is going to stay real busy keeping all of those virtual machines happy.

The Problem

First, a little background on why boot storms even exist. When sizing storage for both performance and capacity to serve in a VDI (Virtual Desktop Infrastructure) environment, the steady-state of  virtual machine operation dictates the sizing requirements. What happens when all of your users come in on a Monday morning and power on their VDI terminals?

It certainly won’t look like “steady-state” operation to your storage. Instead, you potentially end up with hundreds or thousands of virtual desktops running on a farm of Virtual Machine Hypervisor servers (e.g. VMware ESXi) all trying to read their operating systems from their Virtual Machine Disk (.vmdk) files all at once. Most modest NAS storage devices will leverage their RAM cache as much as possible and then rely on the number of storage spindles to serve the cache-miss IOPS. With enough cache misses, the disks will start to run incredibly hot, leading to queuing, which leads to longer boot times, which leads to user complaints that their virtualized desktops take forever to boot.

One way to think about this problem is to consider the pre-virtualization landscape: Every user had their own desktop hardware with dedicated CPU, RAM, and storage. A desktop computer would generally take 2-5 minutes to boot, depending on the operating system environment and machine capabilities. In this scenario, most of the desktop boot time was spent by the CPU waiting for I/O from the hard-disk to provide the required data to load the kernel, device drivers, graphical user interface environment, etc. With a single SATA hard-disk capable of cranking out 250 IOPS while still delivering a satisfactory boot time, the machine ends up wasting a lot of CPU time “waiting.” After the desktop is booted and required applications are running, there are only occasional requests for disk I/O. This can be considered steady-state operation. The same principle behind virtualization applies here: why waste a physical resource by leaving it idle or unused?

As server virtualization technology has evolved, the CPU time that was spent waiting for disk I/O can now be used by another virtual machine hosted on the same physical hardware. The beauty of this is that the end-user’s boot-time experience is no different, except now multiple virtual machines share the host, maximizing utilization of the available hardware by virtualizing RAM, storage and networking on an ESXi host. Sharing is wonderful and great, up until the point that everyone needs their fair share of CPU/RAM/Disk resources, all at the same time.

This shouldn’t be news to anyone, as this is the foundation upon which the virtualization craze has so successfully been built: maximizing resource utilization for efficiency. So all you need to do now is bring up an ESXi server that has 16 CPU cores, 128GB of memory, and 10 Gigabit Ethernet, and you can host fifty or one hundred virtual machines depending on their requirements. With 100 virtual machines running on an ESXi server, you’re going to need some storage for all these guys.

Shared SAN or NAS storage is the way to go here, to leverage the cool features and tools that management platforms like VMware vCenter offer to ease the pain of managing thousands of virtual machines. Thin-provisioning and sparse Datastores allow you to over-provision your storage (another big win), so you end up buying a 20TB NAS with 20 spindles of 7200RPM SATA drives. That’ll give you a solid 5,000 IOPS to handle all of your I/O. Should be fine, right?

If you’re running 100 VMs against that, sure, you get 50 IOPS per VM which should be more than enough for shared steady-state. When you grow to 400 VMs, now you’re contending with 12.5 IOPS for each VM, which some may still consider acceptable for steady-state operation. When one of those ESXi servers hosting 100 VMs needs to be rebooted, all the guest VMs are going to be hungry for their own share of 250 IOPS to achieve boot times in the 2-minute range.

Let’s do the napkin math: 100 VMs X 250 IOPS/VM = 25,000 IOPS to achieve the status-quo boot time of 2 minutes. With a storage array that can handle 5,000 IOPS, you’re looking at boot times that are 5x longer, i.e. 10 minutes. The other 300 users that are on other ESXi servers sharing the same storage are now unable to get their ops through fairly either. So everyone ends up suffering.
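The napkin math above can be reproduced in a few lines. This is only a rough sketch: the function names are made up for illustration, and it assumes that boot time stretches linearly with the ratio of IOPS demanded to IOPS available, which is the same simplification the estimate above makes.

```python
def iops_per_vm(array_iops, vm_count):
    """Even share of the array's IOPS across all VMs (steady-state view)."""
    return array_iops / vm_count


def boot_storm_estimate(booting_vms, iops_per_boot, array_iops, baseline_minutes):
    """Return (demanded IOPS, slowdown factor, estimated boot minutes).

    Assumption: boot time scales linearly with demand / supply and never
    drops below the baseline boot time.
    """
    demand = booting_vms * iops_per_boot
    slowdown = max(1.0, demand / array_iops)
    return demand, slowdown, baseline_minutes * slowdown


# 20 SATA spindles at roughly 250 IOPS each, as in the example above.
ARRAY_IOPS = 20 * 250

print(iops_per_vm(ARRAY_IOPS, 100))                  # 50.0
print(iops_per_vm(ARRAY_IOPS, 400))                  # 12.5
print(boot_storm_estimate(100, 250, ARRAY_IOPS, 2))  # (25000, 5.0, 10.0)
```

A real array will not degrade this cleanly, since caching and queuing both change the picture, but the model is good enough to see why 100 simultaneous boots swamp a 5,000 IOPS filer.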

Welcome to a boot storm. Those VMs that must reboot end up waiting 5 times longer, and those steady-state VMs that share the storage are going to be pretty much dead in the water for the next 10 minutes. Time? Aggravation? You bet!

The Test

Below is a chart showing the results of up to 48 virtual machines booting up simultaneously. The test environment was pretty straightforward: measure the amount of time that Windows 7 virtual machines, each with 2GB of memory and 16GB of virtual disk space, take to reach a point where they’re ready to launch an application. We gathered numbers as progressively greater numbers of virtual machines were booting up in parallel.

We tested against two different NetApp filers: an entry-level NetApp filer (FAS2050) with 12 spindles of SATA 7200rpm disk and a mid-range NetApp filer (FAS3240) with 12 spindles of SATA 7200rpm disk. The entry-level NetApp is typically the filer that a customer would choose when starting out small. However, as one’s virtualized environment starts to grow, this platform will often prove to be inadequate. The FAS2000 series is not upgradeable with a FlashCache card, so the next step up would be to go with a bigger filer that has more CPU and RAM like the FAS3240. Lastly, the option to add an Avere FXT2550 to an entry-level NAS environment was tested. The numbers speak for themselves:

The Results

A single virtual machine took about 38 seconds to boot across all storage platforms that were tested. The NetApp FAS2050 started to show signs of degrading boot times once 16 or more VMs were booting up simultaneously. At this load point of 16 VMs, boot times had increased by about 113% over the 38-second baseline on the FAS2050. Once 48 virtual machines were booting simultaneously, boot times grew to 246 seconds, or about 550% higher.
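For readers who want to check the percentages, here is a small sketch of the arithmetic. The helper names are invented for this example; the only inputs are the 38-second baseline, the 246-second result, and the 113% figure quoted above.

```python
def percent_increase(baseline, measured):
    """How much larger `measured` is than `baseline`, as a percentage."""
    return (measured - baseline) / baseline * 100.0


def apply_increase(baseline, percent):
    """The value you get after growing `baseline` by `percent` percent."""
    return baseline * (1 + percent / 100.0)


BASELINE_SECONDS = 38

# 48 simultaneous boots on the FAS2050 took 246 seconds.
print(round(percent_increase(BASELINE_SECONDS, 246)))  # 547

# A 113% increase over the baseline works out to roughly this many seconds.
print(round(apply_increase(BASELINE_SECONDS, 113)))    # 81
```

So "about 550% higher" is 547% before rounding, and the 16-VM boot time on the FAS2050 works out to roughly 81 seconds.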

With no option to add a NetApp FlashCache to the FAS2000 series, the next step is a forklift upgrade to a FAS3240, which is a mid-range platform that has more room for expansion. However, upgrading the Filer head alone will not yield a noticeable improvement: booting 16 VMs on the FAS3240 still took about 107% more time than the baseline single-VM boot. So now you’ve spent a good chunk of change on a brand-new filer that hasn’t really improved things significantly from the boot-storm mitigation perspective.

Instead, take that forklift-upgrade budget, roll in an Avere FXT 2550 in front of the FAS2050, and solve the boot storm problem. With an FXT 2550 in place for the 48 VM boot storm, testing revealed that boot times stay below 70 seconds. This is almost four times lower than the boot time for 48 VMs using the FAS3240 Filer. Additional testing has shown that the Avere FXT 2550 is capable of handling 145 VMs simultaneously booting before reaching a saturation point where boot times grew larger than 90 seconds. Even then, once you’ve saturated a single FXT node, you can simply add more nodes to the FXT cluster and scale out your boot storm coverage with FXT nodes as incremental building blocks. The Avere architecture assures that the performance for this workload can scale out as you add more FXT nodes to the cluster, while maintaining the simplicity of managing a single filer.

Avere’s World Record NFS Performance in the Industry’s Smallest Footprint

I’m blogging from the floor of the SC11 Exhibit Hall where we just broke out the free Red Bull performance drinks and unveiled a new booth backdrop to celebrate our new World Record. Stop by booth 442 if you are at the show.

Avere Systems shocked the storage world today and took the top NFS performance spot on the SPECsfs2008 benchmark, taking down the big dogs, NetApp and EMC/Isilon, in the process. Avere posted throughput of 1,564,404 ops/sec, which is the highest ever posted in the long history of the NFS benchmark. In addition, this throughput was achieved with an ORT (overall response time or latency) of just 0.99 msec, which is 35% better than NetApp’s best and 61% better than EMC’s best.
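To put the latency percentages in perspective, one can back out the response times they imply. This is a sketch under a stated assumption: that "X% better" means Avere's ORT is X% lower than the competitor's. The results are therefore derived values, not the published figures, which should be checked on the SPEC website. The function name is made up for illustration.

```python
def implied_competitor_ort(our_ort_ms, percent_better):
    """Back out the competitor ORT, assuming ours is `percent_better` % lower."""
    return our_ort_ms / (1 - percent_better / 100.0)


AVERE_ORT_MS = 0.99

# Assumption: "35% better" and "61% better" both mean "that much lower".
print(round(implied_competitor_ort(AVERE_ORT_MS, 35), 2))  # 1.52
print(round(implied_competitor_ort(AVERE_ORT_MS, 61), 2))  # 2.54
```

Under that reading, the competing results sit at roughly 1.5 msec and 2.5 msec.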

For more details on the performance tests by the three vendors, here are links to the posted SPECsfs2008 results from Avere, NetApp, and EMC/Isilon.

The performance that Avere demonstrated is impressive but it is only half the story. Even more impressive is the efficiency with which Avere delivered the results.

Avere delivered higher performance and lower latency with a system that costs dramatically less, both in terms of the capital expenses to purchase the system and operating expenses for space, power, cooling, etc.

I will take you through the numbers in a second but first let’s take a look at pictures that compare the storage systems that were tested by Avere, NetApp, and EMC/Isilon.

As you can see, Avere packs the highest performance and lowest latency into a package that is 79% smaller than NetApp and 65% smaller than EMC/Isilon.

Overall size is an approximate measure of the capital and operating expenses. Let’s dig deeper into the actual numbers. In the table below I have included the pertinent comparison data. As the table shows, Avere costs 51-77% less, requires 56-78% fewer disks, and occupies 64-76% fewer rack units.

SPECsfs2008 does not measure power or cooling requirements. In a storage system, the disk drives are the largest consumers of power and dissipaters of heat. Therefore, a good estimate for the power and cooling savings is the disk savings, where Avere is 56-78% less.

Prior to Avere’s SPECsfs2008 posting, NetApp and EMC/Isilon waged a war of words contrasting their SPECsfs2008 results in the body and comments of this blog. It’s a highly recommended read. Make sure to read the comments. With the Avere results now out, the NetApp and EMC/Isilon battle is over 2nd and 3rd place, with Avere taking 1st in all the major categories.

Hope to see you at SC11.

SPEC® and the benchmark name SPECsfs®2008 are registered trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of Nov 15, 2011. Above we compare all SPECsfs2008_nfs.v3 results that achieved 1,000,000 ops/sec throughput or higher. For the latest SPECsfs2008 benchmark results, visit http://www.spec.org/sfs2008.

NAS Optimization for the Cloud

There’s lots of buzz in the storage industry about the cloud. To date, however, the cloud has been impractical for most primary applications because the high-latency WAN connection between the cloud providers and the cloud clients has resulted in poor performance. That’s where Avere comes in…

Avere’s NAS Optimization enables using the cloud for primary applications. The Avere FXT Series uses intelligent tiering to automatically store active data near the client, eliminating the latency of the WAN. Customers are using our cloud solution in four data access scenarios:

1) Remote Office to Datacenter: Data storage is consolidated in a centralized datacenter and Avere is used at the edge to provide low-latency access to remote users.

2) Datacenter to Datacenter: Avere enables compute resources to be shared across multiple datacenters by automatically placing the data actively being processed near the compute nodes and eliminating the WAN latency between the datacenters.

3) Enterprise to Compute Cloud: Enterprises are deploying lower cost compute infrastructures, both in private cloud and public cloud models, by co-locating Avere clusters with the compute nodes to automatically tier and store the active data.

4) Enterprise to Storage Cloud: Avere enables storage clouds for primary applications by placing Avere clusters near the clients, whether in a datacenter, a remote office, or a cloud compute facility, to automatically tier and store the data that the client is actively using.

Check out our cloud solution brief for more.

Enterprise-wide NAS Cloud

NAS Optimization for VMware

VMware is everywhere and I’m very excited about the traction we’re seeing here. Across software build, database, virtual desktop, and other guest applications, customers are finding great value in placing an intelligent read/write caching tier in front of their existing NAS systems. Here are some highlights I’ve heard from customers.

• VMware is a write-heavy app and Avere’s write caching provides a huge performance boost.
• Avere block-level caching efficiently uses SSD/SAS tiers of FXT clusters, especially when using VMware Linked Clones.
• Avere’s NAS Optimization is faster than SAN and simpler to manage and scale.
• The Avere user interface provides great visibility into VMware operations, including ESX hosts, VMs, and VMDKs.
• Storage VMotion makes adding FXT clusters in front of existing NAS systems simple and non-disruptive.
• Storing VMDKs on inexpensive & high-density SATA and accelerating with Avere is much cheaper than storing all VMDKs on SAS/FC.

Check out our VMware solution brief for more.

Avere is VMware Ready certified.

Global Namespace and the Path to NAS 2.0

In the little over a year that we have been shipping our FXT Scale-out NAS Appliance, we have received very positive feedback on our product and its ability to scale NAS performance.  Performance scaling is the result of both our Tiered File System (TFS), which dynamically allocates frequently used blocks to faster storage media, and our clustering technology which clusters up to 25 appliances together to linearly scale performance.

Our primary objective has been to increase NAS efficiency by off-loading filer operations and facilitating the use of high density, low power media for mass bulk storage.  It is common for our customers to perform the same or more processing as traditional NAS with 1/5th of the data center resources (rack space, power & cooling).

Our customers typically characterize our product as the “user or client facing side of NAS” and the traditional filer as the “archive or data management part”.  The most frequent request from our customers has been “Now that you implement the client facing side of NAS, can you do something to help with our NAS clutter?”

NAS Clutter

Storage administrators traditionally have scaled their environment first by adding expensive, power-hungry and low-density performance disks behind a single filer until that filer becomes overloaded, and second by adding more filers to their environment. Over time, this results in NAS clutter – a multi-filer environment in which the user or client machines must be aware of (and re-configured with) any changes in the storage environment.
Image: http://averesystems.files.wordpress.com/2011/03/nas1-0clutter.png?w=400&h=290&h=200

NAS clutter is the result of the current NAS 1.0 architecture that was not built to scale performance or handle the challenges of geographically distributed users.  The NAS filer is the single bottleneck in the NAS environment – all operations from all users must transit the filer, much like single CPU processors were the bottleneck in computers until the advent of multi-core architectures.  The NAS 1.0 architecture worked well a decade ago but has severe limitations today.

Image: http://averesystems.files.wordpress.com/2011/03/nas1-0architecture2.png?w=400&h=290&h=200

Towards NAS 2.0

In our product announcement last week, we introduced global namespace, or GNS, functionality to the FXT product line.  Using GNS our customers can now create a single logical namespace in the FXT cluster, which is visible to all clients that mount any FXT Appliance.  The storage administrator can then configure any export on any filer in their data center to be a sub-tree within that namespace.  A single common view is presented to all users or clients, effectively eliminating NAS clutter.
Image: http://averesystems.files.wordpress.com/2011/03/nas2-0.png?w=400&h=290&h=200
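To make the idea of a global namespace concrete, here is a toy sketch of the mapping it implies: one logical tree whose sub-trees point at exports on different filers. Everything in it is hypothetical (the class, the method names, the filer names and the export paths) and it says nothing about how the FXT implements GNS internally. It only illustrates the client-visible behavior described above.

```python
class GlobalNamespace:
    """Toy model: maps sub-trees of one logical namespace to filer exports."""

    def __init__(self):
        # namespace path -> (filer name, export path on that filer)
        self._junctions = {}

    def add_junction(self, namespace_path, filer, export):
        """Attach a filer export as a sub-tree of the logical namespace."""
        key = namespace_path.rstrip("/") or "/"
        self._junctions[key] = (filer, export)

    def resolve(self, client_path):
        """Return (filer, backend path) using the longest matching junction."""
        best = None
        for prefix in self._junctions:
            covers = (client_path == prefix
                      or client_path.startswith(prefix.rstrip("/") + "/"))
            if covers and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise KeyError(f"no junction covers {client_path}")
        filer, export = self._junctions[best]
        remainder = client_path[len(best):].lstrip("/")
        if remainder:
            return filer, export.rstrip("/") + "/" + remainder
        return filer, export


# Hypothetical filers and exports, for illustration only.
gns = GlobalNamespace()
gns.add_junction("/projects", "filer-a", "/vol/projects")
gns.add_junction("/projects/renders", "filer-b", "/vol/renders")

print(gns.resolve("/projects/scripts/build.sh"))
# ('filer-a', '/vol/projects/scripts/build.sh')
print(gns.resolve("/projects/renders/shot42/frame0001.exr"))
# ('filer-b', '/vol/renders/shot42/frame0001.exr')
```

The client only ever sees paths under the single logical root. Moving the renders to a different filer means changing one junction rather than reconfiguring every client, which is the clutter-removal argument in a nutshell.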

Global namespace and the virtualization of storage resources is an important building block for scaling out the NAS architecture.  When you combine global namespace with Avere’s dynamic media tiering and scale-out clustering you have the genesis of NAS 2.0:

  • Global namespace removes NAS clutter from the user view – separating the client facing NAS services from datacenter administration.
  • Dynamic media tiering and scale-out clustering hide mass storage and WAN latency, facilitating the use of high density (low cost, low power) media and remote Cloud storage.

NAS 2.0 and Cloud Storage

NAS 2.0 provides the right combination of global namespace and performance scaling to finally make cloud storage a reality for enterprise applications.  Current cloud storage deployments are typically relegated to backup and data protection applications, due to the high latency to transit the WAN.  Because of that latency, enterprise application performance would suffer and live users would see unacceptably high latency to their data.

Avere’s performance scaling permits enterprise applications and end users to access remote storage with no degradation in performance over local storage.  The deployment model places an Avere FXT cluster near enterprise applications or end users.  Storage can be located anywhere.  The added benefit of GNS in this model is that storage components can be located at several locations with a single access point for all users at all locations – creating a single view of storage for distributed enterprises.  GNS effectively hides the additional clutter of multiple locations for these distributed enterprises.

In summary, GNS is a fundamental component of a NAS 2.0 architecture, whether it’s within the data center, in the cloud or a hybrid of both. In my next post, I’ll explore more fully how NAS 2.0 enables cloud access.


Storage Tiering is Tops in Storage Magazine’s 2011 Hot Technologies

With less than two weeks left to go until 2011, it’s time for publications to roll out their predictions for the coming year. Storage Magazine has put together a list of what its editors and experts predict will be the hottest technologies in storage in the coming year. Coming in at #1, in its debut on the annual hot technology list, is automated storage tiering, ahead of cloud storage services, primary storage de-dupe and others. Why is tiering so hot? It’s all about the Flash and intelligent use of it:

“It was very difficult to be able to afford enough SSD if you were purely going to use it as a static storage device,” noted Mark Peters, a senior analyst at Milford, Mass.-based Enterprise Strategy Group (ESG). “Now that people will be able to combine tiering with a smaller amount of SSD, I think the two go hand in glove.”

The article goes on to give out sage advice on how to choose a tiering solution and separate hype from reality:

Product differentiators include the level of granularity at which the data moves between tiers, the degree of automation and the extent to which users can define policies.

Wait a minute – those criteria seem very, very familiar to us. The list is just missing the part about the ability to work in a heterogeneous environment.

At any rate, we’re glad that Storage Magazine recognizes how important dynamic storage tiering is in building out high performance, cost effective storage networks – we couldn’t agree more. Here’s to 2011!

SPECsfs2008 – A Year in Review

11Nov2010 Update: NetApp submitted SPEC’08 results this week on their new FAS6240 and achieved more than 120k ops/sec. They used 288 15k SAS drives to achieve this result so it is just more of the same, throwing disks at performance. They did use 1TB of their Flash Cache but this is read-only cache so they still need a lot of drives for the writes. I updated my table below.

It’s been a little more than a year since my first blog on Avere’s SPEC results and it’s a good time to take a look back to see what’s happened in the past twelve months. In my original blog I used the SPEC results to examine the top-performing NAS systems and compare the number of hard disk drives each requires to deliver the performance.

Why compare the number of hard disks? There are two reasons.

First, the number of disks is a good measure of the cost of a storage system, both the capex acquisition cost and the ongoing operational cost for power, cooling, and rack space. SPEC does not require posting the price of the NAS system under test and the number of disks used is the best way to approximate the price.

Second, there has been a lot of industry buzz around storage tiering, that is, placing data on the best storage media to optimize performance and cost. Comparing the number of disk drives used is a great way to see how each vendor is progressing in this area.

Back on October 12, 2009, the date of my original blog, there were four solutions that achieved roughly 120k ops/sec or better: Avere, Exanet, Huawei Symantec, and NetApp. Since then, six more solutions with results higher than 120k have been posted on the SPEC website. See the below table for the complete list. In the table, I also included the number of disks used by each vendor and calculated the ops/sec per disk used, with this last number being the best measure of performance per dollar delivered by the vendors.

Based on the above results, Avere delivers on average seven times more performance per disk used than the other vendors. Avere FXT appliances use intelligent tiering algorithms to separate performance scaling from capacity scaling and more efficiently deliver both.
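The ops/sec-per-disk metric is easy to compute for any posted result. As a minimal sketch, the snippet below applies it to the one data point spelled out in this post, the NetApp FAS6240 submission from the update at the top, using 120,000 ops/sec as a floor since the result was "more than 120k". The function name is made up for illustration.

```python
def ops_per_disk(ops_per_sec, disk_count):
    """Throughput delivered per hard disk used in the tested configuration."""
    return ops_per_sec / disk_count


# FAS6240: more than 120k ops/sec on 288 15k SAS drives (120,000 used as a floor).
print(round(ops_per_disk(120_000, 288)))  # 417
```

Running the same division for every row of the table gives the per-disk comparison discussed above.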

Ray Lucchesi, President and Founder of Silverton Consulting, analyzed the SPEC results and reached similar conclusions in his blog last week. For more information, check out his RayOnStorage blog.

SPEC® and the benchmark name SPECsfs®2008 are registered trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of Nov 1, 2010. Above we compare all SPECsfs2008_nfs.v3 results that achieved 120k ops/sec throughput or higher. For the latest SPECsfs2008 benchmark results, visit http://www.spec.org/sfs2008.

Part 4: Things to Consider Before Upgrading Your NAS

If you’re at all concerned with the scalability of your infrastructure when considering upgrades, you should know that adding new filers, more high-speed drives and/or Flash modules to a NAS installation to improve performance is a short-term solution at best. It’s only a matter of time before application demands once again outstrip the NAS infrastructure’s ability to scale performance and you’re back to ripping out old gear and replacing it with new. In contrast, with Avere’s two-stage NAS architecture, system scalability is built in. As more clients and new applications are added to the mix (requiring higher IOPS performance), an Avere FXT cluster can be easily expanded with the non-intrusive addition of new nodes. A cluster can grow to as many as 25 appliances, delivering plenty of horsepower without having to touch any other devices already in place. And because the Avere FXT cluster can serve multiple storage servers, there is no need to add Flash to each and every filer – the Avere cluster becomes an extensible fast media layer in front of all of them, serving up performance to hot spots without over-provisioning.

Manageability is another hidden cost of upgrading an existing NAS infrastructure. With falling prices and improved durability making new storage media such as Flash SSDs widely available to boost application performance, many companies are tempted to install Flash at tier zero and expect that it will solve their performance problems, albeit at a relatively high cost. But installing fast-access storage media solves only part of the problem. IT then has to figure out which applications are best served by the new tier of storage, often having to become an expert in the latest storage media read and write rates and application QoS schemes in order to optimize the utilization of the more costly storage media. In comparison, an Avere FXT cluster has the intelligence to dynamically allocate data to the appropriate storage tier and media based upon both data and access characteristics, which balances the cost/performance equation with no administrative overhead.
