Avere’s recent SPECsfs2008 posting is the second time in my career that I’ve had the privilege of tuning a high-performance storage system for a record-breaking SPEC SFS result. Prior to joining Avere Systems, I was part of the Panasas engineering team that broke the SPEC SFS97 world record back in 2003. For many of the folks at Avere, this is their third record-breaking SPEC SFS run: they built the Spinnaker SpinServers (now NetApp ONTAP GX), which set a record in 2002, and they set another in 2006 with NetApp ONTAP GX. I’ve been building high-performance storage systems for 12 years now, and for 8 of those years SPEC SFS has been a major performance consideration for those systems.
SPEC SFS is an important benchmark because it:
1. Provides a standard basis for comparison of systems from different vendors.
2. Models a workload derived from real customer workloads and exercises all important NFS operation types.
3. Includes a forum for disclosing results in a consistent fashion that is followed by the entire NAS industry.
So what was different this time around?
The obvious difference is the benchmark itself. SPEC SFS97 and SPECsfs2008 differ in many ways, but the most striking difference from an engineering perspective is the increase in the data set size and the working set size. SFS97 created only 10 MB of data per requested SFS op/sec. SPECsfs2008 creates 120 MB of data per op/sec, so the data set has grown by a factor of 12. To make matters worse, the percentage of data accessed has increased from 10% in SFS97 to 30% in SPECsfs2008, a further factor of 3. Together these result in a whopping 36x increase in the amount of data accessed over the course of an SFS run. What this means is that the benchmark is no longer about caching the entire working set. Instead, you have to efficiently schedule disk IO operations and cache the most important parts of the working set.
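The arithmetic behind that 36x figure can be checked with a short Python sketch. It uses only the per-op/sec numbers quoted above; the dictionary layout and the function names are my own illustrative choices and are not part of the SPEC tooling.

```python
# Figures quoted in the post: MB of data created per requested op/sec,
# and the percentage of that data the benchmark actually accesses.
BENCHMARKS = {
    "SFS97": {"data_mb_per_op": 10, "accessed_pct": 10},
    "SPECsfs2008": {"data_mb_per_op": 120, "accessed_pct": 30},
}


def working_set_mb_per_op(name):
    """MB of data accessed per requested op/sec (hypothetical helper)."""
    b = BENCHMARKS[name]
    return b["data_mb_per_op"] * b["accessed_pct"] / 100


def growth_factors():
    """Return (data set growth, access-fraction growth, working set growth)."""
    old, new = BENCHMARKS["SFS97"], BENCHMARKS["SPECsfs2008"]
    data_set_growth = new["data_mb_per_op"] / old["data_mb_per_op"]
    access_growth = new["accessed_pct"] / old["accessed_pct"]
    working_set_growth = (
        working_set_mb_per_op("SPECsfs2008") / working_set_mb_per_op("SFS97")
    )
    return data_set_growth, access_growth, working_set_growth


if __name__ == "__main__":
    data_x, access_x, working_x = growth_factors()
    print(f"data set growth:    {data_x:.0f}x")
    print(f"access growth:      {access_x:.0f}x")
    print(f"working set growth: {working_x:.0f}x")
```

Per requested op/sec, the accessed data goes from 1 MB under SFS97 to 36 MB under SPECsfs2008, which is where the combined factor comes from.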
Compounding the problem, SPECsfs2008 presents a 36x larger working set but still allows only 5 minutes for “warm-up” and 5 minutes for the “run” phase (unchanged from SPEC SFS97). The FXT series has a number of algorithms that identify access patterns, but 5 minutes doesn’t give much time to recognize patterns and make adjustments. Mistakes are very costly, so our algorithms need to be accurate in order to take advantage of the very little time SPEC gives us to analyze the demand being placed on our system.
Of all the changes between SPEC SFS97 and SPECsfs2008, the change in working set size had the most impact on tuning our systems. The good news is that we were able to cope with this large working set while delivering very high performance, and the results show it. Given the challenges that SPECsfs2008 presents, the benchmark is a rite of passage for a high-performance NAS system, just as SPEC SFS97 was during the prior 10 years. The FXT series is a better performing product because we excel at SPECsfs2008, and I think you’ll find that we excel at many other workloads as a result.
SPEC® and the benchmark name SPECsfs®2008 are registered trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of Oct 14, 2009. For the latest SPECsfs2008 benchmark results, visit http://www.spec.org/sfs2008.
