| February 2009 (PDF) |
The Ultimate in Storage Performance
| | V3170/RamSan-500 | V3170/FC Drives |
| Capacity | Equal (2TB) | Equal (2TB) |
| IOPS | 50,000 | 3,600 |
| Latency | 1ms | 10ms |
| Power | 1,100W | 1,150W |
| Rack Space | 10U | 9U |
The two solutions provide similar capacity but markedly different IOPS and latency performance. In the disk solution, the number of spindles directly dictates I/O performance, as measured in IOPS.
The second example compares the V-Series/RamSan with a V-Series/disk configuration that is capable of sustaining the same number of IOPS.
Table 2) Comparison of V3170/RamSan and V3170/15K RPM FC disks (equivalent throughput in IOPS).
| | V3170/RamSan-500 | V3170/FC Drives |
| IOPS | 50,000 | 50,000 |
| Capacity | 2TB | 27TB |
| Latency | 1ms | 10ms |
| Power | 1,100W | 5,900W |
| Rack Space | 10U | 48U |
In the first comparison, a disk configuration with similar capacity delivered only about 7% of the throughput (3,600 versus 50,000 IOPS), while latency was an order of magnitude higher. Table 2 illustrates that it takes 27TB of disk capacity (14 disk shelves) to deliver the same number of IOPS, but the large number of disks does nothing to improve latency because of the mechanical limitations inherent in disk drives. This is a good representation of the situation with many high-performance database applications. A large number of spindles (with correspondingly high space, energy, and cooling needs) must be deployed to provide the necessary I/O throughput, even when the capacity is not necessary.
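The ratios behind these two comparisons can be checked with a few lines of arithmetic. The sketch below uses only the figures from the two tables; the dictionary and function names are illustrative choices, not part of any NetApp or TMS tool.

```python
# Figures copied from the two comparison tables in this article.
RAMSAN = {"iops": 50_000, "capacity_tb": 2, "power_w": 1_100, "rack_u": 10}
DISK_EQUAL_CAPACITY = {"iops": 3_600, "capacity_tb": 2, "power_w": 1_150, "rack_u": 9}
DISK_EQUAL_IOPS = {"iops": 50_000, "capacity_tb": 27, "power_w": 5_900, "rack_u": 48}


def ratio(a, b, key):
    """Return a[key] / b[key] for two configuration dictionaries."""
    return a[key] / b[key]


# First comparison: same capacity, how much throughput does disk deliver?
throughput_fraction = ratio(DISK_EQUAL_CAPACITY, RAMSAN, "iops")

# Second comparison: same IOPS, how much more hardware does disk need?
capacity_multiple = ratio(DISK_EQUAL_IOPS, RAMSAN, "capacity_tb")
power_multiple = ratio(DISK_EQUAL_IOPS, RAMSAN, "power_w")
rack_multiple = ratio(DISK_EQUAL_IOPS, RAMSAN, "rack_u")

print(f"Disk throughput at equal capacity: {throughput_fraction:.1%} of RamSan")
print(f"Disk capacity needed at equal IOPS: {capacity_multiple:.1f}x")
print(f"Disk power needed at equal IOPS: {power_multiple:.1f}x")
print(f"Disk rack space needed at equal IOPS: {rack_multiple:.1f}x")
```

Run as written, this reports roughly 7.2% of the throughput at equal capacity, and about 13.5 times the capacity, 5.4 times the power, and 4.8 times the rack space at equal IOPS.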
The message is that for applications in which throughput and/or latency are the limiting factors, the RamSan offers clear advantages. Later in the article, we’ll discuss how to tell if your application is I/O bound.
You’re probably already starting to guess which applications are likely to benefit from the reduced latency and impressive throughput that the V-Series/RamSan combination can provide.
In general, applications with random I/O workloads benefit from the use of flash memory devices, while sequential workloads do not. Online Transaction Processing (OLTP) and other database applications are fairly obvious candidates. If an entire database fits within RamSan storage capacity, it will immediately benefit from the solution’s latency and throughput. For larger databases, you can leverage the improved latency of RamSan by storing just your hot files—such as redo logs, indexes, and temp space—on the RamSan and putting the rest of the database on hard disk drives.
Generally, any application where the entire data set, or the active data set, fits in RamSan memory may be a good candidate, especially when that data set is accessed by multiple servers working in parallel. The render farms used in computer animation are a good example. The key, of course, is whether or not disk I/O is the limiting factor in application performance.
When it comes to improving application performance, storage is often the last link in the performance chain to be investigated.

Figure 1) Typical approaches to improve application performance.
This is in part because the methods for understanding I/O performance are not that widely understood. However, a number of OS-level, storage-system-level, and application-level I/O performance analysis tools are available to help you investigate possible I/O problems.
Operating System Level Tools for Investigating I/O
For UNIX® and Linux® operating systems, a number of common utilities such as top, iostat, and sar (system activity reporter) can help you understand the potential impact of I/O on your server. If the server is (or can be) dedicated to the single application of interest, these statistics can be useful. For instance, the iostat command on a Linux system shows “%iowait,” the percentage of time that the system spends waiting for I/O. (One caveat with this command is that it presents a single point-in-time view.)
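One way to get past a single point-in-time view is to sample the kernel's CPU counters twice and compute the I/O wait share over the interval. The following is a minimal sketch, assuming a Linux system where the first line of /proc/stat is the aggregate "cpu" line with counters in the order user, nice, system, idle, iowait, irq, softirq, steal. The function names and the sample lines are made up for illustration and are not taken from iostat itself.

```python
def read_cpu_line(path="/proc/stat"):
    """Return the first line of /proc/stat, which is the aggregate 'cpu' line on Linux."""
    with open(path) as f:
        return f.readline()


def parse_cpu_line(line):
    """Return the tick counters from an aggregate 'cpu' line as a list of ints."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples of the 'cpu' line."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    # Only the first eight counters are summed: the guest counters that may
    # follow are already included in user and nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


# Two hypothetical samples taken some seconds apart.
sample_before = "cpu  1000 0 500 8000 500 0 0 0 0 0"
sample_after = "cpu  1100 0 550 8600 750 0 0 0 0 0"
print(f"iowait over the interval: {iowait_percent(sample_before, sample_after):.1f}%")
```

On a live system, the two samples would come from calling `read_cpu_line()` before and after a sleep; repeating this in a loop gives a trend rather than a snapshot.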
For Microsoft® Windows®, the best tool for system performance analysis is Performance Monitor. Unfortunately, Performance Monitor does not provide explicit I/O Wait Time statistics. It does, however, include real-time processor performance levels and disk queue statistics. “Processor: % Processor Time” measures the actual work being done by the processor, and “Avg Disk Queue Length” shows the number of in-process I/O operations. If a system hit hard by transactions shows significant disk queue levels and yet “% Processor Time” is well under 100%, you can assume that server I/O wait time is high.
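The rule of thumb in the preceding paragraph can be written down as a simple check. This is only a sketch: the function name and both threshold values are assumptions chosen for illustration, since the article does not say what counts as a "significant" queue or "well under 100%". Tune them for your own systems.

```python
def likely_io_bound(avg_disk_queue_length, processor_time_pct,
                    queue_threshold=2.0, cpu_threshold=80.0):
    """Apply the heuristic: busy disk queue plus idle processor suggests I/O wait.

    Both thresholds are hypothetical defaults, not Microsoft or NetApp guidance.
    """
    queue_is_significant = avg_disk_queue_length >= queue_threshold
    cpu_has_headroom = processor_time_pct < cpu_threshold
    return queue_is_significant and cpu_has_headroom


# Hypothetical counter readings taken from Performance Monitor during a busy period.
print(likely_io_bound(avg_disk_queue_length=8.0, processor_time_pct=35.0))  # True
print(likely_io_bound(avg_disk_queue_length=0.5, processor_time_pct=35.0))  # False
print(likely_io_bound(avg_disk_queue_length=8.0, processor_time_pct=97.0))  # False
```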
Storage System Tools
If you are using an intelligent back-end storage system, it may provide additional information about I/O. For instance, by using NetApp Operations Manager, you can graphically review a variety of storage metrics, including volume latencies, operations per second, and so on. By focusing specifically on the volumes used by a particular application, you can tell whether those volumes are experiencing excessive transaction rates and/or high latencies.
You can find out more about using these NetApp tools in an earlier Tech OnTap article, “Monitor, Troubleshoot, and Improve NetApp Storage Performance,” and in NetApp Technical Report 3525, “Storage Performance Management.”
Application Tools
For the greatest possible application specificity, you need I/O instrumentation embedded in your application to tell you exactly how the application is spending its time. Many popular database and business applications contain this type of instrumentation. Oracle, for instance, comes with the Statspack utility to monitor database performance. In Oracle10g™, Oracle introduced the AWR (Automatic Workload Repository), along with ADDM (Automatic Database Diagnostic Monitor) as an extra-cost option to their Enterprise Manager Tool.
The Statspack report contains a “Top 5 Timed Events” section that is the first place you would look to begin to understand whether or not your database is I/O bound (Figure 2).

Figure 2) Part of the Oracle Statspack report showing Top 5 Timed Events (15 minute interval).
Reviewing the example output, it is immediately obvious that during this interval the database is spending 83% of total elapsed time reading. Dividing the total number of waits into the time spent in seconds shows an average latency of 5.25ms per wait. Although this latency is respectable, the only way to get significant performance improvement out of this database would be to reduce it further. The V-Series/RamSan combination is an excellent solution for that purpose. You can find out more about Oracle® I/O performance in a recent white paper from Texas Memory Systems, “Faster Oracle Performance with Solid State Disks.”

As with any large storage deployment, a NetApp V3170 combined with TMS RamSan is a substantial investment, especially if you dedicate that investment to accelerate a single application. An alternative approach for NFS-based environments is to deploy the solution by using NetApp FlexCache software to create a caching architecture.
With a caching architecture, a fast storage cache resides logically between your primary storage systems and your compute servers or desktop clients. Data is automatically copied into this caching tier the first time it is read; subsequent reads are satisfied from the cache rather than the origin storage system. By investing in a centralized caching tier with NetApp V-Series and TMS RamSan, you can leverage that investment across multiple storage systems and applications. You can also deploy more economical, high-capacity disk drives in your primary storage without adversely affecting performance. Because hot data is automatically cached, manual data management or migration software is not needed.
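The read path described above (copy on first read, serve later reads from the cache) is the general read-through caching pattern. The sketch below illustrates that pattern only; it is not how FlexCache is implemented, it omits eviction and invalidation, and the class, method, and file names are hypothetical.

```python
class ReadThroughCache:
    """Minimal read-through cache: fetch from the origin on a miss, then reuse."""

    def __init__(self, origin_read):
        self._origin_read = origin_read  # callable that reads from primary storage
        self._cache = {}                 # stands in for the fast caching tier
        self.hits = 0
        self.misses = 0

    def read(self, path):
        if path in self._cache:
            self.hits += 1
            return self._cache[path]
        # First read of this path: go to the origin and keep a copy.
        self.misses += 1
        data = self._origin_read(path)
        self._cache[path] = data
        return data


# A dictionary plays the role of the origin storage system in this example.
origin = {"/scenes/frame_001.dat": b"geometry", "/scenes/textures.dat": b"textures"}
origin_reads = []


def read_from_origin(path):
    origin_reads.append(path)  # record every trip to primary storage
    return origin[path]


cache = ReadThroughCache(read_from_origin)

# Four render nodes read the same texture file in parallel-style fashion.
for _ in range(4):
    cache.read("/scenes/textures.dat")

print(f"origin reads: {len(origin_reads)}, cache hits: {cache.hits}")
```

Only the first of the four reads reaches the origin; the other three are served from the cache, which is why workloads in which many servers read the same data benefit most.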

Figure 3) A FlexCache architecture showing the V3170/RamSan in use as a high-performance caching tier.
This approach has demonstrated significant benefits for accelerating the speed of parallel software builds as well as compute-intensive applications such as animation rendering, electronic design automation, seismic analysis, and financial market simulation. Any application in which the same data is read in parallel by multiple compute servers can benefit from such an approach.
If you have an application that is I/O bound or where disk latency has become the limiting factor, the combination of the NetApp V3170 and the TMS RamSan-500 can reduce latency by an order of magnitude (from 10 milliseconds down to 1 millisecond), while delivering up to 50,000 IOPS. This exceptional performance makes the solution highly cost effective for critical applications, and enterprise-class data management features make it an easy fit in your existing environment.