NetApp and Red Hat Collaborate for pNFS
Access to shared data is critical to the performance of a variety of scientific, engineering, business, and financial applications. NFS—the most widely used standard for shared data access—can become a bottleneck for large-scale compute clusters, because a single file server is the point of access for all files in a shared file system and can be overwhelmed by cluster-scale load.
Most of the solutions available for high-performance shared data access have been largely proprietary, and they have failed to gain the heterogeneous system support and widespread adoption that standard protocols such as NFS have achieved.
The parallel NFS (pNFS) standard—a subfeature of the NFS version 4.1 protocol specification (RFC 5661)—addresses the single server bottleneck and has great promise to become a standardized solution for parallel data access. In this article we explain how pNFS works, talk about efforts that NetApp and Red Hat are making to move pNFS forward, and describe how pNFS is implemented in clustered Data ONTAP.
What Is pNFS?
The pNFS protocol gives clients direct access to files striped across two or more data servers. By accessing several data servers in parallel, clients achieve significant I/O acceleration. The pNFS protocol has been designed to deliver graceful performance scaling on both a per-client and per-file basis, without sacrificing backward compatibility with the standard NFS protocol. Clients without the pNFS extension are still able to access data.
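To see why striping helps, consider a minimal back-of-the-envelope sketch (the stripe unit and server count below are illustrative example values, not anything a particular server would advertise): with a file striped round robin across N data servers, each server transfers only about 1/N of the bytes, and the servers can work concurrently.

```python
def stripes_per_server(file_size, stripe_unit, num_servers):
    """How many stripe units of a file land on each data server when
    the file is striped round robin across the servers."""
    counts = [0] * num_servers
    for offset in range(0, file_size, stripe_unit):
        counts[(offset // stripe_unit) % num_servers] += 1
    return counts

# A 1 MiB file striped in 64 KiB units over 4 servers: 16 stripes,
# 4 per server, so each server moves a quarter of the data.
assert stripes_per_server(1024 * 1024, 64 * 1024, 4) == [4, 4, 4, 4]
```

With the transfer spread evenly, aggregate bandwidth scales with the number of data servers rather than being capped by a single server's network interface.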
pNFS Architecture and Core Protocols
The pNFS architecture consists of three main components: pNFS clients, a metadata server, and data servers. The metadata server handles all nondata traffic, such as file opens and attribute requests, while the data servers serve file data directly to clients.
Three types of protocols tie these components together: the pNFS protocol itself (part of NFSv4.1), which clients use to obtain file layouts from the metadata server; a storage access protocol, which clients use to read and write data on the data servers (NFSv4.1 defines file, block, and object layout types); and a control protocol, which keeps the metadata server and data servers synchronized and which the standard leaves unspecified.
Figure 1) Elements of pNFS. Clients request layout from metadata server (pNFS protocol) and then access data servers directly (storage access protocol).
To access a file, a client contacts the metadata server to open the file and request the file’s layout. Once the client receives the layout, it uses that information to perform I/O directly to and from the data servers in parallel, using the appropriate storage access protocol and without further involving the metadata server. pNFS clients cache the layout until they are done with their parallel I/O operations, and the server can recall a layout whenever it can no longer guarantee direct parallel access to the file. pNFS does not otherwise modify the mechanisms the NFS server already provides for metadata access.
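The access pattern above can be sketched as a toy simulation (this is not a real NFS client, and all names here are illustrative rather than part of any pNFS API): the metadata server hands out a layout, and the client then reads stripe units from the data servers in parallel without going back through the metadata server.

```python
from concurrent.futures import ThreadPoolExecutor

STRIPE_UNIT = 4  # bytes per stripe unit (tiny, for illustration only)

class DataServer:
    def __init__(self):
        self.blocks = {}  # byte offset -> stripe unit contents

    def read(self, offset):
        return self.blocks[offset]

class MetadataServer:
    """Tracks which data server holds each stripe of each file."""
    def __init__(self, data_servers):
        self.data_servers = data_servers
        self.layouts = {}  # filename -> list of (server, offset)

    def write_file(self, name, data):
        layout = []
        for i in range(0, len(data), STRIPE_UNIT):
            server = self.data_servers[(i // STRIPE_UNIT) % len(self.data_servers)]
            server.blocks[i] = data[i:i + STRIPE_UNIT]
            layout.append((server, i))
        self.layouts[name] = layout

    def get_layout(self, name):  # analogous to the layout request step
        return self.layouts[name]

def pnfs_read(mds, name):
    """Client side: fetch the layout once, then read stripes in
    parallel, bypassing the metadata server for the data transfer."""
    layout = mds.get_layout(name)
    with ThreadPoolExecutor() as pool:
        parts = pool.map(lambda entry: entry[0].read(entry[1]), layout)
    return b"".join(parts)

servers = [DataServer() for _ in range(3)]
mds = MetadataServer(servers)
mds.write_file("demo", b"hello, parallel NFS!")
assert pnfs_read(mds, "demo") == b"hello, parallel NFS!"
```

The single metadata exchange followed by direct, concurrent data access is the essence of the protocol; a real client would also cache and eventually return the layout.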
NetApp and Red Hat Team for pNFS
A pNFS solution needs both client and server components to function. NetApp and Red Hat have been working together on industry-standard pNFS to develop solutions to the problems associated with shared storage at scale.
NetApp addresses the challenges of scale by combining storage clustering and pNFS. NetApp® FAS and V-Series storage running clustered Data ONTAP 8.1 or later can scale from just a few terabytes of data to over 69 petabytes of data, all of which can be managed as a single storage entity. This simplifies management of a pNFS environment and helps eliminate both planned and unplanned downtime.
Red Hat has delivered the first pNFS client that takes advantage of the scalability, flexibility, simplified management, and optimized data paths of pNFS. By providing pNFS in Red Hat Enterprise Linux®, Red Hat enables application workloads to take full advantage of pNFS without modification, allowing a seamless transition for existing applications.
The combination of NetApp storage and Red Hat Enterprise Linux is a first-to-market pNFS solution.
pNFS and Clustered Data ONTAP
NetApp implemented pNFS starting in clustered Data ONTAP 8.1. (There is no 7-Mode or Data ONTAP 7G implementation.) pNFS implemented in clustered Data ONTAP offers a number of advantages.
Figure 2) pNFS on Data ONTAP versus NFS. Every node can serve as both a metadata server and a data server.
To understand how pNFS works with clustered Data ONTAP, suppose a client has mounted a pNFS file system from one node in a cluster. To access a file, it sends a metadata request to that node. The pNFS implementation gathers and returns information that includes the file’s location and layout and the network information needed to reach that location. The client uses this information to access the data directly from the node or nodes where it resides. By providing a direct path to the volume, pNFS helps applications achieve higher throughput and lower latency.
pNFS integrates seamlessly with clustered Data ONTAP nondisruptive operations such as LIF migrate, storage failover, and volume move. When one of these operations occurs, the pNFS client and server automatically negotiate a new direct I/O path to the server, maintaining throughput without any disruption to the application. This is a major benefit for storage administrators, who no longer have to explicitly provision network paths to file systems when carrying out maintenance on the cluster. Thus, pNFS with clustered Data ONTAP not only helps with performance, but also simplifies administrative workflows during maintenance operations. As you provision and deploy large clusters, this becomes a necessity.
Figure 3) Without pNFS, both metadata and data paths are more or less static. With pNFS, metadata service is distributed across numerous nodes while data paths are direct to the network interface of the node storing the file. When data moves, data paths adapt automatically to maintain optimal performance.
Attention to a few best practices will help deliver the best pNFS performance. For these best practices and more information on deploying pNFS on NetApp storage, refer to TR-4063.
Red Hat pNFS Client
The Red Hat pNFS client was first released in the Red Hat Enterprise Linux (RHEL) version 6.2 kernel in 2011. RHEL 6.2 and RHEL 6.3 were both labeled as “tech preview” versions of pNFS. RHEL 6.4, released in February 2013, is the first general-availability version of pNFS. You can find complete details on using Red Hat clients with NetApp storage running either NFS or pNFS in TR-3183.
pNFS Use Cases
In addition to its obvious applicability for highly parallel science and engineering applications, the unique capabilities of pNFS make it a good fit for a variety of enterprise use cases.
Business-Critical Applications
Business-critical applications by definition require the highest service levels. Storage bandwidth and capacity must grow seamlessly with server requirements. As NetApp storage volumes are transparently migrated to more powerful controllers in the NetApp cluster, the Red Hat Enterprise Linux pNFS client automatically follows the data movement, self-adjusts, and reoptimizes the data path. The net result is near-zero downtime with no server or application reconfiguration required.
Multi-Tenant Storage Solutions
Having parallel data access means that multi-tenant, heterogeneous workloads benefit directly from pNFS. The data resides on the NetApp cluster and is not tied to a specific NetApp controller. With pNFS, the Red Hat Enterprise Linux servers find the optimal data path and automatically adjust for optimum throughput.
Mixed Clients and Workloads
NFSv4.1 and pNFS provide the flexibility to mount the file system from anywhere in the cluster namespace. Clustered applications can be mounted over pNFS while legacy applications are still mounted over NFSv3. File systems exported from storage can be mounted by clients over different flavors of NFS, so workloads can coexist without significant changes to the applications that access the data. This level of flexibility reduces the overhead of frequent change management.
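As a rough sketch of what such mixed mounts can look like on a RHEL 6.4 client (the host name and export path below are hypothetical, and mount option syntax can vary between releases):

```shell
# Clustered application hosts mount over NFSv4.1, enabling pNFS;
# the nfs_layout_nfsv41_files module must be available on the client.
mount -t nfs -o vers=4,minorversion=1 cluster:/vol/projects /mnt/projects

# Legacy hosts mount the same export over NFSv3, unchanged.
mount -t nfs -o vers=3 cluster:/vol/projects /mnt/projects

# One way to confirm that a mount negotiated pNFS: layout operations
# such as LAYOUTGET appear in the per-mount NFS statistics.
grep LAYOUTGET /proc/self/mountstats
```

Both sets of clients see the same data; only the mount options differ, which is what keeps change management light.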
Virtualization
Hypervisors and virtual machines utilizing the Red Hat Enterprise Linux pNFS client are able to maintain numerous connections per session, spreading the load across several network interfaces. Think of it as multipathing for NFS, without the need for a separate multipath driver or configuration.
NetApp has been a major driver of both NFSv4.1 and pNFS, cochairing the IETF working group that developed the protocols. In addition, NetApp has authored and edited a significant portion of the NFSv4.1 specification. This is consistent with our commitment to tackling the problems of storage using industry standards.
With the recent general availability of the pNFS client with the release of RHEL 6.4, you can now deploy pNFS for testing and/or production using a combination of Red Hat clients and NetApp clustered Data ONTAP.
Got opinions about pNFS?
Ask questions, exchange ideas, and share your thoughts online in NetApp Communities.