FlexPod AI Industrializes Your AI/ML Workloads

Bruno Messina

March 15, 2019

On December 1, 1913, Henry Ford installed the first moving assembly line for the mass production of an entire automobile. His innovation reduced the time it took to build a car from more than 12 hours to 2 hours and 30 minutes. After the Ford assembly line was recognized as a much better mode of production, it spread quickly—practically overnight, and not only to the automobile industry, but to almost all lines of manufacturing. Assembly lines are still in use today with more automation and safety built into the many steps of modern manufacturing. It is hard to even conceive of a modern appliance, iPhone, automobile, or washing machine that isn't made on an automated assembly line.

The new FlexPod® AI platform provides an innovation like that of an assembly line. FlexPod AI automates nearly all aspects of rolling out an artificial intelligence (AI) workload. FlexPod AI provides UCS service profiles, which are software objects that represent your UCS servers. Through UCS Manager, these objects are instantiated on your FlexPod UCS C480 ML M5 servers for deployment. Because the service profile is software, your FlexPod AI server is provisioned in an automated fashion. If more than one server is needed (think automobiles of different colors), only the attributes that differ between servers (MAC addresses, for example) are altered. The main service profile template remains largely the same, so another FlexPod AI server is provisioned the same way with minimal extra effort. As with an assembly line, automation, consistency, and a decrease in errors are achieved on a large scale.
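To make the template idea concrete, here is a minimal Python sketch of stamping out several server profiles from one shared template while varying only the MAC addresses. It is an illustration only and does not use the UCS Manager API: the class names, the `mac_pool` and `instantiate` helpers, the policy names, and the MAC prefix are all hypothetical.

```python
from dataclasses import dataclass
from itertools import count
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class ServiceProfileTemplate:
    """Hypothetical stand-in for a service profile template (shared settings)."""
    boot_policy: str
    bios_policy: str
    vnic_count: int


@dataclass(frozen=True)
class ServiceProfile:
    """One server's identity: shared template plus per-server attributes."""
    name: str
    mac_addresses: Tuple[str, ...]
    template: ServiceProfileTemplate


def mac_pool(prefix: str, start: int = 0) -> Iterator[str]:
    """Yield sequential MAC addresses under a four-octet prefix (illustrative)."""
    for n in count(start):
        yield f"{prefix}:{(n >> 8) & 0xFF:02X}:{n & 0xFF:02X}"


def instantiate(template: ServiceProfileTemplate, servers: int,
                pool: Iterator[str]) -> List[ServiceProfile]:
    """Create one profile per server; only the name and MACs differ."""
    profiles = []
    for i in range(1, servers + 1):
        macs = tuple(next(pool) for _ in range(template.vnic_count))
        profiles.append(ServiceProfile(f"flexpod-ai-{i:02d}", macs, template))
    return profiles


# Assumed example values; none of these come from a real deployment.
template = ServiceProfileTemplate("san-boot", "ml-tuned", vnic_count=2)
for profile in instantiate(template, 3, mac_pool("00:25:B5:A0")):
    print(profile.name, profile.mac_addresses)
```

Every profile points at the same template object, so a change to the shared settings happens in one place, which is the consistency benefit described above.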

Assembly lines also need more than automation. Assembly lines need parts: many kinds of parts in large quantities, in the right place, at the right time. In fact, a modern Toyota uses about 30,000 parts. Running out of parts stalls an assembly line. In FlexPod AI, the “parts” are data, and the conveyor belt of the assembly line is the NetApp® Data Fabric.

On-premises FlexPod AI deployments also provide secure separation of resources and data integrity for the AI/ML environment by dedicating a storage virtual machine (SVM) to an AI tenant. With the massive scalability of Cisco Nexus 9000 Series Switches and ONTAP software, you can deploy environments that scale to 20 PB and beyond in a single namespace to support very large AI data sets, resulting in better data models. ONTAP FlexGroup volumes make these massive AI data challenges manageable: a storage administrator can provision a massive single namespace within an SVM in a matter of seconds. FlexGroup volumes have virtually no capacity or file count constraints outside of the physical limits of hardware or the total volume limits of ONTAP. Limits are determined by the overall number of constituent member volumes, which work together to dynamically balance load and space allocation evenly across all members. Multiple SVMs, each with its own FlexGroup volume, can be deployed to support a number of AI tenants on the same FlexPod infrastructure. With FlexPod AI, you likely won't run out of parts (space) or stall your assembly line.
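The idea of member volumes sharing one namespace while balancing space can be sketched in a few lines of Python. This is a toy model, not ONTAP's actual placement logic: the `FlexGroupSketch` class, its "most free space wins" rule, and the sizes are assumptions made purely for illustration.

```python
class FlexGroupSketch:
    """Toy model: one namespace spread over several member volumes."""

    def __init__(self, member_count: int, member_capacity_gb: int):
        # Free space per constituent member volume.
        self.free_gb = [member_capacity_gb] * member_count
        # Single namespace: path -> index of the member holding the file.
        self.location = {}

    def write(self, path: str, size_gb: int) -> int:
        """Place a new file on the member with the most free space."""
        member = max(range(len(self.free_gb)), key=self.free_gb.__getitem__)
        if self.free_gb[member] < size_gb:
            raise OSError("no member volume has enough free space")
        self.free_gb[member] -= size_gb
        self.location[path] = member
        return member


# Assumed example: 4 members of 100 GB each, eight 10 GB training shards.
fg = FlexGroupSketch(member_count=4, member_capacity_gb=100)
for i in range(8):
    fg.write(f"/train/shard{i}", 10)
print(fg.free_gb)  # remaining space per member after the writes
```

Clients see a single set of paths, while the writes end up spread evenly over the members; adding members raises the total capacity without changing the namespace.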

The foundation of the NetApp Data Fabric is ONTAP® data management software. As part of the Data Fabric effort, NetApp developed a special cloud version: NetApp Cloud Volumes ONTAP. Cloud Volumes ONTAP creates a virtual NetApp data system within enterprise-level public cloud environments that partner with NetApp, such as Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform. This virtual platform allows data to be stored the same way it is stored on in-house NetApp systems. The continuity in ONTAP data services enables administrators to move data where and when it is needed without requiring conversions. In turn, this capability allows the public cloud provider partner to act as an extension of the enterprise data center. The Data Fabric vision is an extraordinary opus: announced in 2014, it has been 5 years in the making. It has taken many, many engineers years of effort to bring the vision to reality. It would be nearly impossible for a smaller company starting today to replicate it in any practical amount of time. NetApp has also moved to the edge with its ONTAP Select software-defined storage. Your data and data management strategy are consistent from the edge to the core to the cloud.

If the right data isn’t in the right place, it can’t be used. This concept is like needing to have the correct parts ready for the assembly line. Not having the Data Fabric and the data pipeline is like running out of parts or not being able to source the correct parts in the future.

https://www.youtube.com/watch?v=jLud5XYfY_c

Moving data and keeping data source options open (edge, core, cloud) are crucial to AI and its nearly insatiable appetite for data. A strategic AI platform choice really needs data in the right place at the right time. It needs to be able to move that data reliably with ONTAP data management software and services that have been proven over decades. Only NetApp delivers the data with the NetApp Data Fabric.

FlexPod AI also brings other benefits to AI/ML workloads:

A platform of innovation with performance—ultimate performance—built on all flash that scales to hundreds of petabytes in a single pool, the fastest Intel CPUs, NVIDIA Tesla V100 GPUs, 32Gb FC, 100GbE, RoCE, NetApp MAX Data, and end-to-end NVMe
An infrastructure of versatility and flexibility that addresses the ever-changing world of dynamic, diverse, distributed data
An edge-to-core-to-cloud data pipeline to collect, clean, correlate, train, and model workloads
A solution that is trusted worldwide and delivers the best experience—enterprise-class support, manageability, and reliability

Cisco is voicing its support for FlexPod AI, as are partners:

“The ‘FlexPod Datacentre for AI/ML’ provides an AI Infrastructure with predictable performance at-scale for faster and simpler deployment. It removes the challenges usually encountered in AI/DL/ML infrastructure design by providing an integrated platform across Compute, Storage and Network resulting in a lower Total Cost of Ownership (TCO).” - Dr. Werner Scholz, CTO, XENON Systems

“As one of the first North American FlexPod Premium Partners, Long View is thrilled to see this FlexPod AI innovation from NetApp, Cisco and NVIDIA which addresses the growing demand by our customers to analyze their data to generate real time business insights, decisions and outcomes.” - Kent MacDonald, Senior Vice President, Strategic Alliances, Long View Systems

“The FlexPod Datacenter AI/ML CVD is a key supporting cornerstone to our strategy. FlexPod has long been a strategic solution for Insight Cloud and Data Center Transformation (formerly known as Datalink). As we make strategic investments into AI including our expertise in Digital Innovation and as an NVIDIA Deep Learning partner, we clearly understand the importance of building out reliable and scalable FlexPod AI infrastructure solutions.” - Kent Christensen, Practice Director Cloud and Data Center Transformation, Insight

FlexPod AI is a converged infrastructure solution versatile enough to run AI/ML and nearly any data center workload. The best predictor of FlexPod AI success is the many happy FlexPod customers. Worldwide, FlexPod has thousands of happy customers running mission-critical applications. Please take a look at these videos to get a sense of FlexPod running mission-critical use cases.

FlexPod AI solves the two key problems in AI infrastructure: provisioning servers automatically and providing data in a futuristic, open, and automated manner. Henry Ford once said, “motor car[s] for the great multitude.” (“When I’m through,” he said, “about everybody will have one.”) FlexPod can provide the same benefit to AI workloads: “AI to the great multitude.” If you find yourself at NetApp booth 917 at NVIDIA's GPU Technology Conference (GTC), or surfing the net for an AI solution, please “kick the tires” on FlexPod AI.

For more information, please visit:

Bruno Messina

Bruno Messina joined NetApp in 2018 and works in product marketing for FlexPod. His previous experience includes a career in product marketing of UCS servers for Cisco Systems and Solaris server marketing and competitive analysis at both Oracle and Sun Microsystems, which he joined in 2000. Bruno spent ten years in various roles of competitive analysis and product management at Sun Microsystems, leading analysis in both the workgroup and enterprise servers. Prior to Sun Microsystems, Bruno finished his MBA and spent two years at Cadence in product marketing for both board-level and board timing tools. Bruno holds both a BSEE and MBA from Rensselaer Polytechnic Institute in Troy, N.Y.
