It's no secret that enterprises all over the world are racing to adopt AI to transform their businesses. But these enterprises face a lot of challenges as they adopt AI. The tooling is complicated. It's hard to adopt and hard to manage. There are lots of disparate open-source tools that have to work together, and they're not always enterprise ready, so it can be super challenging for enterprise IT departments to support AI initiatives.
I'm Mike Oglesby. I'm a technical marketing engineer focused on AI and machine learning solutions here at NetApp. In this session, I'm going to highlight a solution that we've put together with our partners at NVIDIA and VMware to address this challenge head-on and make it easier for enterprises to adopt AI. So without further ado, let's jump right in.
First, I need to mention that the information in this presentation is confidential and proprietary to NetApp. I'll give you a few seconds here to read the rest of this confidentiality notice.
All right. So, I mentioned that managing AI infrastructure and AI tooling is challenging. It's often an extremely daunting task for enterprise IT departments. NVIDIA has put together a software suite called NVIDIA AI Enterprise that's specifically designed to address this challenge head-on. When you pair NVIDIA AI Enterprise with NetApp and VMware, you have an enterprise-caliber AI infrastructure and software stack from bottom to top. So we've partnered with NVIDIA and VMware to deliver a streamlined, super simple, easy-to-adopt, enterprise-class AI architecture.
I'm going to start off by giving a quick overview of NVIDIA AI Enterprise. Then I'm going to dive into a specific piece of NVIDIA AI Enterprise, which is the AI-Ready Enterprise Platform for VMware.
I'm also going to talk about how NetApp fits in here and the solution that we've jointly put together. Then I'm going to highlight a simple use case that's relevant to lots of organizations as they first dive into AI, and I'm going to show a demo of that use case in action. Lastly, we'll wrap up with some resources.
All right. So, what is NVIDIA AI Enterprise? At a high level, it's an end-to-end software platform for production-ready AI. We all know that accelerated computing, specifically the accelerated computing that NVIDIA offers, can greatly increase productivity while greatly lowering total cost of ownership. But there's a need for enterprise-grade security, stability, manageability, and support when adopting this accelerated computing, especially with all of the AI, machine learning, and data science tooling that comes along with it. That's what NVIDIA AI Enterprise is. It's essentially a software suite that offers all of the tools that data scientists and data engineers might need, in a suite that's certified and supported by NVIDIA for enterprise environments.
It's a cloud-native software suite, and it's certified to run just about anywhere. You can run NVIDIA AI Enterprise in any of the major public clouds. You can run it on trusted virtualized data center platforms like VMware or even KVM. And if you really want to turbocharge your AI initiatives, you can run NVIDIA AI Enterprise software on NVIDIA DGX systems.
NVIDIA AI Enterprise has also now added support for generative AI. Everyone is racing to adopt generative AI, and NVIDIA has recently added the NeMo Megatron offering to NVIDIA AI Enterprise, so now there's a true enterprise offering for generative AI as part of this suite. NVIDIA has also partnered with VMware to put together a solution called VMware Private AI Foundation with NVIDIA.
That solution is based on the AI-Ready Enterprise Platform for VMware that I'm going to highlight in a couple of slides. But before we go there: if NVIDIA AI Enterprise is the enterprise software suite for AI, well, AI requires data, so we need an enterprise data platform for AI. That software suite is hybrid multicloud capable, so we need a data platform that's also hybrid multicloud capable. NetApp is uniquely positioned to fill that role with our first-party storage services in all of the major clouds and with our suite of hybrid multicloud data services and data management capabilities.
An enterprise data platform for AI also needs enterprise-grade security, stability, and support. NetApp fills that role. It's trusted by millions. You get all of the trusted data protection, data governance, and ransomware protection capabilities that NetApp brings to the table.
With NetApp, you also get enterprise-caliber manageability. You can use BlueXP as a unified control plane for orchestration to manage your entire data estate. You can use REST APIs to build automation. You can use various toolkits, such as Python toolkits and Ansible toolkits, to drive automation and self-service. We have native integrations with platforms like VMware and Kubernetes, and we've also validated integrations at a higher level with some enterprise MLOps platforms that are also certified as part of NVIDIA AI Enterprise, like Domino Data Lab.
So NetApp is really uniquely positioned to be the enterprise data platform for AI to go along with NVIDIA AI Enterprise, which is the enterprise software platform for AI. I'm going to transition now to a specific piece of NVIDIA AI Enterprise that makes it super easy for enterprises to adopt this entire stack, and that's the AI-Ready Enterprise Platform for VMware.
This is a solution that's basically designed to bring AI to any organization. You've got NVIDIA AI Enterprise at the top of the stack. As I already mentioned, it's an end-to-end, cloud-native AI and data analytics software suite that's certified and supported by NVIDIA. As part of this platform, that software runs on the familiar and battle-tested VMware vSphere platform. This is a platform that enterprises all over the world already trust, and it's being accelerated with NVIDIA's high-performance vGPU technology. It's all certified and all supported.
When you pair that with NetApp, you get simple data science workspace orchestration with the NetApp DataOps Toolkit, which enables user self-service for the data science teams, and you get all of those proven, trusted NetApp data management and data protection capabilities. As I'm sure you all know, NetApp has a long history as one of the leading storage platforms for VMware deployments.
So this entire platform is enterprise ready. It's trusted. It's easy to adopt. Most organizations are already running maybe every component of it, or at least the NetApp and VMware components of it. And it includes global enterprise support. NVIDIA offers global enterprise support for NVIDIA AI Enterprise. VMware obviously offers global enterprise support. And then you've got NetApp with our enterprise support as the data platform that underpins it all. So there's no gap in support in the stack. It's a truly enterprise-caliber stack from bottom to top.
So what are the benefits of this architecture, of this solution? Well, data scientists get access to all of their standard libraries, SDKs, and all of the tooling that they're already used to working with.
So that's frameworks like PyTorch and TensorFlow, plus, as I already mentioned, NVIDIA NeMo Megatron for generative AI. These are standard libraries and SDKs, familiar tooling, but in this instance they have been vetted, certified, optimized, and validated by NVIDIA as part of the NVIDIA AI Enterprise software suite. So it's a vetted, enterprise-caliber version of all of these specific tools.
IT admins get a familiar, proven platform and architecture. They already know and trust VMware vSphere. They already know and trust NetApp as the data platform underneath. So it's an easy architecture for them to support, much easier than a bespoke, open-source-based AI architecture.
Risk managers get NetApp's built-in data protection, data management, and disaster recovery capabilities that they already know and trust, so there are no concerns there. This is a platform they already use, already know, and already trust.
IT leadership gets an architecture that's hybrid multicloud capable from bottom to top. The software stack, NVIDIA AI Enterprise, is hybrid multicloud capable, and the virtualization platform and the data platform are also hybrid multicloud capable. We at NetApp have a large menu of hybrid cloud solutions that we've put together with our partners at VMware, so there's a lot of flexibility in where this architecture can be adopted.
So what sort of use cases, what sort of workloads, does this architecture support? Well, it supports a ton of different workloads. We touched on that a little bit when I covered NVIDIA AI Enterprise. But a common use case that enterprises encounter when they're first diving into AI is just a simple AI training job.
They often start with a POC. They take some data and, using something like PyTorch or TensorFlow or another framework, they experiment with running a training job on that data. They train a model, they test that model, they look at the accuracy of that model, and then they move forward from there. That's how almost all data science projects start, and it's the use case that most enterprises encounter when they first start to adopt AI. This is a workflow that is super simple to execute with this solution. It's a workflow that a data scientist could have up and running in mere minutes.
At NetApp, we eat our own cooking, if you will, when it comes to this solution and this use case. We have an offering within NetApp Lab on Demand where our sales engineers and partner sales engineers can spin up a lab where they get access to the NVIDIA AI Enterprise tooling. It gives them an environment where they can familiarize themselves with the NVIDIA tooling, with AI workflows, and with how each layer of the stack fits together. An engineer can log into the Lab on Demand site and book a lab, and behind the scenes we've got some automation that clones a template. That engineer is up and running in mere minutes with a VMware-based virtualized lab where they have access to vGPUs and the NVIDIA AI Enterprise tooling, paired with NetApp data management capabilities and storage underpinning it all.
On that note, that's enough slideware. Let's go ahead and jump into a demo that shows this use case in action. Let's roll the demo.
Hi, I'm Mike Oglesby, and I'm a technical marketing engineer focused on AI solutions here at NetApp.
In this demo video, I'm going to show you a vGPU-accelerated TensorFlow training job powered by the NVIDIA AI Enterprise suite running on VMware vSphere with NetApp. So, let's get started.
The first thing I'm going to do is clone a vGPU guest virtual machine template in order to create a new virtual machine in which I can execute my TensorFlow training job using NVIDIA AI Enterprise software. I go through all the normal steps that I would go through to clone any VM template: I select all my options and click through the wizard. When I get to the Customize hardware screen, I can choose the NVIDIA vGPU profile that I want to allocate to this new VM, and I'm going to choose the A100D-16C vGPU profile. I click Finish, and now vSphere kicks off a job to create a new virtual machine from my template.
A few minutes later, I've got a brand new virtual machine called mike-workspace. I retrieve the IP address from vSphere, then I jump over to my terminal and SSH over to this new virtual machine. I enter my password here, and now I'm in the terminal on my new vGPU guest VM.
The first thing I'm going to do is use the NetApp DataOps Toolkit to run a list volumes command. This tells me all of the data volumes that I have available to mount within this workspace. I'm going to go ahead and mount the imagenet volume. So I run this mount volume command, specifying the volume name and the mount point that I want to mount it to. The reason I'm mounting this volume is that it contains the dataset that I'm going to run my training job against.
Now, to run my training job, I launch an NVIDIA AI Enterprise TensorFlow container from NVIDIA NGC. I've launched that, and now that I'm in that container, I can just go ahead and kick off my job. All I need to do to run it against that specific dataset is specify the mount path. And that's it.
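For readers following along, the terminal steps in the demo can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the exact commands typed in the video: it assumes the NetApp DataOps Toolkit for Traditional Environments command-line tool (`netapp_dataops_cli.py`), assumes Docker as the container runtime, and uses a hypothetical mount point, a placeholder container image name, and a hypothetical training script. It also folds the container launch and the training run into a single command, whereas the demo launches the container first and starts the job from inside it. The script only builds and prints the commands; nothing is executed.

```python
import shlex


def build_demo_commands(volume, mountpoint, image, train_cmd):
    """Return the demo's terminal steps as argv lists (nothing is executed)."""
    return [
        # 1. List the data volumes available to mount in this workspace.
        ["netapp_dataops_cli.py", "list", "volumes"],
        # 2. Mount the volume that holds the training dataset.
        ["netapp_dataops_cli.py", "mount", "volume",
         f"--name={volume}", f"--mountpoint={mountpoint}"],
        # 3. Start a GPU-enabled container with the dataset bind-mounted at the
        #    same path, and run the training command inside it.
        ["docker", "run", "--gpus", "all", "--rm",
         "-v", f"{mountpoint}:{mountpoint}", image, *train_cmd],
    ]


if __name__ == "__main__":
    commands = build_demo_commands(
        volume="imagenet",
        mountpoint="/mnt/imagenet",  # assumption: the mount point is not given in the demo
        image="<nvidia-ai-enterprise-tensorflow-image>",  # placeholder, take the real name from NGC
        train_cmd=["python", "train.py", "--data_dir=/mnt/imagenet"],  # hypothetical script and flag
    )
    for argv in commands:
        # Print each step as a copy-pasteable shell line.
        print(shlex.join(argv))
```

Mounting a volume typically requires elevated privileges on the guest VM, so the second command may need to be run as root in practice.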
It's that simple to run a vGPU-accelerated TensorFlow training job on NVIDIA AI Enterprise with vSphere and NetApp.
All right, I hope you enjoyed the demo. Real quickly before we wrap up, I'd like to reiterate that NVIDIA AI Enterprise is a suite of cutting-edge AI software, but it's been validated, optimized, and certified by NVIDIA for the enterprise. When you pair this with VMware and NetApp, you get a familiar, proven platform and architecture from bottom to top, so IT leadership can greatly reduce risk as they adopt AI by building on top of a familiar and proven stack. Lastly, the whole stack from bottom to top is hybrid multicloud capable. You've got NetApp's hybrid multicloud data management capabilities paired with VMware's hybrid multicloud capabilities and the hybrid multicloud support of NVIDIA AI Enterprise, which enables enterprises to extend this platform across environments, across data centers, and across clouds.
Real quick, there are some other sessions related to this one that you might want to check out, so go look for these recordings. Also, in the bottom right of the slide here, there are some links that you'll definitely want to check out. The first link is to our technical documentation for the solution that I've highlighted here, so definitely check that one out. The next link is to NVIDIA's AI Enterprise product page. If you want to dive deeper into NVIDIA AI Enterprise, that's the place to go. And lastly, we've got NetApp's AI solution landing page. If you want to see everything NetApp has to offer in the AI space, that's the place to go. It's all there.
Please do stay connected. Follow us on Facebook and Twitter. Join the NetApp community on Discord. Check out NetApp TV. You're already on NetApp TV if you're watching this session, so check out everything we have to offer there.
If you'd like to connect with me, the link to my LinkedIn profile is at the bottom here, so feel free to connect with me on LinkedIn. Feel free to reach out if you have any questions or if you want to start a conversation. And with that, I thank you for your time. I hope you enjoyed the session.
NVIDIA AI Enterprise, powered by NetApp and VMware, was created to deliver a streamlined, enterprise-class AI architecture. See the execution of a TensorFlow training job on this enterprise-caliber AI platform.