Automating MLOps, DevOps, and DataOps for Data Scientists

ONTAP AI

Mike McNamara

2020-06-24


The unprecedented promise of machine learning (ML) remains unrealized because data scientists spend most of their time on work that isn't data science. In common practice, the path from ML development to deployment relies on ad hoc tools, plug-ins, and scripts. This patchwork of siloed tooling keeps organizations, large and small, from streamlining ML development.

NetApp and cnvrg.io have partnered to deliver a streamlined AI/ML data science pipeline solution that drives productivity and efficiency. The solution incorporates industry-leading managed Kubernetes clusters (for example, Red Hat OpenShift), cached datasets for extreme performance, and one-click attachment of models to datasets with NVIDIA NGC integration. NetApp® ONTAP® AI provides high-performance compute and storage for any scale of operation, and cnvrg.io software streamlines data science workflows, improving resource utilization.

With NetApp and cnvrg.io, you can cache datasets, specific dataset versions, or both, and make sure that they're located on the ONTAP node attached to the GPU cluster or CPU cluster that is running the training. Once a dataset is cached, different team members can reuse it many times. With caching, datasets are ready to use in seconds rather than hours, and a cached dataset can be authorized for and used by multiple teams in the same compute cluster connected to the NetApp cached data.
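The caching behavior described above can be pictured with a small sketch. This is a toy in-memory model written for illustration only, not the cnvrg.io or ONTAP API: the class `DatasetCacheRegistry`, its methods, and the dataset, cluster, and team names are all hypothetical. It assumes a cache entry is identified by dataset, version, and compute cluster, that the first request pays the copy cost, and that later requests from authorized teams reuse the same copy.

```python
class DatasetCacheRegistry:
    """Toy model of dataset caches placed next to compute clusters (hypothetical)."""

    def __init__(self):
        # (dataset, version, cluster) -> set of teams authorized to use the cache
        self._entries = {}
        # How many times data actually had to be copied into a cache
        self.copies = 0

    def attach(self, dataset, version, cluster, team):
        """Authorize a team on a cached dataset version.

        Returns True on a cache hit (the data was already next to the cluster)
        and False when the cache had to be populated first.
        """
        key = (dataset, version, cluster)
        hit = key in self._entries
        if not hit:
            self._entries[key] = set()
            self.copies += 1  # stands in for the slow, one-time data copy
        self._entries[key].add(team)
        return hit

    def teams(self, dataset, version, cluster):
        """Return the teams authorized on a cache entry (empty set if not cached)."""
        return set(self._entries.get((dataset, version, cluster), set()))


registry = DatasetCacheRegistry()
first = registry.attach("images", "v2", "gpu-cluster-a", "vision-team")
second = registry.attach("images", "v2", "gpu-cluster-a", "research-team")
print(first, second, registry.copies)
```

In this model the second team gets a hit without triggering another copy, which mirrors the reuse the paragraph describes. Requesting the same dataset version on a different cluster would create a separate entry, because a cache is tied to the compute it sits next to.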

NetApp and cnvrg.io have written a detailed technical paper, Hybrid-cloud AI Operating System with Data Caching, which presents an innovative solution that enables IT professionals and data engineers to create a truly hybrid-cloud AI platform with a topology-aware data hub. Data scientists can instantly and automatically create a cache of their datasets close to the compute, wherever the compute is located. As a result, high-performance model training is easier to achieve, and multiple AI practitioners can collaborate with immediate access to the cached datasets and versions and can build a dataset-version hub.

To learn more, read the technical report. To experiment with cnvrg.io, download the free community version.

Mike McNamara

Mike McNamara is a senior leader of product and solution marketing at NetApp with more than 25 years of data management and cloud storage marketing experience. Before joining NetApp 10 years ago, McNamara worked at Adaptec, Dell EMC, and HPE. McNamara was a key team leader driving the launch of first-party cloud storage offerings and the industry's first cloud-connected AI/ML solution (NetApp), unified scale-out and hybrid cloud storage systems and software (NetApp), iSCSI and SAS storage systems and software (Adaptec), and a Fibre Channel storage system (EMC CLARiiON). In addition to having served as marketing chair for the Fibre Channel Industry Association, McNamara is a member of the Ethernet Technology Summit Conference Advisory Board and the Ethernet Alliance, a regular contributor to industry journals, and a speaker at events. McNamara also published a book through FriesenPress titled 'Scale-Out Storage - The Next Frontier in Enterprise Data Management' and was named by Kapos as one of the top 50 B2B product marketers to watch.
