Automating MLOps, DevOps, and DataOps for Data Scientists

ONTAP AI

Mike McNamara

2020-06-24

The unprecedented promise of machine learning (ML) is still unrealized because data scientists spend most of their time on work that is not data science. In common practice, the path from ML development to deployment relies on ad hoc tools, plug-ins, and scripts. This collection of siloed tooling keeps organizations, large and small, from streamlining ML development.

NetApp and cnvrg.io have partnered to deliver a streamlined AI/ML data science pipeline solution that drives productivity and efficiency. The solution incorporates industry-leading Kubernetes managed clusters (for example, Red Hat OpenShift), cached datasets for extreme performance, and one-click attachment of models to datasets with NVIDIA NGC integration. NetApp® ONTAP® AI provides high-performance compute and storage for any scale of operation, and cnvrg.io software streamlines data science workflows and improves resource utilization.

With NetApp and cnvrg.io, you can cache datasets, or specific versions of them, and make sure that they’re located on the ONTAP node attached to the GPU cluster or CPU cluster that is running the training. Once a dataset is cached, different team members can reuse it many times. With caching, datasets are ready to use in seconds rather than hours, and multiple teams can be authorized to use the cached datasets in the same compute cluster connected to the NetApp cached data.
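To make the caching idea concrete, here is a minimal Python sketch of the bookkeeping involved. It is a toy in-memory model, not the cnvrg.io or ONTAP API: the `DatasetCache` class, its `attach` method, and the dataset, cluster, and team names are all hypothetical and chosen only for illustration. It assumes a cached copy is identified by dataset, version, and compute cluster, so the expensive copy happens once per cluster and later requests reuse it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CacheKey:
    dataset: str
    version: str
    cluster: str  # compute cluster whose attached storage holds the copy


@dataclass
class DatasetCache:
    """Toy model of a per-cluster dataset cache (hypothetical, for illustration)."""
    entries: dict = field(default_factory=dict)  # CacheKey -> teams granted access
    copies: int = 0  # number of times data was actually copied

    def attach(self, dataset: str, version: str, cluster: str, team: str) -> str:
        key = CacheKey(dataset, version, cluster)
        if key not in self.entries:
            # First request on this cluster: copy the dataset version next to the compute.
            self.entries[key] = set()
            self.copies += 1
            status = "cached"
        else:
            # Later requests reuse the existing copy.
            status = "hit"
        self.entries[key].add(team)
        return status


cache = DatasetCache()
first = cache.attach("images", "v2", "gpu-cluster-a", team="vision")
second = cache.attach("images", "v2", "gpu-cluster-a", team="nlp")
other = cache.attach("images", "v2", "cpu-cluster-b", team="vision")
```

In this sketch the second team gets a cache hit on the same cluster, while a request from a different cluster triggers its own copy, which mirrors the idea of keeping data close to whichever compute runs the training.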

NetApp and cnvrg.io have written a detailed technical paper, Hybrid-cloud AI Operating System with Data Caching, which presents an innovative solution that enables IT professionals and data engineers to create a truly hybrid-cloud AI platform with a topology-aware data hub. Data scientists can instantly and automatically create a cache of their datasets in proximity to the compute, wherever the compute is located. As a result, high-performance model training can be easily accomplished and multiple AI practitioners can collaborate with immediate access to the cached datasets and versions, and with the ability to create a dataset-version hub.

To learn more, read the technical report. To experiment with cnvrg.io, download the free community version.

Mike McNamara

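Mike McNamara is a senior leader of product and solution marketing at NetApp with more than 25 years of experience in data management and cloud storage marketing. Before joining NetApp over 10 years ago, he worked at Adaptec, Dell EMC, and HPE. As a key team leader, he drove the launch of first-party cloud storage services, the industry's first cloud-connected AI/ML solution (NetApp), unified scale-out and hybrid cloud storage systems and software (NetApp), iSCSI and SAS storage systems and software (Adaptec), and Fibre Channel storage systems (EMC CLARiX). He previously served as marketing chair for the Fibre Channel Industry Association. He is a member of the Ethernet Technology Summit advisory board and a current member of the Ethernet Alliance, contributes regularly to industry journals, and speaks frequently at events. He has also published a book through FriesenPress, Scale-Out Storage - The Next Frontier in Enterprise Data Management, and was named one of the top 50 B2B product marketers by Kapos.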
