Boost performance with NVIDIA Magnum IO GPUDirect Storage

Mike McNamara

2021-06-28

NVIDIA Magnum IO GPUDirect Storage (MIO GDS) enables a direct data path for direct memory access (DMA) transfers between GPU memory and storage, avoiding a bounce buffer in CPU memory. The direct path increases system bandwidth and reduces both latency and the utilization load on the CPU. With this performance improvement, for example, oil and gas refineries can pinpoint drill locations in half the time, and weather services can run climate simulations up to six times faster to identify extreme weather patterns.

GDS provides value in many ways:

Bandwidth is two to eight times higher with data transfers directly between storage and GPU.
Latency is lower, because data transfers don’t fault and don’t go through a bounce buffer.
Access to petabytes of storage can be at higher bandwidth than with local storage or local CPU memory.
Use of DMA engines near storage is less invasive to CPU load and doesn’t interfere with GPU load.
The GPU becomes the highest-bandwidth computing engine.
Bandwidth into GPU memory from CPU memory, local storage, and remote storage can be additively combined to nearly saturate the bandwidth into and out of the GPUs.
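The additive-bandwidth point above can be illustrated with a small back-of-the-envelope model. This is only a sketch: the function name `aggregate_bandwidth_gbps`, the source labels, and every number below are hypothetical placeholders chosen for illustration, not measurements from NVIDIA or NetApp.

```python
def aggregate_bandwidth_gbps(sources, gpu_ingest_limit_gbps):
    """Sum the per-source bandwidths, capped at what the GPU can ingest.

    sources: mapping of source name -> bandwidth in GB/s (assumed values).
    gpu_ingest_limit_gbps: assumed ceiling on bandwidth into GPU memory.
    """
    total = sum(sources.values())
    return min(total, gpu_ingest_limit_gbps)


# Placeholder figures for illustration only, not benchmark results.
sources = {"cpu_memory": 40.0, "local_storage": 25.0, "remote_storage": 45.0}
limit = 100.0

combined = aggregate_bandwidth_gbps(sources, limit)
print(f"combined: {combined} GB/s ({combined / limit:.0%} of the assumed GPU limit)")
```

In this simplified model the three paths add up until they hit the GPU's ingest ceiling, which is the sense in which combined sources can "nearly saturate" GPU bandwidth; real systems also contend for shared PCIe and network links, which the sketch ignores.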

NetApp AI Solutions for NVIDIA DGX A100 systems

The NVIDIA DGX POD reference architecture combines NVIDIA DGX A100 systems, NVIDIA InfiniBand networking, and storage solutions into fully integrated offerings that are verified and ready to deploy. As a key NVIDIA partner, NetApp offers two solutions for DGX A100 systems. One is based on NetApp® AFF systems, and the other is based on NetApp EF-Series EF600 arrays with BeeGFS.

If your enterprise plans to run many distributed jobs using GPUs, and if you plan to use NFS and the rich data management available in NetApp ONTAP®, AFF solutions are a great fit. If you have fewer jobs using GPUs for long-running training operations and require the extreme performance of a parallel file system, consider NetApp E-Series solutions. Both solutions are accompanied by a reference architecture that includes observed bandwidth, IOPS, and training performance results under certain testing conditions. ONTAP AI is also available as an integrated solution, with your choice of three preconfigured offerings that include installation and support.

Magnum IO GPUDirect Storage enables data to move directly from the NetApp EF600 systems into GPU memory, bypassing the CPU. Direct memory access from storage to GPU relieves the CPU I/O bottleneck, increasing performance.
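To make the bounce-buffer cost concrete, here is a deliberately simple timing model. It assumes the staged path performs two serial copies (storage to CPU memory, then CPU memory to GPU memory) with no overlap, while the direct path performs one. The function names and all bandwidth figures are hypothetical, and real software often pipelines the staged copies, so treat the result as a rough upper bound on the overhead rather than a prediction.

```python
def staged_transfer_seconds(size_gb, storage_to_cpu_gbps, cpu_to_gpu_gbps):
    """Time for a transfer that bounces through CPU memory (two serial copies)."""
    return size_gb / storage_to_cpu_gbps + size_gb / cpu_to_gpu_gbps


def direct_transfer_seconds(size_gb, storage_to_gpu_gbps):
    """Time for a single DMA transfer from storage straight into GPU memory."""
    return size_gb / storage_to_gpu_gbps


# Placeholder figures for illustration only, not benchmark results.
size_gb = 100.0
staged = staged_transfer_seconds(size_gb, storage_to_cpu_gbps=10.0, cpu_to_gpu_gbps=20.0)
direct = direct_transfer_seconds(size_gb, storage_to_gpu_gbps=10.0)

print(f"staged: {staged:.1f} s, direct: {direct:.1f} s")
```

Even with the same storage bandwidth on both paths, the model shows the extra hop adding time on top of the CPU cycles and memory bandwidth it consumes.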

BeeGFS is a parallel file system that provides great flexibility and is key to meeting the needs of diverse and evolving AI workloads. Today, NetApp EF-Series storage systems supercharge BeeGFS storage and metadata services by offloading RAID and other storage tasks, including drive monitoring and wear detection. BeeGFS GDS with EF-Series for both DGX POD and NVIDIA DGX SuperPOD configurations will be generally available in the near future but can be used now for proofs of concept. Support for ONTAP AI will follow later in the year. To learn more, visit www.NetApp.com/ai.

Mike McNamara

Mike McNamara is a senior leader in product and solutions marketing at NetApp with more than 25 years of data management and cloud storage marketing experience. Before joining NetApp 10 years ago, McNamara worked at Adaptec, Dell EMC, and HPE. McNamara was a key team leader driving the launch of a first-party cloud storage offering and the industry's first cloud-connected AI/ML solution (NetApp), unified scale-out and hybrid cloud storage systems and software (NetApp), iSCSI and SAS storage systems and software (Adaptec), and Fibre Channel storage systems (EMC CLARiiON). In addition to his past role as marketing chairperson for the Fibre Channel Industry Association, he is a member of the Ethernet Technology Summit Conference Advisory Board and the Ethernet Alliance, a regular contributor to industry journals, and a frequent event speaker. McNamara has also published a book through FriesenPress, "Scale-Out Storage - The Next Frontier in Enterprise Data Management," and was named one of the top 50 B2B product marketers to watch by Kapos.