
Boost performance with NVIDIA Magnum IO GPUDirect Storage


Mike McNamara

NVIDIA Magnum IO GPUDirect Storage (MIO GDS) enables a direct data path for direct memory access (DMA) transfers between GPU memory and storage, which avoids a bounce buffer through the CPU. The direct path increases system bandwidth and reduces both latency and the utilization load on the CPU. With this performance improvement, for example, oil and gas refineries can pinpoint drill locations in half the time, and weather services can run climate simulations up to six times faster to identify extreme weather patterns.
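
To make the difference between the two data paths concrete, here is a minimal back-of-the-envelope sketch. It is not a benchmark and it does not use the GDS API. It assumes the bounce-buffer path performs its two hops one after the other with no overlap, the function names are illustrative, and every bandwidth figure is a hypothetical placeholder rather than a measurement.

```python
def bounce_buffer_seconds(size_gb, storage_to_cpu_gbs, cpu_to_gpu_gbs):
    """Staged path: storage -> CPU bounce buffer -> GPU memory.

    Simplifying assumption: the two hops run back to back with no overlap.
    """
    return size_gb / storage_to_cpu_gbs + size_gb / cpu_to_gpu_gbs


def direct_path_seconds(size_gb, storage_to_gpu_gbs):
    """Direct path: DMA from storage straight into GPU memory."""
    return size_gb / storage_to_gpu_gbs


# Hypothetical numbers chosen only to keep the arithmetic easy to follow.
size_gb = 100.0
staged = bounce_buffer_seconds(size_gb, storage_to_cpu_gbs=10.0, cpu_to_gpu_gbs=10.0)
direct = direct_path_seconds(size_gb, storage_to_gpu_gbs=10.0)
print(f"staged: {staged:.1f} s, direct: {direct:.1f} s")
```

In this toy model the extra hop through CPU memory adds its own transfer time on top of the storage read. Real systems can overlap the hops, so the actual gain depends on the hardware and the workload.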

GDS provides value in many ways:

  • Bandwidth is two to eight times higher with data transfers directly between storage and GPU.
  • Latency is lower, because data transfers don’t fault and don’t go through a bounce buffer.
  • Access to petabytes of storage can be at higher bandwidth than with local storage or local CPU memory.
  • Use of DMA engines near storage is less invasive to CPU load and doesn’t interfere with GPU load.
  • The GPU becomes the highest-bandwidth computing engine.
  • Bandwidth into GPU memory from CPU memory, local storage, and remote storage can be additively combined to nearly saturate the bandwidth into and out of the GPUs.
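
The last point, additive bandwidth, can be sketched as simple arithmetic. The snippet below is an illustration under stated assumptions, not a measurement: the path names, the per-path bandwidths, and the GPU ingest limit are all hypothetical, and the model simply sums the paths and caps the result at what the GPU can absorb.

```python
def combined_ingest_gbs(path_bandwidths_gbs, gpu_ingest_limit_gbs):
    """Sum the bandwidth of every path into the GPU, capped at the GPU's own limit."""
    return min(sum(path_bandwidths_gbs.values()), gpu_ingest_limit_gbs)


# Hypothetical per-path bandwidths in GB/s; these are not measurements.
paths = {"cpu_memory": 20, "local_storage": 10, "remote_storage": 15}
limit = 50  # hypothetical GPU ingest limit in GB/s

total = combined_ingest_gbs(paths, gpu_ingest_limit_gbs=limit)
print(f"combined ingest: {total} GB/s ({total / limit:.0%} of the GPU limit)")
```

No single path in this example comes close to the limit on its own, but together they use most of it, which is the sense in which the sources combine to nearly saturate GPU bandwidth.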

NetApp AI Solutions for NVIDIA DGX A100 systems

The NVIDIA DGX POD reference architecture combines NVIDIA DGX A100 systems, NVIDIA InfiniBand networking, and storage solutions into fully integrated offerings that are verified and ready to deploy. As a key NVIDIA partner, NetApp offers two solutions for DGX A100 systems. One is based on NetApp® AFF systems, and the other is based on NetApp EF-Series EF600 arrays with BeeGFS.

If your enterprise plans to run many distributed jobs using GPUs, and if you plan to use NFS and the rich data management available in NetApp ONTAP®, AFF solutions are a great fit. If you have fewer jobs using GPUs for long-running training operations and require the extreme performance of a parallel file system, consider NetApp E-Series solutions. Both solutions are accompanied by a reference architecture that includes observed bandwidth, IOPS, and training performance results under certain testing conditions. ONTAP AI is also available as an integrated solution, with your choice of three preconfigured offerings that include installation and support.

Magnum IO GPUDirect Storage enables data to move directly from the NetApp EF600 systems into GPU memory, bypassing the CPU. Direct memory access from storage to GPU relieves the CPU I/O bottleneck, increasing performance.

[Figure: NVIDIA storage chart]

BeeGFS is a parallel file system that provides great flexibility and is key to meeting the needs of diverse and evolving AI workloads. Today, NetApp EF-Series storage systems supercharge BeeGFS storage and metadata services by offloading RAID and other storage tasks, including drive monitoring and wear detection. BeeGFS GDS with EF-Series for both DGX POD and NVIDIA DGX SuperPOD configurations will be generally available in the near future but can be used now for proofs of concept. Support for ONTAP AI will follow later in the year. To learn more, visit www.NetApp.com/ai.

Mike McNamara


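Mike McNamara is a senior leader of product and solution marketing at NetApp with over 25 years of experience in data management and cloud storage marketing. Before joining NetApp ten years ago, Mike worked at Adaptec, Dell EMC, and HPE. Mike was a key team leader in launching a first-party cloud storage offering and the industry's first cloud-connected AI/ML solution (NetApp), unified scale-out and hybrid cloud storage systems and software (NetApp), iSCSI and SAS storage systems and software (Adaptec), and Fibre Channel storage systems (EMC CLARiiON). He also served as marketing chair of the Fibre Channel Industry Association (FCIA), is a member of the Ethernet Technology Summit advisory board and the Ethernet Alliance, contributes regularly to industry journals, and speaks frequently at events. Mike also published a book through FriesenPress titled "Scale-Out Storage - The Next Frontier in Enterprise Data Management" and was listed by Kapos as one of 50 B2B product marketers to watch.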