AI Beyond Limits — The Data Platform AI Has Been Waiting For
In January 2026, at SCA HPC ASIA 2026 held in Osaka, NetApp exhibited under the banner “AI Beyond Limits — The Data Platform AI Has Been Waiting For.” At the Exhibitor Forum, I presented on the theme “Empowering AI for Science with intelligent data infrastructure,” demonstrating the need to shift HPC design philosophy from compute‑centric to data‑centric (AI for Science) and the value of intelligent data infrastructure that supports this transition.
In the pre‑AI era, HPC systems were designed to extract maximum compute performance and minimize data movement. This approach has produced many achievements and remains an important foundation for scientific research today; however, in an era where AI runs as the core engine of research on a daily basis—AI for Science—the premises change.
The bottleneck in research is not peak FLOPS, but the ability to quickly find the necessary data, contextualize and prepare it, share it safely while preserving authorization and lineage, and deliver it to GPUs without delay—in other words, the flow of data itself. The workflow is shifting from traditional one‑shot (batch) execution to a continuous, iterative cycle in which simulation, data, and AI keep turning, and HPC is operated not as a standalone machine, but as a service securely and seamlessly integrated across on‑premises environments, external institutions, and public clouds.
The real bottleneck is not compute but data. GPUs continue to advance in performance and density and are more readily available than before. Even so, what frequently happens in the field is the opposite: “GPUs waiting for data.” What stalls is the movement, preparation, management, and reuse of data. Indeed, it is predicted that by 2026, 60% of AI projects will be abandoned due to a lack of AI‑ready data. To boost outcomes in research that presupposes AI, the requirement that data “keep flowing” without stagnation must be placed at the center of design.
Four conditions that AI for Science requires of data
To truly make AI effective in science, data must satisfy four conditions: it must be trustworthy, reusable, consistent, and governed.
In addition, it must support sustained operations that presuppose long‑term research, cross‑institutional collaboration, and hybrid/multi‑cloud. This is not a challenge that can be solved with short‑term symptomatic treatments, but a requirement that questions the very philosophy of the infrastructure.
NetApp’s intelligent data infrastructure (IDI) is not just for storing data but for understanding and handling it. It brings together data distributed across laboratories, HPC systems, and the major clouds, providing a foundation that continuously and efficiently “feeds” data to AI and HPC. IDI consists of the following three pillars:
Any Data, Any Place: Integrate data wherever it lives to eliminate silos. Access and share with consistent practices while maintaining permissions and lineage
Active Data Management: Build security, governance, and compliance into the mechanism to ensure accountability and assurance
Adaptive Operations: Autonomously follow load and environmental changes to continuously optimize the balance of performance, efficiency, and cost
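To make the “Active Data Management” pillar concrete, the sketch below shows what a policy‑based guardrail with built‑in accountability might look like. This is a toy illustration, not NetApp’s actual API; all class and field names (`Dataset`, `Guardrail`, `clearance` levels) are hypothetical.

```python
from dataclasses import dataclass, field

# Toy policy model: each dataset carries a classification and a lineage
# trail; access is granted only when the requester's clearance covers the
# classification, and every decision is logged for accountability.
CLEARANCE_ORDER = {"public": 0, "internal": 1, "restricted": 2}

@dataclass
class Dataset:
    name: str
    classification: str                            # "public" | "internal" | "restricted"
    lineage: list = field(default_factory=list)    # provenance trail, e.g. source versions

@dataclass
class Guardrail:
    audit_log: list = field(default_factory=list)

    def authorize(self, user_clearance: str, ds: Dataset) -> bool:
        allowed = CLEARANCE_ORDER[user_clearance] >= CLEARANCE_ORDER[ds.classification]
        # Record every decision so access remains explainable after the fact.
        self.audit_log.append((ds.name, user_clearance, allowed))
        return allowed

guardrail = Guardrail()
genome = Dataset("genome_v2", "restricted", lineage=["genome_v1", "qc_pass"])
print(guardrail.authorize("internal", genome))    # → False (internal < restricted)
print(guardrail.authorize("restricted", genome))  # → True
```

The point is that permissions and lineage travel with the data object itself, so the same rules apply wherever the data lives.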
And to realize and further strengthen this intelligent data infrastructure, NetApp provides the following two latest solutions:
To run AI‑driven science at production scale, storage must reliably keep pace with GPUs. For this, NetApp offers NetApp AFX: with its high‑performance, ultra‑scalable, disaggregated architecture, it fundamentally eliminates AI/HPC I/O bottlenecks.
No matter how powerful the GPUs and storage are, AI will not function well unless the data is prepared in an AI‑ready state. NetApp AI Data Engine (AIDE) is the world’s first solution to equip storage with GPUs, integrating data discovery, curation, policy‑based guardrails, and real‑time vectorization for generative AI, so that AI workflows can run and scale seamlessly on AI‑ready data. While unifying data assets and keeping them always up to date, it enables high‑speed data access, more efficient data transformation, and trustworthy governance. Its main functions are Find, Sync, Govern, and Curate.
As a result, it greatly reduces the time required for data preparation and raises project throughput. Even in long‑term, collaborative, hybrid environments, it enables operations centered on always‑fresh, highly trustworthy AI‑ready data.
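The idea of keeping vectors continuously in sync with source data, described above, can be sketched as follows. This is a conceptual toy, not the AI Data Engine API: `toy_embed` is a stand‑in hash‑based embedding (a real system would use an embedding model), and all names are hypothetical.

```python
import hashlib
import math

def toy_embed(text: str, dim: int = 8) -> list:
    """Stand-in embedding: hash-derived unit vector (real systems use a model)."""
    h = hashlib.sha256(text.encode()).digest()
    vec = [b / 255.0 for b in h[:dim]]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

class SyncedVectorIndex:
    """Keeps one vector per document and re-embeds on every write,
    so the index never serves stale vectors (the 'Sync' idea)."""
    def __init__(self):
        self.vectors = {}

    def upsert(self, doc_id: str, text: str):
        # Writing data and refreshing its vector are one operation.
        self.vectors[doc_id] = toy_embed(text)

    def query(self, text: str) -> str:
        q = toy_embed(text)
        def cos(v):
            return sum(a * b for a, b in zip(q, v))
        return max(self.vectors, key=lambda d: cos(self.vectors[d]))

index = SyncedVectorIndex()
index.upsert("doc1", "protein folding simulation logs")
index.upsert("doc2", "climate model output")
# Updating the source immediately refreshes its vector; no stale copy remains:
index.upsert("doc2", "climate model output, revised run")
print(index.query("protein folding simulation logs"))  # → doc1
```

The design choice this illustrates is coupling ingestion and vectorization, so retrieval for generative AI always reflects the current state of the data rather than a periodic batch snapshot.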
Long‑term, collaborative, hybrid‑cloud will become the ‘new normal.’ In Japan and across Asia, the convergence of HPC and AI is steadily progressing. To meet the complex real‑world requirements of long‑term studies, joint research, and hybrid/multi‑cloud, trustworthy data, reusable data, consistent data, and governed data are indispensable. With IDI, AFX, and AI Data Engine, NetApp simultaneously achieves integration, protection, optimization, and utilization of data, supporting sustained value creation in AI for Science.
Examples of Scientific Research Using AI for Science
From compute‑centric to data‑centric: What AI for Science demands is data ‘continuity’
In the compute‑centric era, HPC was evaluated by performance; in the data‑centric era of AI for Science, what is demanded is the ‘continuity’ of data. Outcomes are not determined merely by the size of compute resources or the existence of data. Data exerts power as a research foundation only when it is organized, properly updated, protected, and kept in an AI‑ready state that can be reused without interruption.
The NetApp intelligent data infrastructure sustains that ‘flow,’ AFX guarantees unstoppable high‑speed supply, and NetApp AI Data Engine continuously provides well‑prepared AI‑ready data through Find/Sync/Govern/Curate. Researchers are freed from the complexity of data management and can concentrate on true scientific discovery.
NetApp Session Video Link
Take a look at the video of the NetApp session presented by the author at the SCA HPC ASIA 2026 Exhibitor Forum.
AI for Science begins with data. And data begins with NetApp.
Notes
Masahiro Waki is responsible for strategic business development related to AI and DX at NetApp Japan. He is engaged in industry and alliance activities with technology partners across academic and research institutions, HPC, IOWN, life sciences, and M&E. In his previous role at Sony, he gained 15 years of experience working overseas in the U.K., France, India, and the U.S., promoting business development in storage, data infrastructure, broadcast media, and other areas. Attracted by NetApp's "Data Fabric" concept, he moved to his current position in 2021. He is a member of the board of directors of the Association of Motion Picture and Television Engineers of Japan and Vice Chairman of the RDM and Cloud Subcommittee of AXIES (Association for the Promotion of ICT in Universities). He is also a member of LLM-JP, LINK-J, and the Society for Digital Archiving.