
Bridging data gravity and cloud ambition

AI training: NetApp and Union.ai


Sathish Thyagarajan

What do you do when the future of AI seems to be in the cloud, but your data is firmly rooted on the ground? It isn’t just a technical riddle; it’s a business and ethical crossroads confronting every organization that aspires to AI at scale. 

I’ve spent the past few years immersed in conversations with customers across healthcare, financial services, biotech, life sciences, and manufacturing. Their stories are remarkably consistent: The data that matters most—such as medical images, transaction histories, and confidential, proprietary designs—all resides on premises, protected by layers of governance, privacy, and sheer volume. Yet the promise of AI has never been more compelling, and the cloud offers an almost gravitational pull with its elastic compute, scalable GPUs, and speed to innovation. 

But as with all things in technology, the path forward is rarely linear. Instead, it’s about blending strengths, respecting constraints, and, sometimes, rethinking what it means to “move” at all. Together, NetApp and Union.ai can help you navigate between your AI goals and where your data lives. 

The unmovable-data dilemma

There’s a growing perception that the road to AI excellence simply involves pouring all your data into the cloud and then letting algorithms go to work. If only it were that simple. In reality, data gravity is a hurdle. Regulatory compliance, cost, and the sheer physics of petabyte-scale archives mean that for many organizations, data stays put by necessity, not by choice. The question, then, is not whether to migrate, but how to innovate with what you have and where you have it.

I often hear from teams who feel caught in the middle. They want to experiment, to iterate, and to bring the power of modern AI models and in-demand GPUs to bear on their most valuable data assets. But the technical, regulatory, and operational roadblocks are daunting. Should they wait for new hardware to be purchased and deployed? Should they attempt a risky migration? Or is there another way?

Rethinking hybrid: Where data stays and compute roams

The answer increasingly lies in hybrid thinking, which is not a compromise but a new foundation. What if your data could remain secure and compliant, right where it belongs, while your AI workloads stretch their legs in the cloud? 

This is the thinking behind the integration of the Union.ai orchestration platform with NetApp® FlexCache® software and Kubernetes-native storage provisioning through NetApp Trident™ software. It’s an approach that doesn’t force a binary choice. Instead, it quietly bridges the gap, allowing cloud-based AI training jobs to access on-premises data without requiring that massive data to be relocated or replicated. 

NetApp FlexCache, a caching technology, creates a read-optimized cache volume in the cloud that’s logically linked to your primary NetApp ONTAP® volume on premises, and it brings faster throughput with a smaller footprint. When a training job that’s running in the cloud accesses a file, FlexCache transparently retrieves it from the origin volume if it’s not already cached. From that point forward, reads are served locally from the cache. Writes, if any, are immediately passed through to the origin, maintaining consistency and compliance. 
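To make that read-through and write-through behavior concrete, here is a purely illustrative Python sketch of the semantics described above. It is not NetApp code; the class and method names are invented for this example, and the dictionaries simply stand in for the origin and cache volumes.

```python
# Illustrative model only: reads are served from the cache after the first
# fetch from the origin; writes pass straight through to the origin.
class ReadThroughCache:
    def __init__(self, origin: dict):
        self.origin = origin      # stands in for the on-premises ONTAP origin volume
        self.cache: dict = {}     # stands in for the cloud FlexCache volume

    def read(self, path: str) -> bytes:
        if path not in self.cache:            # cache miss: fetch from the origin
            self.cache[path] = self.origin[path]
        return self.cache[path]               # subsequent reads are served locally

    def write(self, path: str, data: bytes) -> None:
        self.origin[path] = data              # write-through keeps the origin authoritative
        self.cache[path] = data
```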

Union.ai orchestrates training workflows across both on-premises and cloud environments. These jobs request persistent volumes through the NetApp DataOps Toolkit, which provisions a FlexCache volume on ONTAP and then binds it to a Kubernetes persistent volume claim (PVC). To the training job, the data appears as a mounted file system path, with no awareness of whether the data is cached or fetched on demand. The result is a seamless hybrid experience: Data remains governed and secure on premises, while compute scales elastically in the cloud. 
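As a rough sketch of what that provisioning step can look like, the snippet below uses the Kubernetes Python client to request a PVC against a Trident-backed storage class. The namespace, PVC name, size, and storage class name ("flexcache-sc") are assumptions for illustration; in practice this request is typically driven through the NetApp DataOps Toolkit or the workflow configuration rather than written by hand.

```python
# Hypothetical sketch: request a PVC that Trident satisfies from a
# FlexCache-backed storage class. Names and sizes are illustrative.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running inside the cluster

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="imaging-train-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteMany"],        # NFS-backed ONTAP volumes support shared access
        storage_class_name="flexcache-sc",     # assumed Trident storage class name
        resources=client.V1ResourceRequirements(requests={"storage": "500Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="training", body=pvc
)
```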

This hybrid approach works because of the orchestration intelligence that Union.ai brings to the table. Union.ai is the company behind Flyte, the open-source orchestration platform that’s trusted by more than 3,500 organizations to power their most critical AI and data workflows. As the original creators and core maintainers of Flyte, Union.ai brings deep expertise in scalable, reproducible, and cloud-native AI/ML operations. This platform enables your teams to define, to schedule, and to execute complex AI pipelines with ease, whether the pipelines are running on premises, in the cloud, or across both. Union.ai helps your organization move faster with confidence, offering enterprise-grade support, governance, and extensibility on top of the proven Flyte foundation. In hybrid environments, Union.ai orchestrates your AI training and AI inference workflows seamlessly across infrastructure boundaries, so your data scientists and ML engineers can focus on innovation and not infrastructure. 
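For a flavor of what a Flyte pipeline looks like, here is a minimal flytekit sketch. The data path, task body, and workflow name are assumptions for illustration; a production pipeline would add container images, resource requests, and the mount configuration for the FlexCache-backed PVC.

```python
# Minimal Flyte sketch: a training task that reads from the mounted
# FlexCache-backed path. Logic and names are illustrative only.
from flytekit import task, workflow

DATA_PATH = "/mnt/imaging"  # assumed mount point of the FlexCache-backed PVC


@task
def train_model(data_dir: str, epochs: int) -> str:
    # Real training code would read files from data_dir; FlexCache fetches
    # any uncached files from the on-premises origin volume on first access.
    return f"model trained on {data_dir} for {epochs} epochs"


@workflow
def hybrid_training(epochs: int = 10) -> str:
    return train_model(data_dir=DATA_PATH, epochs=epochs)
```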

Together with NetApp FlexCache and Trident, the Union.ai Control Plane orchestrates workflows and enables a hybrid AI training model that is not just technically feasible, but also operationally elegant. This joint solution, and this blog co-written with David Espejo at Union.ai, factor in the realities of enterprise environments while unlocking the agility and scale of the cloud.

Diagram: NetApp and Union.ai hybrid AI training architecture

From theory to practice: A day in the life

Let’s consider a healthcare organization that is training an AI model on sensitive medical images while adhering to HIPAA regulations. In the past, this training might have meant months of planning and risk assessments that ultimately slowed the decision to proceed. But in this hybrid model, the workflow becomes almost routine.

The data remains protected on premises in a NetApp ONTAP system. A FlexCache volume is created in the cloud, whether on Amazon FSx for NetApp ONTAP, on Google Cloud NetApp Volumes, or in a self-managed ONTAP cluster. Trident, the Container Storage Interface (CSI)–compliant storage orchestrator, is deployed in both the on-premises and the cloud Kubernetes environments. It provisions persistent volumes from the FlexCache back end in the cloud cluster, along the lines of the sketch below.
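As a hedged sketch of that setup, the following defines a Trident-backed storage class in the cloud cluster with the Kubernetes Python client. The provisioner string is Trident's CSI driver name; the class name and the backendType value are assumptions for this example and would depend on how the ONTAP back end is configured.

```python
# Hypothetical sketch of a Trident storage class in the cloud cluster.
from kubernetes import client, config

config.load_kube_config()

sc = client.V1StorageClass(
    metadata=client.V1ObjectMeta(name="flexcache-sc"),
    provisioner="csi.trident.netapp.io",      # Trident CSI provisioner
    parameters={"backendType": "ontap-nas"},  # assumed NFS back end on FSx, NetApp Volumes, or ONTAP
)

client.StorageV1Api().create_storage_class(body=sc)
```

PVCs created against this class (such as the earlier "imaging-train-data" example) are then bound to FlexCache-backed volumes and mounted into the training pods.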

There’s no drama, no heavy lifting. Just a smooth handshake between storage, orchestration, and compute—letting your teams focus on science, not infrastructure.

The subtle power of not moving

What’s most striking about this approach is how it reframes the conversation about AI infrastructure. It’s not about “lifting and shifting” or about maintaining silos. It’s about building an intelligent data infrastructure that respects the realities of governance, cost, and performance, while still unlocking the potential of cloud-scale compute.

It isn’t just a technical win. It’s a cultural shift. Your data scientists and AI engineers are empowered to iterate and to explore, free from the friction and delay of infrastructure constraints. Your compliance teams also rest easier, knowing that control and auditability are preserved. And your business leaders gain a faster path to AI-driven outcomes, without compromise.

Looking ahead

As AI continues to shape the future, the organizations that succeed will be those that find harmony between where their data lives and where their ideas can flourish. Hybrid training with Union.ai and NetApp FlexCache is one example of this new balance, one that doesn’t force a false choice between innovation and control. 

And the hybrid model is just the beginning. As AI evolves, so will the infrastructure that supports it. Expect to see tighter integrations, smarter caching strategies, and even policy-driven workload placement, where orchestration systems like Union.ai dynamically decide where to run each part of your AI pipeline based on cost, latency, data security, or compliance. The future isn’t just hybrid; it’s adaptive. 

Is your organization grappling with the realities of data gravity and looking for a way to move faster without moving at all? Consider what’s possible when you let your data stay rooted and let your AI ambitions take flight. 

Start building the future of hybrid AI, on your terms. Explore more about NetApp AI solutions and Union.ai.  

Sathish Thyagarajan

Sathish joined NetApp in 2019. In his role, he develops solutions focused on AI at the edge and in cloud computing. He architects and validates AI/ML/DL data technologies, ISVs, experiment management solutions, and business use cases, bringing NetApp value to customers globally across industries by building the right platform with data-driven business strategies. Before joining NetApp, Sathish worked at OmniSci, Microsoft, PerkinElmer, and Sun Microsystems. He has an extensive background in presales engineering, product management, technical marketing, and business development. As a technical architect, his expertise lies in helping enterprise customers solve complex business problems using AI, analytics, and cloud computing by working closely with product and business leaders in strategic sales opportunities. Sathish holds an MBA from Brown University and a graduate degree in computer science from the University of Massachusetts. When he is not working, you can find him hiking new trails at the state park or enjoying time with friends and family.

