Collaborate better, everywhere: Data caching with Amazon FSx for NetApp ONTAP

Yifat Perry

Today’s ever-expanding data estates and distributed teams working remotely have changed the demands put on data. Retrieving data rapidly and collaborating effectively on shared datasets across diverse environments is essential, no matter where the data is hosted.

That kind of data distribution can be a nightmare to orchestrate, with issues of data integrity, incompatibility, and performance all causing difficulties. Organizations need ways to accelerate data access and promote data agility and collaboration without those challenges.

NetApp and AWS have partnered to offer the solution: data caching with Amazon FSx for NetApp ONTAP. This post explores the data caching capabilities of FSx for ONTAP that help address the challenges of working with data that is dispersed across different locations and environments.

The challenges of distributed data

Distributed data presents several challenges that organizations need to overcome:

  • Data consolidation. When data is spread across locations, you need to consolidate the data from various sources. If you can’t view the data coherently as a single file system, you can’t efficiently read and write the data—or analyze it to derive any meaningful business value from it.
  • Multiple namespaces. Namespaces across data in different sources may not all follow the same conventions. To access and use data consistently, you need to unify the different naming conventions and structures, but this process can require intensive effort.
  • Performance tradeoffs. Making data available to all your users can be challenging, depending on the number of users, the distance between them and the data, and the various access protocols in use. For example, the farther users are from the dataset, the more latency they experience. You need a fine balance between low-latency access, optimized bandwidth, and cost. Creating data silos across different environments and geographies isn’t a solution: although it gives local users faster access to data, it causes synchronization problems.
  • Data replication. Data that’s replicated across multiple environments needs to be consistent and up to date. The biggest risk is that discrepancies arise during replication and undermine data integrity.
  • Cost increases. Distributed data presents some cost-related challenges. When full copies of data are hosted in multiple locations (for instance, in the cloud, at the edge, and in different data centers), each copy comes with its own costs attached. Costs associated with data transfers and central management also need to be considered.

Navigating the complexities of working with distributed data can be challenging. That’s where FSx for ONTAP can help.

Data caching with FSx for ONTAP

FSx for ONTAP is the fully managed storage service from AWS that delivers trusted NetApp® ONTAP® data management solutions.

FSx for ONTAP is equipped with data caching capabilities that enable faster access to data and seamless, real-time collaboration across multiple environments.

With FSx for ONTAP, you can create a writable, persistent cache in a remote location that presents the latest, consistent, and coherent copy of your data. These sparsely populated, writable cache volumes can be created on the same system as the origin volume or on a different one for quicker data access. NetApp FlexCache® technology makes this possible.

How FlexCache data caching works in FSx for ONTAP:

A diagram showing applications and compute resources interacting with file systems A and B. File system A has an origin volume with labeled data blocks. A read request for blocks L3, G2, and P2 is fulfilled from the cache volume in file system B, which contains a subset of the origin volume's data for efficient retrieval.

The cached data is accessible over Network File System (NFS) and SMB/CIFS, which means you can use the cached data without re-architecting your systems in any way. Caching is most beneficial in read-intensive environments where data is shared by multiple hosts and accessed more than once.

To optimize the size of the cached data copy, only the data read by the client is cached. Clients can mount any of the volumes to access the same up-to-date data from multiple locations. The cache volume acts as a temporary storage location between a host and the data source, and it stores the frequently accessed data chunks so they can be served faster than fetching them from the source.
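The read path described above can be illustrated with a minimal Python sketch. This is not how FlexCache is implemented; it only models the idea of a sparsely populated cache that fills on client reads. The class names `OriginVolume` and `SparseCacheVolume`, the block contents, and the extra block `X9` are hypothetical, and the sketch leaves out writes and cache coherency entirely. The block IDs L3, G2, and P2 are borrowed from the diagram.

```python
class OriginVolume:
    """Hypothetical stand-in for the origin volume holding the full dataset."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.reads = 0  # number of block fetches served by the origin

    def read(self, block_id):
        self.reads += 1
        return self.blocks[block_id]


class SparseCacheVolume:
    """Hypothetical cache volume: starts empty and fills only on client reads."""

    def __init__(self, origin):
        self.origin = origin
        self.cached = {}

    def read(self, block_id):
        if block_id not in self.cached:
            # Cache miss: fetch the block from the origin once and keep it.
            self.cached[block_id] = self.origin.read(block_id)
        # Cache hit (or freshly filled block): served from the local copy.
        return self.cached[block_id]


origin = OriginVolume(
    {"L3": b"layout", "G2": b"geometry", "P2": b"pixels", "X9": b"unused"}
)
cache = SparseCacheVolume(origin)

# Clients read three blocks, then re-read two of them.
for block_id in ["L3", "G2", "P2", "L3", "G2"]:
    cache.read(block_id)

print(sorted(cache.cached))  # ['G2', 'L3', 'P2']
print(origin.reads)          # 3
```

Block `X9` is never requested, so it never occupies space in the cache, and the two repeated reads are served without contacting the origin.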

Use cases for data caching with FSx for ONTAP

FSx for ONTAP’s data caching capabilities help accelerate performance and reduce operational complexity in scenarios where shared data needs to move seamlessly between locations and environments, including:

  • Data collaboration across global teams
    When teams are distributed across regions or remote business units, collaborating on large, shared datasets becomes challenging. FSx for ONTAP solves this by caching only the active portions of data where it’s needed. This supports fast, real-time collaboration across geographies and hybrid architectures without the need for full replication.
  • Bursting between environments
    Organizations with periodic or unpredictable compute-intensive workloads can burst to FSx for ONTAP, or from FSx for ONTAP back to on-premises ONTAP environments, depending on the changing compute needs in various project phases.
    • On-premises data centers to FSx for ONTAP: Industries such as electronic design automation (EDA), high performance computing (HPC), and AI and machine learning (AI/ML) can rely on AWS Cloud elasticity during simulation, training, or modeling phases. Data caching sends only the necessary working portions of datasets to FSx for ONTAP, reducing data movement and overall costs.
    • FSx for ONTAP to on-premises data centers: In media and entertainment, and gaming workflows, rendering and asset generation can be performed on AWS, while creative and QA teams access results from on-premises locations. Caching these results locally supports faster review cycles and reduces cloud egress costs.
  • Remote office or branch office (ROBO) access
    Enterprises with many branch sites often struggle with slow access to centralized file systems. FSx for ONTAP caches working data at the edge, providing local performance with central control. ROBO teams can work as if the data were stored locally while staying synchronized with core systems and aligned with regulatory requirements and governance rules.

Benefits of data caching with FSx for ONTAP

With FSx for ONTAP, you have a low-overhead solution for all your data caching requirements:

  • Accelerated data collaboration across teams and regions. Distributed teams can quickly and consistently access the same datasets without duplicating data. Whether working from edge locations or centralized offices, users experience local-like performance while staying in sync with the authoritative source, improving team productivity and reducing rework.
  • Quick access to remote data. Data caching makes remote data available closer to users—with minimal or no additional architectural requirements.
  • High performance. Data caching with FSx for ONTAP eliminates the latency challenges associated with accessing data from across the globe—without compromising data integrity or quality.
  • File locking. The FSx for ONTAP file-locking mechanism prevents parallel write operations that might cause data integrity problems.
  • Zero-touch setup. Datasets in all the different environments, both cached and at the origin, are kept consistent by FSx for ONTAP with no effort.
  • Data protection and resilience. FSx for ONTAP is highly available and resilient by default, using either a single or multiple availability zones to maintain uptime. With its automated cross-regional backup and disaster recovery features, data is available even if corruption or regional disasters occur.
  • Single namespace. FSx for ONTAP solves the namespace issue that occurs when data is stored in multiple locations. Data can be consolidated and accessed through a single namespace without the need for any infrastructure consolidation.
  • Reduced storage costs. Data caching with FSx for ONTAP saves space because it caches only active data, not full copies. Plus, built-in FSx for ONTAP storage efficiency features work with intelligent file caching. That reduces both storage and transfer costs.
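To get a rough feel for the storage and latency benefits listed above, a back-of-the-envelope estimate can help. The following Python sketch is a simplified model, not a sizing tool: the function name `estimate_cache` and every input value (dataset size, active fraction, hit ratio, and latencies) are hypothetical assumptions chosen for illustration, not measurements of FSx for ONTAP.

```python
def estimate_cache(dataset_tib, active_fraction, hit_ratio, local_ms, remote_ms):
    """Return (cache footprint in TiB, average read latency in ms).

    Simplified model: the cache holds only the active fraction of the
    dataset, and each read is either a local hit or a remote miss.
    """
    if not (0 <= active_fraction <= 1 and 0 <= hit_ratio <= 1):
        raise ValueError("fractions must be between 0 and 1")
    footprint_tib = dataset_tib * active_fraction
    avg_latency_ms = hit_ratio * local_ms + (1 - hit_ratio) * remote_ms
    return footprint_tib, avg_latency_ms


# Hypothetical inputs: a 100 TiB dataset with 10% active data, a 90% hit
# ratio, 1 ms local reads, and 80 ms reads from the remote origin.
footprint, latency = estimate_cache(
    dataset_tib=100,
    active_fraction=0.1,
    hit_ratio=0.9,
    local_ms=1.0,
    remote_ms=80.0,
)
print(f"{footprint:.1f} TiB cached, {latency:.1f} ms average read")
```

Under these assumed inputs, the cache holds about a tenth of the dataset and the average read latency falls well below the remote round trip. Real results depend on the workload's access pattern.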

Manage caching relationships in NetApp Workload Factory

Workload Factory is the free service for planning, deploying, and optimizing FSx for ONTAP resources so that they align with specific workload requirements and AWS Well-Architected Framework principles. When it comes to data caching, Workload Factory can help too.

You can manage FlexCache relationships with Workload Factory to maximize the value of the caching technology and streamline workflows.

From the Storage workload, users can access FSx for ONTAP file systems and see details of their FlexCache relationships in the Cache relationships tab. Here, you can discover all the existing cache relationships that have been created for an individual file system, including the cache volume origin and target, the cache status, and the cache size.

FSx for ONTAP environment details overview

With FlexCache management actions, users can edit and control cache configurations, including:

  • Changing a cache name, size, export policy, or caching method.
  • Pre-populating a cache.
  • Deleting a cache.

How CoMix Wave Films used FSx for ONTAP and data caching to accelerate movie production

CoMix Wave Films, the acclaimed Japanese animation studio behind Suzume no Tojimari, leverages FSx for ONTAP to power its high-performance production workflows. Known for producing visually intricate anime feature films, the studio faces intense infrastructure demands in the months leading up to each release.

To meet the performance needs of its animation pipeline while maintaining agility, CoMix Wave Films adopted a hybrid storage architecture that combines FSx for ONTAP with a local on-premises NetApp ONTAP system. Using FSx for ONTAP data caching, the team bursts data stored cost-efficiently on FSx for ONTAP to the on-premises system, which serves as a local cache. This solution provides fast write access to the AWS-based data from the studio’s local workstations while keeping the data synchronized.

With this approach, the production team can interact with high-resolution project files at local speeds, while maintaining a single authoritative dataset on AWS. As a result, artists and editors can collaborate seamlessly and efficiently with minimal latency, and the studio can scale storage capacity on demand during peak production periods.

Once the final edits are complete, production files are consolidated in FSx for ONTAP for seamless distribution to theaters and digital platforms. The solution supports both creative speed and operational efficiency, helping CoMix Wave Films deliver on its vision, on time and at scale.

Read the CoMix Wave Films customer story.

Bring your data and teams together with FSx for ONTAP

Your teams need a way to collaborate across your entire data estate without running into delays or creating data silos that drive up costs and harm data integrity. For a diverse data estate, it’s easy to do that with FSx for ONTAP and Workload Factory.

FSx for ONTAP uses data caching features powered by the NetApp FlexCache technology to deliver data caching as a seamless part of a first-party AWS service.

Build reliable distributed data architectures, keep your users in sync, and stop costs from spiraling out of control.

For more information, visit Caching data using Amazon FSx for NetApp ONTAP. To learn how to create a cache, see the NetApp Documentation.
