Apache Flink™ is an open-source framework designed for processing data streams and batch data. It is developed by the Apache Software Foundation and is known for efficiently handling both unbounded (continuous) and bounded (finite) data streams. Whereas Apache Spark was originally developed for massive scale batch processing and added real-time processing in the form of micro-batches, Apache Flink was originally developed to process continuous, unbounded streams of data and added batch processing later. Both support unified stream and batch processing, but achieve it in different ways. Apache Flink can maintain consistent states across processing tasks while maintaining high throughput with low latency, making it suitable for applications that require fast responses, such as real-time payment systems, fraud detection, machine learning, and real-time analytics.
Flink includes a fault-tolerance mechanism based on checkpoints, which makes state durable and allows jobs to recover from failures without losing data. Each checkpoint includes per-task state plus metadata, persisted to checkpoint storage, typically a distributed filesystem such as HDFS or S3; NFS can also be used. By default, checkpointing is disabled; it is enabled by setting a checkpoint interval in milliseconds. Checkpointing works together with replayable, durable sources such as Kafka, which can re-deliver records from a given offset after a recovery.
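As a minimal sketch, checkpointing can be enabled in `flink-conf.yaml`; the interval and directory paths below are illustrative placeholders, not recommendations:

```yaml
# Example flink-conf.yaml fragment (interval and paths are placeholders)
execution.checkpointing.interval: 60s            # trigger a checkpoint every 60 seconds
execution.checkpointing.mode: EXACTLY_ONCE       # default consistency guarantee
state.checkpoints.dir: file:///mnt/flink/checkpoints   # e.g., an NFS mount on ONTAP
state.savepoints.dir: file:///mnt/flink/savepoints     # default target for savepoints
```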
Flink can put enormous pressure on backend storage. Checkpoints are periodic snapshots of the stream's state, triggered at a configurable time interval (every N milliseconds) or by external events, and they can be configured to run incrementally. If a failure occurs, the streaming job is restored from the latest completed checkpoint. Flink treats each checkpoint as a snapshot of the stream and assigns it an ID; checkpoints should be externally persisted. Savepoints are conceptually similar to checkpoints, but they are user-triggered, user-managed snapshots used for manual recovery, similar in concept to backups; they are intended for upgrades or migrations. Checkpoints and savepoints serve different purposes but share the same internal snapshot mechanism. If the storage load is too heavy, requests queue up; Flink registers this as backpressure, which slows the flow of the job.
To preserve streaming state, the RocksDB state backend holds keyed state locally on each TaskManager (the worker nodes); during a checkpoint, that state, along with checkpoint metadata, is snapshotted to the configured filesystem or object store. RocksDB does small, random, latency-sensitive I/O. Local NVMe is strongly recommended; if using remote block storage, sub-millisecond p95–p99 latency is desirable for many workloads, though job-specific requirements vary.
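A sketch of pointing the RocksDB backend at local NVMe while sending snapshots to durable storage (key names per Flink 1.x; the directories are example placeholders):

```yaml
# Example flink-conf.yaml fragment (directories are placeholders)
state.backend: rocksdb
state.backend.incremental: true                       # enable incremental checkpoints
state.backend.rocksdb.localdir: /mnt/nvme/rocksdb     # RocksDB working files on local NVMe
state.checkpoints.dir: file:///mnt/flink/checkpoints  # durable snapshot target
```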
Often, storage for Flink spans multiple platforms: S3 for "cheap" checkpoint, source, and sink data; local NVMe drives for the RocksDB backend; and perhaps NFS storage for checkpoints. With unified NFS, SMB, S3, and NVMe block storage, ONTAP can serve the entire streaming flow of data for Apache Flink, consolidating streaming flows onto one universal data storage platform, reducing costs, improving data quality, and producing data products faster.
This post explains how to leverage ONTAP’s rich set of storage features to run Apache Flink on NFS or S3 and to serve the backend data on NVMe block storage.
The flow of a streaming job in Apache Flink involves streaming data from a source, processing by a Flink job that encodes business logic, and output to a sink such as a data lakehouse, a data warehouse, or an event streaming platform such as Apache Kafka™, among other data sinks. It is very common for Flink to consume events from a Kafka topic.
The test scenario is a real-time payment system, pictured below. This is a real-world scenario that many customers use in their services to financial institutions; it is also a common IoT pattern, streaming data from sites around the world. The testing emphasized functional validation over performance benchmarking, because Flink performance varies widely per use case.
With the simplest configuration (NFSv4.2, nconnect=1), we saw:
The surprising takeaway was that storage was not the bottleneck, but client-side TCP parallelism (nconnect) and network bandwidth were. It was very difficult to saturate the A70 cluster.
Flink checkpointing is highly parallel, file-based (many small files), and bursty across many subtasks. FlexGroup volumes distribute file data across many cluster nodes and parallelize metadata and data operations. In practice, FlexGroup's distribution lines up well with Flink's per-task checkpoint behavior. Advanced Capacity Management is recommended.
Linux NFS clients traditionally use one TCP connection per mount. Flink parallel checkpoints easily overwhelm that single queue. ONTAP NFS with nconnect opens multiple TCP sessions per mount:
vers=4.2,nconnect=4,rsize=1M,wsize=1M,hard,noatime
We experimented with nconnect to find a balance. Start with nconnect=4. Monitor your checkpoint performance. If backpressure occurs or latency is out of spec, increase nconnect and observe the changes. Consider enabling pNFS if limits are being reached on the data LIF. The Flink JobManager does not require a high number for nconnect, so nconnect=2 is recommended for the JobManager mount.
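As an illustrative example, an `/etc/fstab` entry applying these options might look like the following; the server LIF, export path, and mount point are placeholders. Note that mount(8) expects rsize/wsize in bytes, so 1M is written as 1048576:

```
# NFS mount for TaskManager checkpoint storage (names are placeholders)
svm-data-lif:/flink_ckpt  /mnt/flink/checkpoints  nfs  vers=4.2,nconnect=4,rsize=1048576,wsize=1048576,hard,noatime  0 0
```

Per the guidance above, the JobManager's mount would use the same options with nconnect=2.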
Network guidelines
For Flink nodes:
10 GbE works but limits aggregate throughput; 25–100 GbE is strongly recommended for large TaskManager counts. Enable receive-side scaling (RSS) to distribute network processing across cores, and use jumbo frames (MTU 9000).
Flink offers a native S3 filesystem backend for checkpoints and savepoints. ONTAP S3 is an S3-compatible object interface backed by FlexGroup volumes. ONTAP has two modes of operating S3: dual access, with S3 enabled on any NFS and/or SMB NAS volume, or standalone S3. When you create an S3 bucket, ONTAP creates and manages a FlexGroup for you, growing the FlexGroup volume when more capacity is needed and shrinking it when the buckets it hosts are reduced in capacity.
ONTAP S3 has many positives. There is no need for NFS client tuning. Object storage scales more naturally for large checkpoint data and avoids single-LIF NFS bottlenecks. S3 works well for large incremental checkpoints, as commonly configured with the RocksDB backend, and it simplifies Kubernetes deployments: each node simply accesses a URL rather than requiring its own NFS mount. Object protocols are also the most common access method for data engineering, analytics, and AI/ML tools.
By enabling dual access to files and objects, ONTAP S3 eliminates the need to duplicate your storage by copying the data out (and syncing) to a different S3 bucket. ONTAP S3 is strongly consistent and uses the native Flink S3 filesystem integration. ONTAP S3 Snapshots and SnapMirror give a common data protection, replication, and recovery engine across all data sets and types.
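A sketch of pointing Flink's native S3 filesystem at an ONTAP S3 endpoint; the endpoint, bucket name, and credentials below are placeholders, and the flink-s3-fs-hadoop (or flink-s3-fs-presto) plugin must be enabled:

```yaml
# Example flink-conf.yaml fragment (endpoint, bucket, and credentials are placeholders)
state.checkpoints.dir: s3://flink-checkpoints/payments-job
s3.endpoint: https://ontap-s3.example.com
s3.path.style.access: true        # address buckets by path rather than virtual host
s3.access-key: <access-key>
s3.secret-key: <secret-key>
```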
Recommendation
- Use ONTAP S3 when you want cloud portability or a hybrid architecture, when checkpoints are very large, or when you want a checkpoint format compatible with future cloud migrations.
- Use NFS FlexGroup when you have many small state files.
Many customers use NFS for checkpoints and S3 for savepoints.
Apache Flink recommends local disks for the RocksDB state backend because RocksDB produces small, random, latency-sensitive reads and writes, and local NVMe drives dramatically outperform remote filesystems on such workloads. However, with a proper NVMe SAN, ONTAP's extremely low latency (~0.5 ms) can satisfy this demand.
Flink can easily overwhelm poorly tuned storage. But with ONTAP, you can build a Flink architecture that delivers extremely low latency and scales with your compute cluster size. ONTAP also provides a common data storage platform to eliminate silos of storage. This eliminates duplicate tooling and processes. A standard, unified platform to serve Flink’s demanding streaming and batch data processing needs eliminates risk and simplifies governance as well.
With ONTAP unified data storage you no longer need different storage to run Apache Flink. Enable S3 along with your NFS, SMB, and block storage to satisfy all of Flink’s storage requirements and gain all the value ONTAP brings. Contact your NetApp specialists with any questions about running Apache Flink on ONTAP storage.
Win is a Data Solutions Architect with more than 25 years of experience in systems architecture and engineering. He is focused on developing open source data solutions across NetApp's cloud and on-prem portfolio of products and solutions. Previously, he was an Azure Cloud Solutions Architect, Global Technology Strategist, and Senior Solutions Engineer for some of the world's largest oil companies. In his spare time, Win reads a lot, especially about Systems of Profound Knowledge (Deming) and the Theory of Constraints (Goldratt), practices Iaido (Japanese sword), and likes to travel.