Hybrid cloud is no longer a side project—it’s the operating model for many enterprise IT teams. Applications span on-premises clusters, public cloud compute, Kubernetes, and managed storage services. Amazon FSx for NetApp ONTAP brings enterprise-grade data services to AWS, but it also raises a familiar challenge. How do you keep a clear, actionable picture of performance, risk, and cost across such a mixed estate without slowing down the business?
NetApp® Data Infrastructure Insights (DII) addresses that problem with a storage-first view of the world. It ties application behavior to the data layer, on premises and in AWS, and converts telemetry into insights that matter to operations and finance. Amazon CloudWatch remains essential for AWS resource monitoring, yet it was never designed to deliver deep, storage-centric analytics across hybrid environments. Understanding the difference is where the business value shows up.
The visibility gap in hybrid storage
Fragmented monitoring is expensive in ways that don’t show up on a budget line. Teams lose hours correlating events, production incidents take longer to isolate, capacity planning becomes guesswork, and cloud spending drifts upward due to overprovisioning. A hybrid estate that includes FSx for ONTAP is especially susceptible to these issues because:
- Storage performance is often the hidden dependency. Latency surfaces at the database or API layer, but the root cause lives in a volume, aggregate, or policy group.
- Workloads move. A VM or pod that ran on premises last month might now sit on Amazon Elastic Compute Cloud (Amazon EC2) or Amazon Elastic Kubernetes Service (Amazon EKS), using FSx for ONTAP. Historical context can vanish if tooling doesn’t follow.
- Ownership crosses teams. Site reliability engineering (SRE), platform engineering, storage, and finance view the same reality through different lenses.
CloudWatch offers strong coverage for AWS resources and is widely adopted. It provides basic FSx for ONTAP metrics, alarms, and dashboards, and integrates with other AWS services. What it doesn’t provide is end-to-end storage observability across on-premises and cloud environments, or deep context specific to NetApp ONTAP® software. That’s where DII steps in.
What Data Infrastructure Insights brings to the table
DII is designed for data infrastructure, not just cloud instances or logs. It discovers, maps, and analyzes storage platforms, virtualized compute, Kubernetes, SAN fabrics, and applications. For a hybrid estate that includes FSx for ONTAP and on-premises ONTAP, several capabilities stand out:
- Unified inventory and topology. DII automatically discovers file systems, storage virtual machines (SVMs), volumes, LUNs, aggregates, and their relationships to hosts, VMs, pods, and clusters. Instead of isolated charts, you get a topology that details who uses what and through which path.
- Deep ONTAP telemetry. Beyond high-level counters, DII collects metrics specific to ONTAP, such as read and write latency, cache hit ratios, quality of service (QoS) policy group behavior, protocol breakdowns, and aggregate utilization. This is technology context, not generic metrics.
- Workload to storage mapping. DII links application entities such as namespaces, deployments, or vSphere datastores to the exact volumes and LUNs that back them. When an SRE sees elevated API latency, the storage team can jump directly to the implicated volume rather than hunting through lists.
- Anomaly detection and performance policies. You can define policies that reflect service-level objectives (SLOs)—for example, latency above a threshold for a critical volume or contention on a specific aggregate. DII flags anomalies and correlates them with changes upstream or downstream in the stack.
- Capacity forecasting and rightsizing. DII analyzes growth trends at the file system, aggregate, and volume levels. It helps plan when to scale FSx for ONTAP throughput capacity, rebalance workloads, or adjust QoS, and it spots underused resources that drive unnecessary spending.
- Cost allocation and showback. By attributing consumption to applications and business units, DII gives finance and engineering a shared picture. That’s critical in hybrid environments where costs are spread across cloud services and on-premises hardware.
- Kubernetes awareness. DII understands persistent volumes, claims, and storage classes. It connects a pod or namespace directly to the underlying FSx for ONTAP volumes or on-premises ONTAP. That connection reduces time to isolate issues in microservices architectures.
All of this is delivered across both cloud and on-premises environments, in one system, and with consistent terminology.
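To make the capacity forecasting idea concrete, here is a minimal sketch of a linear trend projection for a single volume. It is not how DII computes its forecasts. The function name `days_until_full`, the one-sample-per-day input, and the choice of an ordinary least-squares fit are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def days_until_full(used_gib: Sequence[float], capacity_gib: float) -> Optional[float]:
    """Project days until a volume fills, given one usage sample per day.

    Fits a least-squares line through (day_index, used_gib) and extends it
    to capacity_gib. Returns None when usage shows no upward trend.
    (Hypothetical helper; not a DII or ONTAP API.)
    """
    n = len(used_gib)
    if n < 2:
        raise ValueError("need at least two daily samples")
    mean_x = (n - 1) / 2
    mean_y = sum(used_gib) / n
    # Slope of the fitted line, in GiB of growth per day.
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_gib))
    slope = sxy / sxx
    if slope <= 0:
        return None  # flat or shrinking usage: no projected fill date
    intercept = mean_y - slope * mean_x
    # Day index where the fitted line reaches capacity, then measured
    # from the most recent sample (index n - 1).
    day_full = (capacity_gib - intercept) / slope
    return max(0.0, day_full - (n - 1))
```

A straight-line fit is the simplest possible model. Real growth is often seasonal or bursty, so treat a projection like this as a prompt to look closer rather than as a commitment date.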
FSx for ONTAP, seen through DII
FSx for ONTAP is managed ONTAP. That means you can expect enterprise-grade NetApp Snapshot™ copies, replication, multiprotocol access, and QoS. It also means the platform emits the rich ONTAP telemetry that DII understands.
With DII, teams get visibility into:
- File systems, SVMs, volumes, and LUNs, including relationships among them
- Read and write latency at the volume and policy group level
- IOPS, throughput, and queue depth, broken down by protocol where applicable
- QoS policy group consumption, to validate service levels and catch noisy neighbors
- Aggregate capacity and performance, to avoid hot spots that lead to tail latency
- Replication relationships and protection status for recovery planning
DII can also map Amazon EC2 instances or Amazon EKS workloads that mount FSx for ONTAP back to the volumes they use. This becomes valuable when incidents span layers. For example, an application team reports timeouts in a payment service that runs on Amazon EKS. DII shows a spike in write latency on the backing FSx for ONTAP volume, correlates it with a batch job that ramped up IOPS in the same QoS policy group, and highlights an aggregate approaching saturation. Instead of a war room, you have a path to action: Move the batch job, adjust QoS, or rebalance volumes.
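The noisy-neighbor reasoning in that example can be sketched in a few lines. The snippet below assumes you already have average IOPS per volume for a baseline window and an incident window, both within one shared QoS policy group. The function name `top_iops_shift` and the volume names are hypothetical, and the logic is an illustration, not DII's correlation engine.

```python
from typing import Dict, Tuple


def top_iops_shift(
    baseline: Dict[str, float], incident: Dict[str, float]
) -> Tuple[str, float]:
    """Return the volume whose share of policy-group IOPS rose the most.

    Both arguments map a volume name to its average IOPS in one window.
    The result is (volume_name, change_in_share), where share is the
    fraction of the group's total IOPS in that window.
    (Hypothetical helper; not a DII or ONTAP API.)
    """
    base_total = sum(baseline.values())
    inc_total = sum(incident.values())
    if base_total <= 0 or inc_total <= 0:
        raise ValueError("each window needs a positive IOPS total")
    # Volumes absent from the baseline are treated as having a zero share.
    shifts = {
        vol: incident[vol] / inc_total - baseline.get(vol, 0.0) / base_total
        for vol in incident
    }
    worst = max(shifts, key=lambda vol: shifts[vol])
    return worst, shifts[worst]


# Example with made-up numbers: the batch volume crowds out the payment volume.
suspect, delta = top_iops_shift(
    {"payments_vol": 4000, "batch_vol": 1000},
    {"payments_vol": 4000, "batch_vol": 6000},
)
```

Comparing shares rather than raw IOPS keeps the focus on who is consuming the group's headroom, which is the question that matters when volumes compete under one policy group limit.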
How this differs from Amazon CloudWatch
CloudWatch is the native way to monitor AWS resources. It excels at collecting and alerting on standard metrics, logs, and events across AWS services. For FSx for ONTAP, CloudWatch surfaces file system and some volume metrics and integrates naturally with services like AWS Lambda, Amazon EventBridge, and Amazon Simple Notification Service. Those strengths are real, and many teams rely on CloudWatch for day-to-day AWS monitoring.
The gaps appear when you need storage-centric visibility across hybrid environments or insights specific to ONTAP.
This isn’t an either-or choice. Many organizations keep CloudWatch for AWS-wide telemetry and alerting while using DII to manage data infrastructure. The two complement each other when each is used for what it does best.
Business outcomes that matter
The point of observability isn’t prettier charts—it’s to support decisions that improve availability, performance, and cost. In a hybrid estate that includes FSx for ONTAP, DII helps in several concrete ways.
- Faster incident resolution. When the storage path is visible and linked to applications, you reduce time spent correlating symptoms to causes. That shortens both detection and resolution, and it lowers the number of teams pulled into incidents.
- Better capacity planning. By forecasting growth at the levels that matter to ONTAP and FSx for ONTAP, you avoid late upgrades and emergency tuning. You can schedule throughput changes, rebalance aggregates, and plan migrations with fewer surprises.
- Cost control without guesswork. DII spots idle or oversized volumes, identifies workloads that would benefit from QoS adjustments, and ties usage to owners. Finance gets clarity, engineering gets actionable data, and you curb overprovisioning.
- Risk reduction. With the Storage Workload Security capabilities that accompany DII, you can detect anomalous user activity against NAS shares and respond quickly to insider threats. Tying security signals to the storage layer shortens response time.
- Smoother modernization. During migrations to FSx for ONTAP or refactoring onto Amazon EKS, DII shows how workloads behave before and after the move. That evidence base supports cutover decisions and reduces rollback risk.
- Stronger cost accountability. Showback drives better behavior. When teams see how their choices affect performance and cost, they engage in tuning and cleanup on their own.
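As a rough illustration of showback, the sketch below splits a monthly storage bill across owners in proportion to the capacity their volumes consume. The function name `showback`, the tuple layout, and the proportional-by-GiB rule are assumptions. Actual allocation models may weight performance tiers, throughput, or protection differently.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def showback(
    volumes: Iterable[Tuple[str, str, float]], monthly_cost: float
) -> Dict[str, float]:
    """Split monthly_cost across owners in proportion to GiB consumed.

    Each volume is (name, owner_tag, used_gib). Volumes with an empty
    owner tag are grouped under "untagged" so the gap stays visible.
    (Hypothetical helper; not a DII API.)
    """
    used_by_owner: Dict[str, float] = defaultdict(float)
    for _name, owner, used_gib in volumes:
        used_by_owner[owner or "untagged"] += used_gib
    total = sum(used_by_owner.values())
    if total <= 0:
        raise ValueError("no consumption to allocate")
    # Round each owner's charge to cents for reporting.
    return {
        owner: round(monthly_cost * used / total, 2)
        for owner, used in used_by_owner.items()
    }
```

Keeping an explicit "untagged" bucket is a deliberate choice here: it shows how much spend has no owner, which is usually the first thing a showback review needs to fix.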
Practical steps to get value quickly
A platform like DII earns its keep when it becomes part of daily operations, not a side console. A few practices help accelerate time to value.
- Connect both sides of hybrid. Include on-premises ONTAP, VMware, and SAN fabrics alongside your FSx for ONTAP file systems. The cross-linking is where the insights live.
- Tag with intent. Consistent application and business unit tags on Amazon EC2, Amazon EKS, and volumes make it easier to allocate cost and build useful dashboards.
- Start with golden paths. Build a small set of dashboards and policies around your top five applications. Cover latency, IOPS, QoS policy group headroom, capacity, and protection status.
- Set meaningful performance policies. Tie thresholds to user experience, not just device limits. Alert only on conditions you would act on, and route those alerts to the right teams.
- Integrate with IT service management. Create incidents automatically for critical storage policy violations. Closed-loop workflows turn monitoring into action.
- Teach the map. Make the topology a standard stop in incident review. When teams use the same map, collaboration improves.
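One way to keep performance policies actionable is to alert on sustained breaches rather than single spikes. The sketch below shows that idea in isolation. The function name `sustained_breach` and its parameters are hypothetical, and this is not the syntax or behavior of DII monitors.

```python
from typing import Iterable


def sustained_breach(
    latency_ms: Iterable[float], threshold_ms: float, min_consecutive: int
) -> bool:
    """True if latency stays above threshold_ms for min_consecutive samples in a row.

    (Hypothetical helper; not a DII API.)
    """
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    run = 0  # length of the current streak of samples above the threshold
    for sample in latency_ms:
        run = run + 1 if sample > threshold_ms else 0
        if run >= min_consecutive:
            return True
    return False
```

With a 5 ms threshold and three consecutive samples required, a series such as 2, 9, 3, 8, 9, 10 would trigger, while isolated spikes would not. Choosing the threshold from user-facing objectives, as suggested above, matters more than the mechanics of the check.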
Bringing it together
Amazon FSx for NetApp ONTAP gives you mature data services in AWS. That investment pays off faster when you pair it with observability that understands storage and spans your hybrid estate. NetApp Data Infrastructure Insights provides that view and links it to both operations and finance. CloudWatch remains useful for AWS, but it won’t replace storage-centric analytics, cross-domain mapping, or hybrid planning.
The result isn’t just fewer pages at 2 a.m. It’s a steadier platform for growth. Applications move with confidence, teams spend less time on guesswork, and the business sees clearer trade-offs between performance and cost. In a world where data is the substrate of every service, that is the visibility that counts. Request a demo and free trial to see firsthand how Data Infrastructure Insights empowers your FSx for ONTAP estate.
No ransomware detection or prevention system can completely guarantee safety from a ransomware attack. Although it’s possible that an attack might go undetected, NetApp technology acts as an important additional layer of defense.