BlueXP is now NetApp Console
Monitor and run hybrid cloud data services
Thank you so much for being here today. Welcome. We are talking about modernizing your infrastructure with clarity and confidence. My name is Casey and I am a Product Marketing Manager for Data Infrastructure Insights, our monitoring and observability tool. I'm very excited to be your moderator this afternoon, or this evening, or this morning, wherever you are joining us from. We have a great group of speakers with us today: my colleagues Ronnie Friend, who's a senior manager for Global Strategic Technology Solutions, and Michael Peppers, who is our senior technical product marketing manager for ASA, one of our block storage solutions. I'll give them a quick opportunity to introduce themselves. Ronnie, go ahead.

Hey, everyone. Nice to meet you all. As Casey noted, I'm part of the go-to-market for DII. I've spent most of my career, a little over 20 years, in IT: first as a customer working around NetApp, compute, VMware, and Unix, growing from there to be a partner, and working in different roles in sales and whatnot. Now I'm part of the product team, so happy to be here. My goal in this session is to share some of what we are learning from other customers, a lot of the conversations we are having, and some best practices from working with our key customers.

Hi, I'm Michael Peppers, a senior technical product manager for ONTAP SAN. Another PM for SAN and I basically manage the ONTAP SAN roadmap and all aspects of development for ONTAP SAN. I started at NetApp 19.5 years ago and have worked pretty much exclusively on various forms of SAN during that time, starting in support, going through engineering, then technical marketing, and now finally senior technical product manager.
I'm fairly passionate about ONTAP SAN generally, and excited when I see things that can be brought to bear that improve both the ONTAP SAN offering and your experience, and optimize how well ONTAP SAN works for you. I'm excited about this because I think we have found one of those.

Very excited to be here with you all today as your moderator. On our agenda, we're going to talk a little bit about what we're seeing in the evolving SAN landscape, challenges we hear from customers when it comes to SAN management, a little overview of our block storage offerings, then how NetApp can help enhance visibility and deliver insights for teams, and then we'll look at some real-world outcomes. From here, I'll let Michael take things over and walk us through what we're seeing with the evolving SAN landscape.

Okay. So SAN is by far the bigger of the two large buckets in networked storage: storage area networks, or SAN, which is the older of the two, and network-attached storage, or NAS, which NetApp actually invented and introduced. An easy way to describe the difference between SAN and NAS is to simply ask the question: who owns the file system? If we, the networked storage, own the file system, then we're NAS. If the file system is owned by a server that we're presenting blocks to, which that server formats and partitions, then we are a SAN; they own the file system. So the difference between the SAN and the NAS is basically this: as a SAN, I'm presenting arrays of 4K blocks that you then get to partition and format however you like and put down a file system on. As NAS, I'm actually a volume that has one or more directories underneath and a number of files, and I'm presenting them to you much the same way a file server would. If that's the circumstance, then you are NAS.
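The "who owns the file system?" distinction can be made concrete with a toy sketch. Here the `BlockTarget` plays the SAN side: it only knows numbered 4 KiB blocks, with no idea which blocks belong together. The `HostFilesystem` plays the server side: it owns the mapping from file names to block numbers. All class and method names are hypothetical, invented purely for illustration.

```python
BLOCK_SIZE = 4096

class BlockTarget:
    """SAN side: an array of anonymous 4 KiB blocks, no file awareness."""
    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

    def write_block(self, lba, data):
        # The target stores whatever bytes arrive; it never interprets them.
        assert len(data) <= BLOCK_SIZE
        self.blocks[lba] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, lba):
        return self.blocks[lba]

class HostFilesystem:
    """Host side: owns the mapping from file names to block numbers."""
    def __init__(self, target):
        self.target = target
        self.file_table = {}   # file name -> list of LBAs (the "file system")
        self.next_free = 0

    def write_file(self, name, payload):
        lbas = []
        for i in range(0, len(payload), BLOCK_SIZE):
            self.target.write_block(self.next_free, payload[i:i + BLOCK_SIZE])
            lbas.append(self.next_free)
            self.next_free += 1
        self.file_table[name] = lbas

    def read_file(self, name):
        raw = b"".join(self.target.read_block(lba) for lba in self.file_table[name])
        return raw.rstrip(b"\x00")
```

The point of the sketch is that `file_table` lives only on the host: swap `HostFilesystem` for a directory-serving layer inside the storage itself and you have NAS instead of SAN.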
The one other possibility, at least with ONTAP, is that if we're serving S3, then we are doing object. All of those are possible with unified ONTAP. The other option for object specifically is that you can buy specialty equipment like StorageGRID.

Michael, I have a question for you. Of course. I hear often that SAN is dead. What's your take on that? I think it was Mark Twain who said something about the rumors of his death being a little bit premature. I would certainly say the same with regard to SAN. SAN is currently approximately a $20 billion a year market. Companies that want very high availability, high performance, and the ability to scale up networked storage typically are going to choose SAN for that; there's about three times the amount of storage in SAN versus NAS. It actually is growing, currently at around an 8% clip. The reasons customers overwhelmingly choose SAN are these: if you're looking for very low latency, very high and consistent performance, fairly high security, the ability to scale easily, and the ability to interoperate with virtually any application, SAN is going to be your answer. The reason is that when we are operating as a SAN, we're essentially providing, or advertising, LUNs to servers that have access to us. The server is going to look at the LUN, and it's going to look not very different than if you took a hard drive and installed it in the server. What the hard drive shows is pretty much the same as what you get with the LUN: I'm going to say, here are blocks zero through 5,000, or 500,000, or 5 million. You format them, partition them, lay down a file system, and then you keep track of them. We'll continue to manage those blocks, but we're just managing 4K blocks. We don't know which ones are related to one another.
And if blocks one through five make up this file, or block ten is maybe a subdirectory or something, as the SAN we're not aware of that. We're just presenting blocks as you request them. You, owning the file system, determine that you want this file, which translates to blocks one, two, three, and four, so you request those, and we respond by providing those blocks. The bottom line is that by doing that, we tend to have very high interoperability, we can scale to very large sizes, and we tend to be very fast with very low latencies. So no, I would say that SAN is absolutely not dead. By comparison, the NAS market, which is the next biggest market, is about a $6 billion market. We have about a 30% share of that market versus about a 5 to 7% share of the SAN market. So the SAN market is not only growing faster and much bigger, but there is a lot more opportunity for us to grow our share.

Got it. So I learned something about the growth of the market, and it's also linked to the survey results, where people are most curious and interested about performance and availability, which is aligned with what you said. Thank you for that. You're very welcome. SAN is typically in the background; it is not flashy the way, perhaps, cloud offerings were not very many years ago. There's always a new bright, shiny thing, and SAN has not been a new bright, shiny thing since, I don't know, the late 70s or early 80s perhaps. It's just been around, and foundational. Basically, I think I've covered a lot of the things on this slide. Seamless integration: you can integrate SAN pretty easily into a majority of workloads using a variety of operating systems and applications. The reason you can do that is that, as the SAN target,
I'm essentially speaking an almost universal language, in that I'm basically giving you a set of blocks, encoded in ones and zeros and arranged into bits and bytes and so on, in a way that virtually all OSes and applications are going to either recognize or abstract up to the application such that you wouldn't know what was behind the server for storage. You also can use SAN in a hybrid sense: you can be all flash, all spinning media, or some combination of both; you can have both in a cluster; you can consume some SAN in cloud, or bring your own device with something like ONTAP Select. So there are lots of different ways you can go about consuming this for the workloads you have and figuring out how you want to place them. That's why you see hybrid here, and the fact that you have a lot of strength in terms of how you want to deploy your SAN. SAN also tends to be quite flexible for pretty much the same sorts of reasons. You can deploy in a variety of ways: on prem, in cloud, in a hosted data center, in a shared space where you're just paying for the electricity and some number of racks but you own the equipment, or in a hosted facility where someone else is managing everything and basically charging you rent that covers their administration of the equipment. You can consume these in all of those different ways. It is a big enough and old enough market that offerings from different vendors around hosting SAN, hosting your workloads, or hosting your hardware are all possible, because this is a very large and fairly mature market.

So I wanted to talk about this slide, and the very first highlighted item is AI in SAN management. If you're looking at AI and thinking SAN, followed by a question mark, like, what?
SAN is not a particularly big player in terms of hosting storage supporting AI functionality. That market is very heavily dominated by NFS: ninety-something percent of all AI workloads are going to be supported by NFS configurations of some sort. With AI in SAN management, the key word is management. This is deploying AI in order to help you manage SAN: to help you determine thresholds or aberrant behavior, or to help analyze and localize where a problem is happening, and therefore more quickly determine what the problem is, and so on. AI tools can also help you with performance insights. That is to say, when you're all of a sudden seeing some latency, being able to pinpoint exactly which device it is, and then look back at what has happened with that device before and after the latency appeared; AI will help you significantly with that. You could do all of that without AI, but you can do it a lot faster and more efficiently with it. AI also is helpful for informed decision making about your configuration: whether you need to change anything, whether or not it is going to support a new workload, whether or not this particular cluster or node is a good landing place for that new workload. All of those sorts of things are going to have some AI decision support related to them. Lastly, but definitely not least: with any sort of IT, you definitely have some security concerns. The number and types of cyber threats out there keep increasing; there is lots of innovation in that space. Back in the day, it used to be that you were concerned about whether or not you had a virus, and you would use McAfee or Norton or something like that to scan for that virus's signatures.
At this point, that feels quite quaint, because now you have lots of other threat vectors that also fall into the common bucket of cyber threats. Those exploits are occurring faster, they're multiplying, and the amount of time you have to see and defend against them, and to recognize when you are under attack, keeps shrinking, to the point that without purpose-built tools and probably AI, it is going to be very difficult for you to keep up. A lot of it also has to do with what your vendor is actually doing, and what resources and R&D they're putting into securing their product and the data the product hosts. That falls into the next category, which is security measures. Just like with a Windows laptop, let's say a personal laptop: you would be crazy not to have some sort of antivirus on it. If you don't, it is not if, but when, you're going to be attacked by a virus. We've seen studies where, ten minutes after connecting a brand new Windows machine to the internet, virus attacks against it begin. So there are a lot of security measures that you need to put in place. In the quaint old days, it was as simple as making sure you had antivirus installed and that it was up to date with threat signatures. Now there is a lot more to it. Not only do you need to scan for viruses, you also need to be able to recognize when you are directly under attack. That could be with regard to viruses, or it could be because you're currently experiencing a denial-of-service attack or other attacks of that nature. And recently, what's been in the news in a big way has been ransomware, so you're starting to see anti-ransomware on servers and so forth.
You're also starting to see that on some SAN and NAS equipment from various vendors, including us at NetApp. It is a combination of those tools, a fairly aggressive security posture, and some proactive ability to recognize not simply that I have been attacked and this is what I need to remediate, or how do I clean off this virus, but how do I see that an attack is in progress, and how do I shut that attack down before it infects more of my estate? The way you do that is with lots of different layers. The good news is that NetApp is absolutely a security favorite, and that is a distinct advantage of ONTAP SAN and our security posture. We are definitely a favorite of certain three-letter agencies of the federal government that tend to be rather shy with their data; they definitely do not want it attacked, nor do they want it shared outside their four walls. Being a favorite of people whose top concern is security is pretty good.

Yeah, thanks, Michael. From here, we'll pivot a little bit into some of the challenges that we hear about from customers when it comes to SAN management.

Okay. So, you know, we can talk about the pain, but let's be honest: as Michael alluded to, this technology has been around for a long time. People have figured it out, and there is a lot of automation. I think the environments are pretty stable in most places; they are performant, they are automated, they are ready to go, and they're serving the most critical workloads on the planet, as Michael explained to us. So what are the challenges that we do hear? What we hear from a lot of customers, and I think we touched on this, is that changes do happen, because I need to upgrade the switches, I need to upgrade the hosts, I need to upgrade the storage.
So changes do happen. A lot of organizations know how to manage SAN, but the challenge is really not so much data complexity, because we know how to serve data; it's operational complexity. Customers are asking us for less toil: fewer screens and fewer consoles to manage. There is dashboard fatigue and alert fatigue. And the reality is that there is a limited number of SMEs, or subject matter experts, in the market. If you are a SAN engineer, it's great knowledge to keep, but the reality is that a lot of us are becoming more and more rusty, because there is not a lot of hands-on operational work that needs to happen day in and day out. And there are no kids graduating college today saying, hey, I want to be a SAN admin; the skill evolves through projects, through things that you need to do. So when I'm talking about all of those challenges, and about the need to improve performance and capacity: if you look at what the analysts say, and at what a lot of our customers say, there is this fear. Even if I've been doing it for 20 years, there's a fear of making radical changes in the environment. New switches, new frameworks, new zoning, new masking, new capabilities, new scripting frameworks get involved. There is a fear of change; it's day-to-day life as well. That fear of change matters with SAN because I don't manage just the storage; I manage all of the interdependencies: the fabric, the hosts, the switches, the HBAs that are associated with it. And as we said earlier, change needs to happen: I'm upgrading, I'm evolving, I need to remediate fabric and cybersecurity issues to avoid being exposed. Hardware breaks because of wear and tear. Software usually breaks because someone did something: a workload change, a config change.
So the idea there, and this is what we're hearing from a lot of analysts and from the analysis we have done ourselves, is that 85% of issues happen because someone changed something. That configuration change caused the downtime; it caused an HBA or a zone to be missing, or a path to no longer be redundant. That's one of the reasons we built the observability capabilities that we're going to talk about. Going to the next slide: analysts are telling us that SAN is here to stay. So we asked the question, is SAN dead? What analysts are saying is that SAN is here to stay through at least 2035. So it's good that we are building what we are building, and we're sharing a lot of that: AI for the ability to manage fabrics, AI for the ability to understand the complexity and simplify the management. What we're hearing from a lot of analysts is that it's good we are talking about how we're innovating in the space of not just SAN, but all of the interdependencies associated with managing that complex fabric, which sometimes has hundreds or thousands of relationships, because everything needs to be redundant. You may have dual zoning, triple zoning, maybe even quadruple, making sure that everything is operational, performant, and running. What we are trying, and aspiring, to do is take all of that complexity and make it simple. If I make it very easy, I no longer need to be an expert to figure out what's going on, jumping up and down on every single ticket or every single provisioning event. I can take someone who, as we discussed a minute ago, just finished college, put them in front of the screen, and they start to trace the clues. They have a process, a workflow they can follow, to figure out what's going on and simplify that management.
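The "85% of issues follow a change" point suggests a natural first triage step: given an incident timestamp, list every recorded configuration change in the window leading up to it. This is a minimal sketch of that idea; the field names and the change feed are invented for illustration, and a real tool like DII would pull change events from its collected inventory rather than a hand-built list.

```python
from datetime import datetime, timedelta

def changes_before_incident(changes, incident_time, window_minutes=60):
    """Return the changes recorded within `window_minutes` before the
    incident, newest first, so the most recent change is examined first."""
    window_start = incident_time - timedelta(minutes=window_minutes)
    hits = [c for c in changes if window_start <= c["time"] <= incident_time]
    return sorted(hits, key=lambda c: c["time"], reverse=True)
```

The design choice here is simply ordering by recency: if most outages are change-induced, the change closest in time to the incident is the most likely suspect.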
And that needs to happen so I can be an IT generalist; we see the motion in the market toward being more of an IT generalist about everything we do. So that's how we're thinking about it, and we're going to go a little bit deeper into those areas, and how we achieve that, in today's webcast.

Yeah, Ronnie, you actually hit the nail on the head when you talked about SAN expertise becoming more scarce. The problem is that the really experienced SAN admins are beginning to retire. They're aging out of the market, and the people being hired into IT roles generally may get some responsibility for SAN, but they do not have all the deep knowledge that the belt-and-suspenders SAN guy had before he took retirement. So you might be able to do certain common tasks, create a LUN, for instance, or protect it with replication, those sorts of things. However, you may not be nearly as comfortable with zoning something new, or even with what zoning is in a Fibre Channel SAN. Probably 65% of the problems that we see involve a zoning problem of some sort: you don't have a common zone, something is misconfigured about the zone, you're using multiple initiators in the zone, or you've identified the wrong endpoints. There's a variety of different things, but that is the single biggest common problem area in Fibre Channel SAN. And Fibre Channel, by the way, is close to 70% of SAN, which, also by the way, is sometimes called block storage. So about 70% of the time, what you're encountering is Fibre Channel. The rest would be iSCSI at about 20-ish percent; you've got some NVMe; and, believe it or not, there's still a little bit of FCoE left, but the NVMe and FCoE are rounding errors. The main place you see FCoE is where there's a Cisco UCS somewhere, because Cisco uses it. I agree with that statement.
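Two of the zoning failure modes named above, a missing common zone between initiator and target, and multiple initiators sharing a zone, lend themselves to an automated sanity check. This is a minimal sketch of such a check, assuming zones are modeled as named sets of WWPNs; the data layout and function name are invented, and real fabric tooling would read the active zoneset from the switch instead.

```python
def zoning_issues(zones, initiator_wwpns, target_wwpn):
    """zones: dict of zone name -> set of member WWPNs.
    Returns a list of human-readable problems, empty if the zoning looks sane."""
    issues = []
    # Check 1: single-initiator zoning — flag any zone holding >1 initiator.
    for name, members in zones.items():
        inits_in_zone = members & initiator_wwpns
        if len(inits_in_zone) > 1:
            issues.append(f"zone {name}: multiple initiators {sorted(inits_in_zone)}")
    # Check 2: common zone — every initiator should share a zone with the target.
    for init in sorted(initiator_wwpns):
        if not any(init in m and target_wwpn in m for m in zones.values()):
            issues.append(f"initiator {init}: no common zone with target {target_wwpn}")
    return issues
```

Checks like these catch the misconfiguration before it surfaces as a "host can't see its LUN" ticket, which is exactly where a generalist with no fabric background would otherwise get stuck.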
And our job here at NetApp is to make sure that block storage is accessible to everyone, whether you're a generalist or a specialist. Go ahead, Casey. Yeah. All right. Well, with that, let's talk a little bit about block storage from NetApp. Let's walk through just a high-level view. Michael, I'll pass it back over to you.

Okay, sure. One of the primary initiatives that we've had with our SAN roadmaps, and this goes back probably the better part of a decade at least, is that we have looked very closely at how we simplify management of SANs and the peripheral, or adjacent, structures that are required to support that SAN. One of the areas there is that we look for the sorts of decisions where we're going to make the same choice 90-plus percent of the time. If, for the vast majority of use cases, the answer is X, is this a decision we could just have ONTAP make instead of asking the customer? For example: what physical port do you want to put this LIF on? We already know what the best port is for it, so why bother asking the customer, who probably knows, but every now and then maybe fat-fingers something, and now you have a misconfiguration. Whereas if we just do it based on what we know, you are pretty much going to get a cookie cutter: every time we encounter this, we do it the same way, so you can expect the same sorts of configurations going forward. Visibility is all about seeing. You have your various SAN objects that are related to one another, and some are consumed by others in order to have a SAN product; the ability to see what each of those is doing and how each is being consumed is critical. That gives you a lot of visibility into: (a) is the configuration working? (b) is it performant?
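The "let the system pick the port" idea above boils down to encoding the rule you would apply 90-plus percent of the time. As a hedged sketch only: suppose the rule is "least-loaded port that is online." The rule, field names, and function are all assumptions for illustration; ONTAP's actual placement logic is more involved and is not reproduced here.

```python
def choose_port(ports):
    """Pick a home port automatically instead of prompting the admin.
    ports: list of dicts with 'name', 'online' (bool), and 'load' (0.0-1.0)."""
    candidates = [p for p in ports if p["online"]]
    if not candidates:
        raise RuntimeError("no online ports available")
    # Deterministic rule: least-loaded online port wins every time,
    # which is what makes the resulting configurations cookie-cutter.
    return min(candidates, key=lambda p: p["load"])["name"]
```

Because the rule is deterministic, two admins provisioning the same workload get identical configurations, which removes the fat-finger class of error described above.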
And (c), what are the areas where I could do something to increase performance or reduce latency? Simplifying management tasks is very much along the same lines as simplifying or streamlining operations generally. The simpler I can make it, and the more decisions I can take automatically, the easier our product is for you to manage. Also, the more consistent configurations are, the lower the likelihood of an incorrect decision impacting a configuration that has to be changed later, so you're not injecting additional errors in the process. This slide is basically saying that complexity is a silent killer, and that's true not simply across IT but across all sorts of things. The more complex you make something, the more opportunities and the larger the number of places that thing can break, in some cases in a compound way. We've all seen this before: you're troubleshooting some sort of issue and it turns out it is not one issue, it is multiple issues, in some cases not even related to one another. You're seeing this component broken in this way, and also possibly in this other way, and you have to recognize that one is not causing the other. Then, how do I go about fixing both as quickly as possible without either creating additional issues or making the other a bigger problem than it already was? In this slide, you're seeing a variety of different application environments: high performance file, virtual machines, databases, Kubernetes, etc. For SAN, we don't play in high performance file; that is all about NAS, so NFS and SMB. Virtual machines: about 70% of ONTAP SAN, and probably a similar percentage of anyone else's SAN, is servicing virtualized workloads. That may be VMware, and VMware is definitely the 800-pound gorilla, but there are lots of other virtualization options as well: Hyper-V, KVM, Xen. So there's a variety of them.
And the more of them you actually deploy, the more complex that particular area is. If you deploy only VMware ESX 8.3, then at least you don't have the complexity of various different solutions that do things differently, so you limit what you have to troubleshoot for that solution. The downside is that you potentially give up features, and you definitely give up negotiating headroom when it comes time to buy additional licenses, upgrade, and so forth. Databases are another very big area for SAN. Most common would be Oracle, but there are other databases as well: SAP is very big in the NetApp portfolio, and then there's a variety of NoSQL databases, Microsoft SQL Server, as well as a lot of custom-built stuff. We're seeing Kubernetes become a bigger and bigger player. Kubernetes is another way to virtualize, by creating containers, and having all of the workflow exist within the container gives you a lot of flexibility and portability for those sorts of workloads. It also makes it very easy to automate the provisioning and management of that container, so it is getting more and more popular. You have AI and ML, and data lakes generally speaking; these are typically seen more on the NAS side, and the most common way to support AI is going to be NFS. There is a little bit in block, and it is growing, but right now it is very much a bit player. You also have cloud storage. I bet all of you have probably heard of some cloud hyperscalers; there's a reason for that, and it's that the growth has been explosive. And then of course you have dev/test, which you can do much more easily and much more accurately, since you can do things like clone the production workload and then host that somewhere other than production.
So, for instance, you could have a production workload that is on site in your data center, perhaps using Fibre Channel. You could create a dev/test clone, perhaps using FSx or Azure NetApp Files or something like that to host that dev/test. They are absolutely air-gapped from one another, but you're working on exactly the same thing that is in production, so you don't have simulation types of errors. Another thing that we are trying, and not just trying but succeeding at, is increasing the performance that you see. We're increasing it by bringing better, bigger hardware to bear: more CPU cores, higher-speed CPUs, more memory, the basic building blocks of a storage controller. So that is a piece of it. In addition, we've refactored ONTAP code fairly continuously between releases in order to drive additional efficiency and change the algorithms so they are more efficient and responsive. So those are a couple of areas. And then, of course, at the end of the day you're not buying a SAN or a NAS. What you're buying is storage that will support a given application. The whole thing driving the entire experience is that you have an application you want to support. How do I do that? Do I do it with a NAS or a SAN? And what sorts of characteristics and features are going to make that hosting better, more efficient, more user friendly, etc.? Some of that is going to be as simple as choosing a cloud provider that you want to work with: some common ones would be Azure, AWS, and Google, as well as somewhat smaller or more specialized ones like IBM Cloud. So, where do you want to place your workloads?
How do you want to store them, and how do you want to serve them? Those are all questions you're going to answer in order to figure out how to do this as efficiently and with the least amount of effort possible, while also optimizing performance. Something ONTAP SAN has absolutely been a leader in is offering ransomware protection. With the existing ASA, we offer the ability to create a vault on remote storage, so that if you copy anything to that vault, you cannot change it until the compliance clock on that vault expires. Up until then, you can read it as much as you want, but you can't write or modify it in any way. That was the initial ransomware protection. We are going to augment that with the ability to detect that you are currently under attack, and allow that detection to trigger a variety of actions in order to stop or reduce the impact of that attack.

Let's jump into how folks are managing, and how our Data Infrastructure Insights product can help. Yeah. So let's talk about it. I think the first thing was unified monitoring for a holistic SAN overview. Holistic means a lot of different things, so let's talk about what it means to me. To me it means the ability to use machine learning to build the topology that we see there on the chart, and the ability to do it across NetApp and across other vendors as well; we understand, as Michael alluded to, that we're not living alone in a silo, in a bubble, within SAN. Once you have the topology, think about it: we have the inventory, we have all of the performance metrics, and I have a three-dimensional view like we're showing there. Then I can start correlating items. I can start understanding who does what, and where, and answer all of those different questions. And I can identify zoning and pathing issues and make sure that everything is compliant, everything is running, and everything is performing.
At that point, I can start understanding what changed in my environment, and as we said earlier, 85% of issues are because someone changed something. So we can track those changes across the heterogeneous, holistic environment. And then I can start putting AI on top of that. That's something all of you were asked about in the polls, and it was second on the list: how do I take AI and apply it to my Fibre Channel and storage management? So I start looking, and I will be honest with you: AI is very noisy. I can set up AI to be super sensitive, because machine learning and AI understand patterns. They understand one, two, three, so they know the next thing is four; they can speak human language; they understand patterns. So you can make it very noisy. What we have done at NetApp is apply seasonality, because we are mostly supporting IT workloads, or workloads that have some sort of seasonality. It can be hourly seasonality, daily, weekly, monthly; we allow you to control those seasonalities, and then we start managing the environment with them. An example, specifically on SAN: if I need to manage the connectivity between a host and storage, or the connectivity between two different fabrics, I need to look at the fabric buffers and monitor them. What threshold do I put in? A million? Five million? I don't know. But with machine learning, I can start to monitor things that I don't really understand, and don't really know how, or what, to monitor. We have a customer, and Michael, I will steal your thunder there; we have a joint customer, Michael and I, and what they are doing is monitoring the power consumption, the amount of power drawn by the HBA card, so they can understand when the HBA is going bad.
Because once it starts being flaky on the amount of power it's drawing, they have an early indicator that the HBA is going to go south. And cooling is another metric that we don't know how to monitor and put thresholds on. So machine learning, looking at seasonality in a way that fits infrastructure workloads, is what allows us to do that. So Casey, if it's okay, can I share my screen and demo? So what are we seeing here? I got an alert, and this alert came to us from an application. Over here, I can sort on whatever application I have in my environment. Once I zoom into that application, I'm only going to get what I need to monitor and see. I can do it based on the host, the fabrics, everything that I'm monitoring. I can then filter, and in real time our machine learning is going to build that topology. I can now look at performance. The line is going to shift based on the different devices that I have. So this is a very simplistic view; in most customer environments you're going to see more VMs associated with a host. Those are the fabrics, and once I hover over them, well, we talked about streamlined operations before, and that was the ask. So I can basically see everything that I need to see on the fabric. This is my storage node, and this is the underlying disk that is connected all the way up. So if I zoom in to that VM, and assuming I'm not that sophisticated in SAN management, that it's not my environment but someone else's, I want to be able to see what's going on. So what do I see? I see high latency here. Oh my god, oh my god, freaking out. I have no idea what to do, because I don't see a spike in IO. So what the hell is going on here? And that's typical, right? I see high latency, I don't see any shift in IO. Now we start finger pointing. I'm pointing at the VM team, they're pointing at the application team, everyone is starting to blame each other.
The drama starts: mean time to innocence. I was an expert in it. I ran a team whose job was to put presentations out there, and they were amazing, to prove that it's not our fault. And that's a very typical organization. Everyone is shaking their head, no one takes ownership, no one takes responsibility. And you know what? The storage team is always to blame. Why? Because we are the nicest people on the planet. That's the only reason they're coming to us. No one goes to the networking team and blames them. Why? Because they're not as nice as the storage team. So over here, it immediately tells us what's going on. There is workload contention. There is a bully and a victim: one workload basically bullying everyone else. So let's see what actually happened. Over here we see there was a change. I can click on that change and zoom into it, and I can see that this VM, store DB backup, was not running, and the status on vCenter changed to powered on. So I'm now tracking what vCenter saw as the change on that VM. So now it's powered on. Let's go back to the performance view. It was powered on, and then an alert was triggered. The alert says "abnormal," meaning I caught an anomaly in the latency of that VM, something we never saw before. We tracked back two months, and we never saw such high latency on that VM. It's the first time we see this latency, and we see it here with our human eye: powered on, and all of a sudden in that chart you see throughput and IOPS start. It was not there before, because while powered off it didn't report anything. So once it was powered on, let me zoom out, a second after that there was an alert triggered. Let's take a look at the storage. No, I don't need to log into the device. I can see that the latency alert triggered because a quality-of-service policy was breached; there was a quality-of-service policy that did not allow me to go above 2,000 IOPS here.
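The pattern Ronnie describes, latency up while IOPS refuse to rise, is the classic signature of a QoS cap. A minimal sketch of that diagnosis, purely illustrative (the function name, the 5% "pinned" tolerance, and the sample inputs are assumptions, not DII's logic):

```python
def qos_limited(iops_samples, latency_ms_samples, qos_limit_iops,
                latency_slo_ms, tolerance=0.05):
    """Heuristic: a workload is likely throttled by a QoS cap when
    latency breaches its target while IOPS sit pinned just under
    the configured limit (illustrative sketch only)."""
    if not iops_samples or not latency_ms_samples:
        return False
    avg_iops = sum(iops_samples) / len(iops_samples)
    avg_latency = sum(latency_ms_samples) / len(latency_ms_samples)
    # "Pinned": averaging within `tolerance` of the cap.
    pinned_at_cap = avg_iops >= qos_limit_iops * (1 - tolerance)
    return pinned_at_cap and avg_latency > latency_slo_ms
```

For example, a VM averaging 1,995 IOPS against a 2,000-IOPS policy with latency far above its target would be flagged, while a lightly loaded VM would not. The point of the demo is that the platform does this correlation for you, across the fabric, vCenter, and storage views, without logging into any device.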
So I was limited by quality of service. Immediately, think about how much information I have. I have the fabric view, those are the ports, this is the vCenter, this is the storage console. I see everything without leaving the screen. Why is this important? I don't have context switching; I'm just operating, and I see everything happening at the same time. And all of a sudden those alerts disappear. What happened? The VM was migrated, the datastore was changed. Someone moved it from a tier-three datastore to a tier-one datastore, where the quality-of-service limit is higher. And again, I did not leave the screen to see what's going on. I was able to fix the issue, and I was able to tell the whole team that had been sitting on the call since last night what was happening, what to do, and how things were performing. And I'm monitoring all of that without access to a single device. Back to you, Casey. Thanks, Ronnie. Perfect. Well, let's start to wrap things up here. Appreciate all of this. Ronnie, I just want you to talk us through a real-world scenario where we saw a customer benefiting from our block storage and Data Infrastructure Insights. Yes. Michael and I were brought into a customer. This is a very large, maybe a little bit abnormal, but very large customer. Michael and I know this customer very well, and the customer has great expertise in SAN. They know SAN. They have one exabyte of capacity, 100,000 Fibre Channel ports across 1,200 Fibre Channel switches. But periodically, this customer had an issue. There was a backup that was kicking in, and their database, which runs a lot of really critical analysis, was having a performance issue impacting the environment. Hmm. Interesting. There was a support call with NetApp; that's how we got involved. There was a support call with Brocade; that's how they got involved.
There were many support calls, and we're sitting there trying to figure it out. What we decided was to deploy DII, and we rolled it out to this environment. And what we saw is that there was a mismatch between the two links, the two dark fibers they had between sites. They fixed that configuration, and they said, we think we already fixed that. They looked at the change management, and they had fixed it already. But, you know, maybe someone didn't do their job and didn't fix it correctly. The distance setting was wrong. So they fixed it, and we thought everything was fine. The batch job was working, everything was working. Yes, they paused the backup, but everything worked. A few weeks later, all of a sudden, all of the drama starts again. What the hell? They're calling everyone. All hands on deck. Everyone sits in the room, everyone is yelling, everyone is screaming, what the hell is going on? I cannot curse here. So they're sitting there trying to figure it out. They log into DII, and what we saw was that once there was high utilization on that dark fiber port, it flipped back to the default configuration. How would I know that? We were able to track the change; we were able to show the change. Then they called Brocade, and we found a firmware bug, an actual firmware bug. So they were able to fix it, and the environment is running. But the root of that issue was a change. Even hardware issues can cause configuration to be changed without change management, without approval, without anything. Understanding this across so many Fibre Channel ports, so many hosts, so much capacity is more than challenging. I think it's pure luck to be able to find such an incident without proper tooling, without proper visibility, without proper capabilities. It's so difficult. And that's a customer that has a lot of expertise in SAN.
I think for smaller customers that may not have so many superstars with SAN, it's going to be much harder. So the outcome of this: we do need tools, proper processes, and standard operating procedures to be able to successfully manage the environment, to get clarity and control over what we are doing, and to gain that visibility. But visibility is not enough. A line on a chart is not enough. The insight is what we need. The root cause is what we need. And we took the things that we developed for Kubernetes and the cloud at NetApp, and, you know, we shifted. We were very focused on the cloud, and we brought those new innovations back to how you manage the most basic and most critical workloads in the industry. We took all of that innovation and we are applying it day to day so you can enjoy it. You can deploy it and you all can be heroes as you adopt more and more NetApp technologies out there. Okay. Thank you. Thanks, Ronnie. I'm going to wrap things up here so we can hopefully get to any final questions that come through, because I know we're bumping up on time. I just want folks to know, if you're interested in learning more about any of this, we've got some great resources. We'll put these links into the chat for you, and they'll come out in our follow-up email after the webinar. If you would like to request a personalized demo of Data Infrastructure Insights, or dig a bit deeper into everything Ronnie was highlighting for more simplified management and clarity, you can visit NetApp. We also have a great decision brief that GigaOm put together that speaks to DII and our block storage and how they work together. And StorageReview also did a great technical article, a really deep dive into how everything works together.
And we also have test drive capabilities where you can come in and do a hands-on lab of our ASA and E-Series block products. So all great things that are available to you. Ronnie, one question I wanted to ask you in particular, since it's DII specific: does DII actually make any physical changes to the customer's environment? Great question. So we are out of band, we are read-only, and we are agentless. The deployment is pretty easy. There are no changes. But we have a lot of customers who are driving automations: we have things like time to full, we have recommendations on workload placement. These can be extracted from DII to drive more intelligence into automation workflows and things like that. But the deployment piece is extremely easy. It's read-only, it's non-disruptive, it's metadata. We can work with any security team to get it deployed. We've gotten past security reviews with customers that do not have internet access. So it's fairly easy to get up and running, and we have done it with the most sophisticated, most secure customers on the planet. So happy to work with any one of you to get it up and running. It's also worth pointing out, in your SAN environment, for instance, you're going to have servers from various vendors. They're going to be running operating systems of various sorts: Linux, Windows, ESX, Solaris, AIX, you name it. Those are going to be connected by at least Ethernet switches and frequently Fibre Channel switches as well. And one of the things that you get with DII is that DII is looking at all of that. If you are not looking at all of that, then you're basically looking at 20% of the data and hoping the answer to whatever your issue or question is lies in that 20%. It might not be.
What DII does is look at all of that, look for anomalies, and also reduce the scope of the problem so that you can focus on the things that are important, instead of looking through acres of wheat to find the needle. DII can basically say, okay, focus on this cubic inch of wheat, because that's where the needle is. And so it very rapidly accelerates your ability to determine what the problem is, scope where the problem is, and then start working on, okay, how do I fix this? And then let's wrap up with one more question. Speaking of anomaly detection, and Michael, you just alluded to this too: can you give us a little more clarity on how the anomaly detection feature you were showing differs from typical threshold monitoring? Yeah. So anomaly detection is AI based, right? With a threshold, it's really hard to set up, because what is a good latency? As an example, is everything above 20 milliseconds bad latency? What we're hearing from a lot of customers now is that they're okay with 20 milliseconds of latency. By the standard, by the book, it's not okay, but it could be a sequential workload; it could be that that's the expected norm for that workload, right? So we don't fix what ain't broken. So the ability for us to put in dynamic thresholds is the key here. But I gave a few examples, right? How do you monitor cooling? I would assume that cooling over the weekend, if the data center is a little bit cooler, it's better. But the machine doesn't know that. The machine doesn't know that it's expected to be 100 degrees over the weekdays and, you know, 70 or 60 degrees over the weekend. The machine doesn't know that. So machine learning and anomaly detection allow us to have a dynamic threshold based on the history.
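The dynamic-threshold idea described here can be sketched very simply: bucket the history by its seasonal period (say, hour of the week), then flag values that are far from that bucket's own mean. This is a conceptual illustration only, not DII's actual algorithm; the function names, the hour-of-week bucketing, and the 3-sigma cutoff are all assumptions:

```python
from collections import defaultdict
from statistics import mean, stdev

def build_seasonal_baseline(samples):
    """samples: iterable of (hour_of_week, value) pairs from history.
    Returns {hour_of_week: (mean, stdev)} -- one baseline per bucket."""
    buckets = defaultdict(list)
    for hour, value in samples:
        buckets[hour].append(value)
    # Need at least two points in a bucket to estimate its spread.
    return {h: (mean(v), stdev(v)) for h, v in buckets.items() if len(v) >= 2}

def is_anomaly(baseline, hour_of_week, value, k=3.0):
    """Flag a value more than k standard deviations away from the
    seasonal mean for its hour-of-week bucket."""
    if hour_of_week not in baseline:
        return False  # no history yet: stay quiet rather than noisy
    mu, sigma = baseline[hour_of_week]
    return abs(value - mu) > k * max(sigma, 1e-9)
```

With this shape, a latency of 100 ms at Tuesday 10 a.m. can be normal while the same latency on a quiet Sunday night is anomalous: each bucket carries its own expectation, which is exactly why no single static number works. Static thresholds still apply where a vendor or regulation mandates a fixed KPI, as the next answer notes.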
We can now start predicting the future. So that allows us to monitor things that we don't know how to put the right threshold on. But there are things that require static thresholds. It doesn't mean that AI is going to replace everything. If I'm in a regulated environment and I have an application, maybe SAP HANA on premises, that is mandated to meet a specific KPI because it's a validated application, or an Oracle database that has a mandate for a specific threshold, otherwise the vendor doesn't support it, then you can use static thresholds. So you have the flexibility of using dynamic or static. With that, we can start ingesting logs and more information into the platform. And the goal of all of this is to simplify the management, have less toil, and focus on outcomes. That's where we're going to win, you and us together. That's how we work with customers when they set those dynamic or static thresholds in their environments. Okay. Thanks so much. Awesome. All right. Well, with that, I think we will wrap things up. Thank you again to everybody for joining us today. Really appreciate you coming in and learning a little bit more about our NetApp block storage and Data Infrastructure Insights capabilities. So thank you all.
As your IT infrastructure evolves, navigating potential roadblocks and risks shouldn't impede progress but rather present opportunities for modernization. Discover how to overcome challenges by gaining comprehensive visibility and insights within you