Semion Mazor: Alright! Hi, everyone, and thank you for joining. We see that people are still entering the session, so we'll give it another minute for everybody to join the Zoom, and then we'll kick off our great session today. We have a really great story to tell you about a customer that is leveraging Amazon FSx for NetApp ONTAP for its SQL databases. So just a couple more moments to let everybody who wishes to join us today get in, and we'll kick it off.
Semion Mazor: Alright. So, hi, everyone. Thank you for joining us today. We have a great story to tell you about a customer that runs its SQL databases on AWS and leverages Amazon FSx for NetApp ONTAP for its deployment. But first, let's do some quick housekeeping before we go into the actual content. My name is Semion Mazor. I'm a product enablement specialist at NetApp, and I will be the host for today. Together with me is Rodrigo Gazzaneo, who is known as Gazza. Hi, Gazza! Can you please introduce yourself?
Rodrigo Gazzaneo: Hey, hi, Semion! Hi, everyone. Good morning, good afternoon, wherever you are. So I'm with Amazon storage as a specialist. I am a go-to-market specialist aligned with financial services. I've been at Amazon for 3 years in some different roles covering different territories, but focused on storage solutions in the cloud. Before that I had a career in both the United States and Latin America, focused on data storage solutions as well; I pushed a lot of boxes across data centers and ran cables, right? So I hope I can relate to you in those adventures. Happy to be here, happy to help share this great story from our customer that I will introduce next. I hope you feel welcome and comfortable asking questions and making this very productive for you.
Semion Mazor: Thank you, Gazza. And our special guest speaker today is Tim. Tim, how are you today?
Tim Gitchel: Doing good, thanks, Semion. Yeah, I'm Tim Gitchel. I've been in technology for about 30 years, in various roles from infrastructure to development to project management. Right now I'm a fractional CTO for a couple of startups and also doing consulting. All of this is in the financial services sector. For this project I was a principal consultant for AdvisorEngine.
Semion Mazor: Thank you. We're really glad to have you today to tell us the story of what AdvisorEngine did for its databases. This session is streamed live and also recorded, and it will be available to watch later for all our attendees and the people who registered for this session. If you have any questions, please use the Q&A panel and drop them there. We will be happy to address them during the session, or at the end, where we will have time for that. They can be on any level, on the business level or on the very technical level. To address your questions, we also have here with us today Pedro Fernandes. Pedro, would you like to introduce yourself?
Pedro Fernandes: Yes, sure. So my name is Pedro Fernandes. I'm a cloud solutions architect here at NetApp, and I'm focused on only one service, Amazon FSx for NetApp ONTAP. Today I will be answering your questions in the chat. So please don't miss anything, and if you have some doubts or some questions, just use the Q&A box. Thank you.
Semion Mazor: Cool. So what we have for today is a really quick intro, just to make sure we all align on what Amazon FSx for ONTAP is and what its value proposition for databases is. Then, for most of the session, we will have Tim tell us the story of AdvisorEngine: what challenges they faced, how they overcame them, and what the outcome was.
Semion Mazor: And towards the end we will have a sneak peek at a new, very cool automation tool for SQL deployment on AWS with FSx for ONTAP, which is called BlueXP workload factory. So, for those who are not familiar, Amazon FSx for NetApp ONTAP is a fully managed AWS storage service, utilizing NetApp ONTAP as a native service within AWS. It provides intelligent data infrastructure for any workload, and for databases in particular it has a lot to offer.
Rodrigo Gazzaneo: Let me make a brief comment here, Semion, right? So here on the Amazon side, Amazon FSx is our portfolio of fully managed storage, especially file solutions. I like to compare Amazon FSx, for those of you who are familiar with it, with Amazon RDS, right? What we do is that we take something that is proven, that is widely utilized, and we turn that into a piece of our portfolio through the Amazon console. So in the web console, or through all the familiar automation tools that Amazon makes available, you're going to be able to connect with a fully managed ONTAP experience: deploy volumes, create aggregates, create file systems, use block and file, NFS and SMB solutions. Right? So the idea is the best of both worlds: ONTAP, which you know and appreciate and whose capabilities you understand, and the AWS console for deep integration, ease of use, and connectivity with other managed services, like compute, managed databases, and managed analytics, that you may serve.
Semion Mazor: Well, thank you for that. There's a bunch of different workloads that FSx for ONTAP can be utilized for. Today we are obviously going to focus on databases and what it provides for databases.
It starts with high performance, to make sure that your database is getting the low latency that you need, at least from the storage perspective; and of course, on AWS you can do it from all the other perspectives as well. Then it provides powerful protection to make sure you can meet your RPO and RTO, no matter how strict they are. It also provides the ability to clone your databases, to create new database environments and perform database refreshes quickly and at near-zero additional cost. We will explain and elaborate on all of those. And of course it also helps you reduce cost by up to 50% with different technologies, like thin cloning, the option to use fewer EC2 cores, and different storage efficiency technologies. So what we're going to discuss today is the story of AdvisorEngine, which we will summarize before we delve into the full story. The challenge they faced was to streamline and scale the performance of their workload, and they looked for an option to optimize cost for over 1,000 SQL databases. They also wanted to improve the SaaS application's resilience and data protection. So they found FSx for ONTAP, and the outcome is that they improved database performance 7 times and reduced cost by 50% at the same time. Right? So: improved performance 7 times and reduced cost by 50%. They enhanced database protection, and they have faster database cloning for the staging environment. This is what we are going to talk about today. Almost all of our session is going to be dedicated to how this can be achieved, and Tim will take us through the details of that story. So without further ado, Tim, can you tell us a bit about AdvisorEngine: what they do, and what the situation was when you first encountered it?
Tim Gitchel: Sure. Yeah, AdvisorEngine is a wealth management company. It has several different products that they offer.
As you can see there, they have a CRM system, which is what we'll focus on today. They have portfolio management and client onboarding tools, and they partner with a lot of different integration applications to provide various other services. As for their clients, they have hundreds of advisory firms that are managing half a trillion dollars plus in assets on the system. The problem they were having was specifically with their CRM databases. They had a self-managed SQL Server setup, actually 2 FCI instances for production, with roughly a thousand databases split between the clusters. This is the original architecture when I first came in. They had nodes running in dual Availability Zones, and they had another region with another node for each cluster. What they were running into is that they had performance issues. During certain times of the day the latencies got to be unacceptable. They were having issues with the replication of the data to the DR side; it would often get out of sync or fall behind. They had to deal with a third-party Windows SANless cluster application vendor to be able to fix these issues. So they had a lot of pain points, and they could see that, as their business was growing and they were adding new firms, they were going to have to scale up this solution. If they didn't do anything to fix these issues, the next step would be to double their infrastructure. They'd have to add an additional cluster, which adds additional complexity, or they'd have to double the size of the EC2 instances, the EBS volumes, and all the things that were supporting this. So that was kind of where they were at.
And I came in and did a complete discovery of the system, and put in some monitoring and tracking tools to get a baseline. I ran things like HammerDB to get transactions-per-minute counts on the existing architecture, and then set about trying to come up with a better architecture that would perform better and possibly be more cost effective.
Semion Mazor: All right. So you're in this situation of growth of the service and the application on one hand, and the performance challenge on the other, and those two don't go together. And then you're trying to evaluate. So what did you see from the performance aspect?
Tim Gitchel: Yeah, when we dug into the performance issues, it became clear pretty quickly, based on the monitoring and the testing, that the file system was the main culprit behind the issues. It was obvious that it was a problem for the replication to the DR region, and the stability wasn't all that great. But even just the daily workloads in the production environment were suffering, because all of the load that was needed to keep the file system in sync was taxing the processing and disk channels on the primary nodes, and it was fighting the SQL Server engine for those resources. So here is a performance comparison that I did. I started in their QA environment, which was just a scaled-down version of production; it was configured exactly the same. On the left is a transactions-per-minute chart, and, as you can see, it's very peaky and inconsistent. As the resources were being pulled away to keep the file system in good standing, the SQL Server responses were dropping. And then I re-architected it to the new architecture, which we'll look at in a minute.
The one in the middle is the same infrastructure in terms of the EC2 sizes and types. But you can see that, by replacing the file system, the transactions per minute went way up, and it's just very steady. There are not all these peaks and valleys. And then on the far right is the actual production system, which is again scaled up: larger EC2 instances and slightly higher throughput on FSx for ONTAP. Those are the kind of values that you can see.
Semion Mazor: Okay, so... yeah, go ahead.
Rodrigo Gazzaneo: I had one question, Semion, and just one quick comment here. So, of course, there's a tremendous benefit in adopting FSx. My question is: how did you come across FSx? Did AdvisorEngine have any experience with it before? What were the alternatives that you were looking at?
Tim Gitchel: Yeah, that's a good question. This was a little over 2 years ago. It was 2 years ago in August that this was put into place, so it was a little bit before that when all this discovery was taking place. I researched lots of different options, and I came across a white paper on the AWS site that specifically mentioned using FSx for a SQL Server file system. It looked really promising, so it was one of the things that I tried. I built several different proofs of concept. I tried FSx for Windows. I tried getting the latest version of the Windows-based file system to see if it would fix the issues. I tried several different things, but FSx for ONTAP was clearly superior, and it was the least expensive at the same time. So it was kind of a no-brainer to make that choice.
Rodrigo Gazzaneo: Interesting. Let's take a look at the architecture, then. Or Semion, do you have anything else to add?
Semion Mazor: I think it's very cool to hear that you had no previous experience. There's the myth that FSx for ONTAP is only for those who already know ONTAP well, and in your case that wasn't so.
Tim Gitchel: Yeah, I had heard of NetApp, and I had a good overall impression of the company. But I did not even know that there was a managed solution on AWS at the time, and I had no personal experience with NetApp at all when I started this project. So it was very easy to pick up. The managed instance helped a lot, and the way that they expose the ability to do your own customizations also really helped in the final product.
Semion Mazor: All right.
Rodrigo Gazzaneo: Good point.
Semion Mazor: So can you lead us through the revised architecture, with FSx for ONTAP?
Tim Gitchel: Sure. Okay, so what I did is I kept 2 nodes in the primary region, in different Availability Zones, and I upped it to 2 nodes in the DR region. As you see in this diagram, they're inactive, because one of the things that's really key to this new architecture is that you're really separating the compute from the storage. You get this kind of separation of concerns, where you can optimize your compute layer and your storage layer independently, and they work together really well. I've got a Multi-AZ FSx for ONTAP cluster in both regions, and it's responsible for keeping all of the data in sync for high availability, and for SnapMirror for the DR protection in the other region. The nodes don't have to worry about any of that. They're just connecting to the cluster via a standard iSCSI interface. And I'm using FCI on SQL Server, primarily because we have so many databases.
Once you get into hundreds of databases per instance, if you don't use FCI, you start running out of worker threads, and you get all kinds of issues. But it's really a good solution anyway, even besides that, because it's very simple, it's well proven, and it works great. So what we did is set up just a simple single aggregate, with a single volume for the data and another volume for the logs. All of the tempdb and such is on ephemeral storage on the EC2 instances. That's another thing: since we were able to really tailor the EC2 instances we wanted to use and weren't concerned about how much I/O bandwidth they had, I could focus on something that had a lot of ephemeral storage and the right amount of memory. I could fine-tune all of that and not worry about getting stuck on a specific instance type that I didn't really want for those purposes. So now you've got the 2 active nodes for high availability, and then, if you fail over, there's scripting that brings up the nodes. Within a couple of minutes they come up, and they attach to the storage on the DR side once it's been quiesced and set to read-write. You can fail over seamlessly to DR in about 4 or 5 minutes and be up and running with no more than 5 minutes of data loss if you had a true DR event. On the high availability side, it fails over in less than a minute to the other AZ. And of course the ONTAP cluster has its own failover between the 2 Availability Zones. So if you actually lost an Availability Zone completely, everything would fail over. But if something just happened on the file system side, it could fail over independently; if something just happened on the compute side, it could fail over independently. So all that just works.
You don't have to worry about it, because you get this really good separation there.
Semion Mazor: Right, but...
Rodrigo Gazzaneo: No, I just find it fantastic how you keep mentioning that it's simple and that you don't have to worry. I love how many times you say you don't have to worry. But if you look behind the scenes, AdvisorEngine is using the full breadth of the AWS infrastructure underneath. Right? We have in-region high availability across AZs, both on primary and on secondary. If you translate that to an on-premises world, if you're familiar with that, that's like having MetroClusters, right? And this is all built into the service, transparent to you. And then you have the high availability architecture that you built on top with SQL Server leveraging this capability, which is the failover cluster. You use an AWS region again, you go to another region to have a quorum, right? So it keeps the entire architecture highly available, leveraging the power of this distributed infrastructure. It's so hard to replicate that on premises. How many buildings, how many data center facilities would you need? So that's another advantage of combining the power of Amazon, FSx, and NetApp, in my opinion. I do have a question, and I think maybe it's related to the timeframe in which you started the solution. You mentioned that it started a couple of years ago, and of course Amazon FSx for NetApp ONTAP is only 3 years old, so it's kind of a young service. You built your DR with Multi-AZ high availability as well. Right? We now have, in Amazon FSx, another option.
With a Single-AZ deployment, you have high availability, but the replication happens within a single AZ, right? Would you take a look at this if you were, for instance, cost constrained? Or were you under a requirement to have a DR that really had equal capability to production?
Tim Gitchel: Yeah. So it was definitely constrained by what was available at the time, of course. But we could see that there was going to be such a high cost savings, and then a cost avoidance from not having to scale up, I mean multiple 6 figures a year of savings, that I wasn't as concerned about that specific issue. If we did fail over, I really wanted the failover to be on par with the primary region. And actually, they're planning on doing a fail-over-and-stay type of scenario, where on a scheduled basis they fail over, make the other region primary for a period of time, and reverse the DR to the other region. So it's already well set up to handle that, because it's identical.
Rodrigo Gazzaneo: Tremendous. And I think it's also on point on the cost advantage that you perceived, right? It enabled you to build an even more resilient infrastructure and increase the service levels compared to what you had previously.
Tim Gitchel: Yeah. And you mentioned earlier that you don't have to worry about things, that things are easy. If you come in as a consultant, that's really top of mind, because you don't want to create this complex house of cards that requires all this hand-holding and worrying about things breaking or falling down. You want something that's really solid, so that you know you're not going to get called back and they're not going to have to spend a lot of money to maintain it. The original architecture was built by a full-time DBA that they had.
One of the other reasons is that the DBA left before this project started, and they have yet to hire another full-time DBA, because the system is just so rock solid that they haven't really had to worry about it. So that's a key factor. Alright.
Rodrigo Gazzaneo: Awesome.
Semion Mazor: I wanted to ask you about the graph you showed, where the performance was mostly impacted by the third-party replication. Can you touch upon the replication before and after?
Tim Gitchel: Yeah. With the replication, the main difference from a user, consumer, or DevOps perspective was the pain point with the DR replication. That was the biggest difference that was clearly visible. Before, the DR was set to, I think, 15 minutes, something like that, and it would routinely just fail and go down, and then you wouldn't have a DR. Or it would get behind: if they did some large batch updates to the data, it would stretch out and be behind 30 minutes or so. But with the ONTAP system, with SnapMirror, it's set to 5 minutes. So every 5 minutes there is an update to the DP volume in the downstream region, and it has never gone down, and it's never gotten behind. It's always just been rock solid, so there's nothing to worry about anymore. We know that if there is a true DR event, they're not losing any more than 5 minutes' worth of data, and they can switch over and be up in just a few minutes.
Rodrigo Gazzaneo: That's awesome. I think it leads to the question of the 50% cost reduction. Where is this cost reduction coming from?
Tim Gitchel: Yeah. The cost reduction is a very multifaceted thing. The biggest ongoing savings is the cost avoidance of not having to scale up, which is not even included in that reduction.
Tim Gitchel: Originally they had to have the DR nodes running 24/7, because they were responsible for keeping the data on that side in sync. Now those are normally off, so they're not incurring charges. Then there's the ability to scale the I/O and to over-provision the resources. You've got a relatively small aggregate that's serving 6 or 7 times the amount of data that it physically has, so the cost compared to the EBS volumes is significantly less. They also had some licensing things, where those secondary nodes were actually licensed AMIs. So they were paying per CPU on those DR boxes that were just sitting there replicating data. That's all they were doing, but they were paying for a full SQL Server Enterprise license on all of those. So that saved a lot. There are a lot of things that went into it, but ONTAP was a good portion of that just straight up, and it certainly was a key foundation that allowed a lot of these other things that added up to the 50% total.
Rodrigo Gazzaneo: I appreciate it, Tim, and I like the holistic view of those cost savings. I think sometimes we can look just at pricing in the cloud, because everything is metered, right? Everything is provided as a service tier. But the ability to look at the aggregated performance, the aggregated impact, and to think about a solution is really what matters in this situation. You call out the space efficiency, and the database use case is not always an easy one, right? But because you have a complex environment with DR and data protection, that's when you're able to realize it.
Rodrigo Gazzaneo: I see on the picture there that you illustrate a backup process as well. Can you describe that backup process a little, and how you implemented it?
Tim Gitchel: Yeah. The original system used native backups and went to a Windows file server, using large EBS volumes to store the backups. So I switched them to a third-party backup solution called LiteSpeed. It can back up the databases using its own proprietary compression and encryption, and it can stream directly to S3 buckets natively. So it makes it a lot simpler, it works really well, and it reduces the cost of the backup storage significantly, because you can set up policies so that, after a certain amount of time, it cleans up and moves things into Glacier and so on. So the storage cost of the backups went way down, even though we increased the amount of retention.
Rodrigo Gazzaneo: Awesome.
Tim Gitchel: So in this one here, once we had all that set up and working, we really started realizing what we had in terms of the file system and some of the features that we weren't really utilizing. It's like, well, hey, we can easily create a snapshot of the volume and create basically an ephemeral instance of the production server. Within a matter of 2 or 3 minutes, we can have an exact duplicate of everything up and running. So it's like, okay, how can we leverage that? The first thing that we did to leverage it was: let's do backups offline, instead of having to worry about the backup window, what other jobs are running, and what loads are happening there. We just put the backups over in the DR region. And so when the time for backups comes, these backup EC2 instances get started by an EventBridge schedule that kicks off a Lambda function that starts the instances. They have a bunch of scripting on them so that, when they start up, they self-provision a snap clone of the DP volume, make it read-write, connect to it by iSCSI, and do a few things to clean up, because the server name is different and things like that. But then, when they start up, it's just like production was a few minutes ago, and it actually runs the backups over there, so you have zero impact on the production side at all. You can do all your backups, all your integrity checks, and all those things over there offline. Then, when it's done, it automatically copies all the backup metadata back over to msdb and to the LiteSpeed repository. So it looks to the backup console like the backups actually happened on production, but they happened over here, offline. When it's done with that, it cleans itself up, drops the snap mirrors, and shuts down. So those EC2 instances are only running during the backup time window, and that's it.
Rodrigo Gazzaneo: And Tim, there's a great aspect of the backup and DR architectures that you designed, in my opinion, which is leveraging the on-demand nature of the cloud. When you take a backup the way you just described, you only instantiate the backup servers for the time necessary, and then you shut them down. So your ability to reduce costs with this intelligent approach to compute is a win in the cloud, right? Same with your DR. You were saying that, the moment you make your DR part of a storage workflow and no longer a compute workflow, your DR instances can be shut down, right?
So you don't need to have those instances turned on, because you're not using their processing power to keep getting the blocks and writing them to disk. You're doing this at FSx, and it translates, to me, into additional cost savings opportunities, because you're leveraging the on-demand nature of the cloud so intelligently.
Tim Gitchel: Thank you. But yeah, it's worked out really well. And it also allows you to do things that you may not have been able to do before. Like, we can do a full validity check on every database every night. We can do a full backup verification, so every single backup is verified every single night. It makes for a slightly longer backup window, but since you're completely out of band from production, it doesn't matter. You can get those extra things that you maybe wouldn't realistically be able to do without that kind of a pattern.
Rodrigo Gazzaneo: Out of band, on demand.
Tim Gitchel: Yeah, exactly.
Semion Mazor: I think what's also nice in this case is to see the different ways you can use FSx for ONTAP. This is not the traditional way, or the first way that we usually recommend, because FSx for ONTAP offers application-aware snapshots, which usually would be operated with SnapCenter. I think Pedro, who is here in the chat, invests a lot of his time helping customers operate it with SnapCenter, and so on. But in this case you wanted to do it in a different way to meet the goals, and you can do that as well. So it's also nice to see the different options you have, both in operating it and in how you achieve the goals that you want.
Tim Gitchel: Yeah, there are definitely lots of different options in terms of how you would configure a system like this, and obviously each has its pros and cons. But yeah, we are considering moving to SnapCenter.
It was over 2 years ago, and SnapCenter wasn't as rich and robust a solution as it is now, and it didn't necessarily have the knowledge base then. So it was kind of an easier decision not to use it at the time. But it does have a lot of advantages as well. As you mentioned, the application-aware snapshots make a big difference.
Semion Mazor: Alright. Is there anything else that we have to ask about the backup aspects?
Rodrigo Gazzaneo: I think there was a question here that Pedro and I were working on, about the details of the backup architecture, right? I think it was a little more about LiteSpeed, and we're happy to continue to support the answers here in the Q&A. But I think the point is that this architecture using LiteSpeed was your choice, an intelligent one, and very efficient. It integrates with SQL Server backup operations, so it enables the DBA to still look at the backups from a catalog. There are many other data protection solutions available on FSx; there is SnapCenter, which you mentioned, Tim, that you are considering right now. And to the audience: if you're curious about a particular data protection solution, you're welcome to continue to ask questions or extend the conversation. But I really appreciated the topic coming up here. It's super important when it comes to storage, right? It goes hand in hand with backups, data protection, and disaster recovery.
Tim Gitchel: Yeah. And LiteSpeed was actually used as the migration mechanism when we migrated from the old architecture to the new architecture. So: back everything up to S3 using LiteSpeed on the old architecture, restore everything using LiteSpeed on the new architecture, and cut over. That was the mechanism. So it helped with that, and that's obviously something you couldn't do with SnapCenter.
So Tim Gitchel: that's another reason. Tim Gitchel: Yeah. Rodrigo Gazzaneo: And it does create, you know, one beautiful, very familiar term that we hear now, the 3-2-1 kind of solution, right? With the copies, you know, 2 different media, 2 different sites. Right? So it does execute well on the recommended data protection architecture for all of us practitioners. Semion Mazor: So there's a couple of unique capabilities that are relevant for what we discussed so far, for example, being shared block storage across availability zones. Semion Mazor: And another one that I mentioned in the beginning is thin cloning, utilizing FlexClone. So can you tell us more about how you use this capability and how it helped AdvisorEngine? Tim Gitchel: Yeah, the FlexClone. You know, once we did the backups, Tim Gitchel: it's like, well, how else can we leverage this pattern? Because, you know, being able to Tim Gitchel: quickly get an exact duplicate of production that is offline, and you can do what you want with it, is pretty powerful. It's like, well, how else can we leverage this? It's like, well, you know, let's create a staging environment that we can use for release testing. And from a data Tim Gitchel: protection standpoint, if somebody Tim Gitchel: on the client side deletes data they shouldn't have, we can bring up this staging environment within Tim Gitchel: just a few minutes, as of Tim Gitchel: whatever time we want, and go in and get the data. So we created these staging Tim Gitchel: EC2 instances that work similarly to the backup. They get kicked off Tim Gitchel: early every Monday morning, and then when they come up, they provision a clone, a FlexClone, of Tim Gitchel: the volumes, and they come up as a copy of production. And then Tim Gitchel: the DevOps team and the developers can use that for whatever Tim Gitchel: you know, staging-type workloads that they need to do.
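The weekly staging lifecycle Tim outlines (instances come up early Monday and, as he goes on to say, shut themselves down late Friday) can be pictured with a small sketch. This is only an illustration, not AdvisorEngine's actual PowerShell automation: the function name `staging_should_run` and the exact start and stop hours are assumptions, since the session gives only "early Monday" and "late Friday".

```python
from datetime import datetime, time

# Hypothetical window boundaries; the session only says "early Monday" and "late Friday".
START_DAY, START_TIME = 0, time(5, 0)   # Monday 05:00 (assumed)
STOP_DAY, STOP_TIME = 4, time(22, 0)    # Friday 22:00 (assumed)


def staging_should_run(now: datetime) -> bool:
    """Return True if `now` falls inside the weekly staging window."""
    day, clock = now.weekday(), now.time()  # weekday(): Monday is 0, Sunday is 6
    if day == START_DAY:
        return clock >= START_TIME          # only after the Monday start time
    if day == STOP_DAY:
        return clock < STOP_TIME            # only before the Friday stop time
    return START_DAY < day < STOP_DAY       # Tuesday through Thursday: always on
```

A scheduler could call a check like this to decide whether to start or stop the staging instance, with an on-demand override handled separately.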
Tim Gitchel: Late Friday it shuts itself down and cleans everything up. Tim Gitchel: So it's not running over the weekend, although they can bring it up on demand as well. But it's not running when it's not being used, so it saves money, and it refreshes the data every week. Tim Gitchel: And they can refresh it on demand as well. But if they do nothing, it's just going to continually, every week, have a newly refreshed copy of production that they can do anything they want to and not worry about affecting production. Tim Gitchel: And it will always clean itself up so that you don't Tim Gitchel: have these long-running SnapMirror Tim Gitchel: block buffers that just grow and grow. Semion Mazor: How did it work with the original architecture? How was the process for the staging environment before? Tim Gitchel: Yeah, originally, they just didn't have Tim Gitchel: that capability Tim Gitchel: short of Tim Gitchel: actually restoring just specific targeted databases Tim Gitchel: to another server, which, you know, was usually running all the time, even if they weren't using it, and it took a long time to actually do the restores. And it just was not the same experience at all. So this allows them to do a lot more than they were able to do in the past. Semion Mazor: You know, one of the things, as I mentioned, is that it's a unique capability of ONTAP, and Semion Mazor: what's nice about it is that there's no actual data movement, just pointers, because it's snapshot based. So it doesn't add significant capacity and doesn't add significant cost on the one hand, and on the other hand, it also shortens the development cycle. So it's nice to see also how it fits into this architecture and actually helps in the Semion Mazor: development process, in the CI/CD pipeline. Tim Gitchel: Yeah. And the way that, whenever you bring up the FlexClone, it doesn't take any additional space until you actually start changing the data
Tim Gitchel: Is awesome, which is obviously why it's so fast. I mean, it's literally seconds over, you know, a terabyte of data. And it's just Tim Gitchel: it's amazing. It works really well. Rodrigo Gazzaneo: Oh, this is fantastic, Tim! You know, just for the audience here, do you have an idea about the Rodrigo Gazzaneo: size of the databases that you snap frequently? Rodrigo Gazzaneo: Are they in, like, a multi-terabyte range, or? Tim Gitchel: The databases range drastically, because it's a database per tenant, per client, Tim Gitchel: and some clients are pretty small and some are large, so Tim Gitchel: they vary up to, I think the largest ones are in the Tim Gitchel: 400 to 500 GB range for a single database. But across all the databases, each instance has a little under 2 TB Tim Gitchel: of total databases. Tim Gitchel: And I don't... Tim Gitchel: Since there's so many databases, it's really difficult, Tim Gitchel: unless you're really targeting a specific use case, to, like, Tim Gitchel: duplicate specific databases. That's why I simply take a snapshot of the whole volume, which has all the databases for that particular cluster, and I don't worry about it. If I don't need the databases, it's fine. They don't take up any space. I'm not changing the data. It just doesn't matter. Tim Gitchel: So for whatever databases I'm focused on and changing data in, it's adding a little bit of extra temporary block storage while that FlexClone is active, and then it goes away and cleans up. Rodrigo Gazzaneo: And it's true, you said that you don't have to worry about the databases that you're not effectively cloning, because that's exactly what happens with ONTAP. Right? If you're not changing a specific block, it's just a pointer operation, right? So it's Rodrigo Gazzaneo: not just a consideration. It's actually the technical capability of ONTAP that makes it viable. Tim Gitchel: Yeah, and the built-in deduplication and compression and encryption
Tim Gitchel: means that Tim Gitchel: the SQL Server engine doesn't have to worry about any of that. Tim Gitchel: You can turn off Always Encrypted. Tim Gitchel: Take that load off of the compute and off the concern of the SQL Server itself. Tim Gitchel: You don't have to worry about the actual size of the aggregate. Let the file system worry about that, with cloud alerts and things and automatic Tim Gitchel: growing as it needs to. But Tim Gitchel: as far as the SQL Server is concerned, you know, it's got Tim Gitchel: 2 or 3 TB of extra space, and it's never going to run out, so you don't have to way over-provision Tim Gitchel: that, and you don't have to mess with it. You just have to keep your aggregate scaled appropriately, which is easy to do through the alerts and autoscaling things. Semion Mazor: Tim, can you share a bit about the operational side of things, and not just for the staging and the cloning, but for the whole thing? Like how you deployed FSx for ONTAP, how you manage the process, and how the team has done it since then. Tim Gitchel: Originally, I just did it all Tim Gitchel: manually through the portal, Tim Gitchel: and then, Tim Gitchel: for certain things that I needed to do that weren't Tim Gitchel: originally included in the portal, I connected by SSH directly to ONTAP Tim Gitchel: and just used CLI commands to do things, and Tim Gitchel: doing all that is kind of how I learned what I needed to do for the scripting to make all this stuff more automated. So that's what the scripting does. It's actually a PowerShell script that runs on the instance when it comes up, Tim Gitchel: and it will SSH to the ONTAP Tim Gitchel: manager and run certain CLI commands to, like, you know, provision the SnapMirrors and change the igroups and things like that, which need to happen for it to be able to properly connect over iSCSI, and all the cleanup and things that happen. So that was all really easy to do, because, Tim Gitchel: like I mentioned earlier,
You know, it's not just a closed managed service. It's an open managed service, in that Tim Gitchel: all of the main things that you need to see and do are exposed through the portal. But anything else that may not be exposed through the portal you can still do, and you can get to it just by going directly into the file system. Tim Gitchel: Cool. Semion Mazor: And, kind of Semion Mazor: a question with a reason, but would it be useful for you if there were a tool that could do it all automatically? Tim Gitchel: Yeah, Tim Gitchel: that would be awesome. You know, if Tim Gitchel: what's available now had been available then, it would certainly have been a lot easier to get Tim Gitchel: to where we are now. But Tim Gitchel: yeah, it would have been awesome just to be able to have a lot of these patterns already Tim Gitchel: there, and already defined in a tool that can orchestrate all that for you. I mean, we've done some Ansible scripts and things like that to orchestrate some of the infrastructure stuff. Tim Gitchel: But having, you know, an outside managed service that can orchestrate that would be awesome. Semion Mazor: Alright. So I would like to share with our audience that today there is such a tool. It wasn't available when Tim did the deployment, but there is such a tool. It's called BlueXP workload factory, and I would like to take the next couple of minutes to demo it for you. Semion Mazor: It's a free-of-charge service available for you. Let me just Semion Mazor: change to a different screen. Yeah. Semion Mazor: And Semion Mazor: it's available at this link, console.workloads.netapp.com, and you can operate it without providing any credentials to start with. And then, as your trust grows, you can let workload factory do things on your behalf. Semion Mazor: The purpose of this service is to automate the operation of workloads on FSx for ONTAP and implement best practices automatically.
So if you don't have Tim working on it for you, you can use workload factory, Semion Mazor: and there are different workloads supported by workload factory. We, of course, today will focus on databases. So it starts with deployment options. And we don't talk here specifically just about the storage aspects, but about the whole stack of the Semion Mazor: workload, including the compute and the database itself. So there's a wizard to deploy Microsoft SQL Server. You can choose quick or advanced create, Semion Mazor: go through the different parameters, including the region and VPC, the application settings for the database itself, the connectivity, Semion Mazor: then the ONTAP information and a summary. You can go into more detail with the advanced create. And what's nice is that if I provide the service with credentials to my AWS account and click on create, it will just deploy it automatically, including all Semion Mazor: the relevant compute, databases, and storage. Another option is just to use the Codebox, which automatically reflects the different parameters that I set in the wizard. And then I can just copy and paste it into my existing workflows. Semion Mazor: So that's also an option, and you don't even need to provide any credentials to use it. You can also save the configuration to get back to it later. So this is Semion Mazor: how to deploy the database. Semion Mazor: Another thing that I can do, maybe as a first step, is to explore savings. So I can Semion Mazor: open, and, Semion Mazor: in the inventory that I have, see different details of the existing deployments, information like Semion Mazor: the hosts that I have and their distribution, and go into specific details of Semion Mazor: each instance, and then I can open and go to explore savings. It can be for a specific deployment that I have,
Semion Mazor: or I can just manually enter parameters to see how much it would cost on FSx for ONTAP compared to other storage services. Semion Mazor: And it will detail all the information for this deployment, all the details, all the breakdown, Semion Mazor: as transparently as it can. If FSx for ONTAP doesn't provide cost savings, it will show that as well. Semion Mazor: I can go into all the different assumptions and parameters that workload factory took into account. I can export it as a PDF to share with other stakeholders, or for myself. Semion Mazor: And it's a nice place to start exploring the cost savings. As we've already said, they can be significant. Semion Mazor: Another option is to create clones. As we said, here they are called sandboxes. So there's also a kind of nice dashboard to Semion Mazor: see the information about existing clones. And then there's also an option to Semion Mazor: create new environments. I just need to provide the database source, the target information, the mount options that I would like to choose, Semion Mazor: and add the tag. And also here, again, there's an option to use the Codebox to automate this process as well. Semion Mazor: So this is a very nice tool Semion Mazor: to make the operations aspect easier, Semion Mazor: and I just wanted to share it with Semion Mazor: the audience that we have today on the Semion Mazor: call. You can do it easily by going to console.workloads.netapp.com and trying it yourself. It's free of charge. Semion Mazor: So going back to our Semion Mazor: deck. Semion Mazor: I think, Tim, that Semion Mazor: your story is just Semion Mazor: one great example of how FSx for ONTAP can help Semion Mazor: when you're deploying MS SQL, in this case. But the same applies for other databases as well. Semion Mazor: And Gaza, would you like to add something? Rodrigo Gazzaneo: No, at this moment, you know, I think you talk about use cases, databases.
One use case that we see happening a lot, Rodrigo Gazzaneo: right? Like I mentioned, while ONTAP is established and mature, and has been on the market for decades, Amazon FSx for ONTAP is only 3 years old. Rodrigo Gazzaneo: And in Amazon, what we do, right, we have this concept about working backwards, meaning Rodrigo Gazzaneo: we work from what our customers require from us. So FSx has been a great example. Right? We have been offering storage in the cloud in, you know, different technologies, native technologies and some well-established and known technologies. Rodrigo Gazzaneo: ONTAP was a requirement. We saw more and more customers, you know, being successful with the enterprise-grade capabilities that ONTAP serves, wanting to have their equivalent in the cloud. So we were able to partner with NetApp and bring this solution together. The evolution continues, right, and it's based on the feedback we receive from our users. Rodrigo Gazzaneo: And there's a lot of innovation happening on both AWS and NetApp. Workload factory is a great example, right, of the innovation happening on ONTAP to try to break the barriers: hey, it can be complex, there are too many variables to change, right? The cloud is really powerful when it comes to tuning, but the good thing is that you can continuously improve your deployment. You're always improving as your workload evolves, as your requirements evolve. Rodrigo Gazzaneo: Right? You can always be Rodrigo Gazzaneo: changing and also adopting new capabilities that are launched little by little. Rodrigo Gazzaneo: And finally, you know, just to end this monologue here, I really want to thank Tim and the AdvisorEngine team for, you know, trusting AWS and our partnership with NetApp with the most critical digital asset that you have. Semion Mazor: Yeah, thank you for that, Gaza. Before we go to the Q&A part, I would like to point to a couple of available resources. So, as we said, workload factory is available to use.
And if you would like to have a call with an expert Semion Mazor: to continue the conversation about FSx for ONTAP, how it can help your deployment, and maybe get help with workload factory or clarify any other aspects related to your process, you're welcome to use the QR code on this slide. Semion Mazor: There are also a couple of other resources. There's a page for the AdvisorEngine customer story. It doesn't go into the same details we covered today, but there is a summary of the story, and also other resources for more information about Semion Mazor: FSx for ONTAP and databases. You're welcome to use those resources as well. Semion Mazor: And let's see if we have any questions that we would like to address, or, Tim, if you have something else that you would like to highlight and mention. Tim Gitchel: No, but if we have any questions that I can answer, I'd be happy to. Rodrigo Gazzaneo: Yeah, there are no Rodrigo Gazzaneo: yeah, there are no questions open on the Q&A queue. Pedro has been doing an amazing job. Rodrigo Gazzaneo: But you still have a few minutes Rodrigo Gazzaneo: to address anything that is coming. Rodrigo Gazzaneo: And while we are here, and we only have a couple of minutes left, let me do a shameless plug. So we're going to be back, AWS and NetApp, in 2 weeks, and we're going to have a similar webinar where we are also going to cover workload factory. But this time we're not going to be focused on databases with FSx. We're going to be talking about generative AI, Rodrigo Gazzaneo: and how the power of FSx can help. I'm going to ask this V to share the link for registration, if you're curious about the topic. Rodrigo Gazzaneo: But it's just a topic that has been so hot through the year that, you know, in many cases we get to focus on the use case and forget about what we truly need to be successful with AI, which is data, right? And this is what AWS and NetApp can do very well together. So if you're curious, just take a look at the page.
Rodrigo Gazzaneo: And if you have questions, you know, you can continue the conversation. Use the QR code and reach out to all of us to help you. Semion Mazor: I think one comment and question that always arises is, Semion Mazor: actually there's 2. One is about the self-managed option. So yeah, FSx for ONTAP offers all that for self-managed databases. And we're focused here today on SQL, and there's commonly a question about other databases. So with self-managed databases, you can install your own databases and choose whatever version of the database you prefer. Semion Mazor: And so the same Semion Mazor: benefits apply for any other database Semion Mazor: that your organization uses. Semion Mazor: There are no limits in this respect. Rodrigo Gazzaneo: Yeah, Oracle, Postgres. Right? If you need an engine and you're building your infrastructure on AWS, right, consider that the infrastructure is going to be fully managed in this scenario, and you're going to have the flexibility to operate and manage your platform. Rodrigo Gazzaneo: Right? And of course, you know, AWS being this large portfolio of cloud services, there are also fully managed database services, right, that some DBAs like to use. Rodrigo Gazzaneo: Right. But here, you know, we are talking specifically about the situation when we are deploying and managing our own databases, and there are many legitimate reasons why that is going to be your strategy moving forward, just like there are other reasons why some other organizations, or some workloads in your environment, might be interested in fully managed databases as well. Tim Gitchel: And I'm assuming you could also utilize ONTAP Tim Gitchel: on an on-prem basis and have that SnapMirrored into the cloud, so you could have DR in the cloud that's being facilitated by ONTAP, Tim Gitchel: and do some of these patterns not just between regions in the cloud, but between on-prem and the cloud. Rodrigo Gazzaneo: 100%. Semion Mazor: All right.
So we're also now a minute over our time. So I would like to thank you, Tim. Thank you for being with us today. Thank you for sharing the story, and thank you, Gaza, for your help here, and to Pedro for your help in the chat, and to you for running the session. Semion Mazor: And of course, thank you to all attendees for being with us today. Tim Gitchel: Thank you. Talk to you all later. Pedro Fernandes: Thank you so much, Semion, Gaza, Tim. Pedro Fernandes: Bye. Rodrigo Gazzaneo: Bye, everyone. Semion Mazor: Bye, everyone, thank you.
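The FlexClone behavior discussed in the session (a clone takes no additional space until data is changed, because unchanged blocks are only pointers to the parent) can be illustrated with a toy copy-on-write model. This is a conceptual sketch, not ONTAP's implementation or API: the `Volume` and `Clone` classes and the block layout are hypothetical, and the model assumes the parent does not change while the clone exists (a real clone is based on a snapshot).

```python
class Volume:
    """Toy volume: a mapping from block number to block contents."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)


class Clone:
    """Toy writable clone that shares parent blocks until they are overwritten."""

    def __init__(self, parent):
        self.parent = parent   # shared view of the parent's blocks, nothing is copied
        self.changed = {}      # only blocks written through the clone are stored here

    def read(self, n):
        # Prefer the clone's own version of the block, else fall back to the parent.
        return self.changed.get(n, self.parent.blocks[n])

    def write(self, n, data):
        # The write lands in the clone only; the parent block is left untouched.
        self.changed[n] = data

    def extra_blocks(self):
        """Number of blocks the clone consumes beyond the parent."""
        return len(self.changed)


production = Volume({i: f"data-{i}" for i in range(1000)})
staging = Clone(production)
staging.write(7, "masked")
```

In this model, creating `staging` costs nothing regardless of how many databases sit on the volume, and only the one overwritten block adds space, which mirrors Tim's point that untouched databases in a whole-volume clone "don't take up any space".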

How AdvisorEngine enhanced SQL Server operations

AdvisorEngine achieved its growth objectives and enhanced its database deployment by leveraging Amazon FSx for NetApp ONTAP to manage its MS SQL environment.

Amazon FSx for NetApp ONTAP