>> All right. Hello everyone, and welcome to our webinar, Six Steps for a Successful Data Migration. My name is Odet Bman. I'm a cloud evangelist from the NetApp Cloud Innovation Center in Tel Aviv. With me here today is Ben Casten. Hello, Ben.
>> Hi, Odet.
>> How are you?
>> Awesome. Thanks for having me.
>> And what do you do here at NetApp?
>> Well, other than being a global expansion manager, a global expansion manager for NetApp Data Sense, which is cool and awesome to talk about in this webinar today.
>> Indeed we are. So I'll start with a question for the audience: is there anyone here who was involved in a migration project and didn't experience a migration failure, a missed deadline, a budget overrun, or issues with the data after the migration was completed? Lack of insight into your data can do that to you. With that said, let's go over the agenda for this webinar. We'll talk about some of the challenges you might be facing in your migration projects and what actually makes them fail when they do fail. We'll go through six steps you can take to increase the chances of a successful migration project, and Ben will show you how we can assist with your migration challenges and run through a demo. We'll leave some time at the end to address your questions. The session is recorded and will be available on our website, probably by the end of the day tomorrow. If you have any questions, feel free to send them over using the questions pane in the GoToWebinar control panel.
With that said, let's get started. I'll cover the first couple of items on the agenda and then hand it over to you, Ben. Each migration process is somewhat unique, involving different kinds of workloads and data sets, different architectures, different tools, and so on, and so are the reasons for the failures we may encounter. However, when you trace back what led to those failures, the number one root cause is lack of insight into your data. Getting those insights is one of the biggest challenges for essentially every enterprise, and it becomes especially difficult when your data estate is distributed across different data sources, different storage environments, multiple regions, and maybe multiple clouds. Not knowing where the data that needs to be migrated is located, what portion of it can be migrated, or which business units will be affected leads to uninformed decisions that can end in a cascading, devastating series of failures. Without taking stock of the data to be migrated and fully understanding which teams and stakeholders will be affected, business operations may be disrupted, regulations may be violated if data is moved improperly or data that shouldn't be moved is moved, and application onboarding can slow down, leading at the end of the day to misalignment with the business goals.
Moving on. Lack of insight into the data, as we said, is really at the root of every migration process; it determines whether that migration is successful or not. Lack of insight also leads to poor planning, which in the long run causes issues such as improper budget scoping, missed migration deadlines, and so on. If you don't understand what is in your data, you cannot prepare it properly for when the time comes to migrate it to its new location. You may migrate irrelevant data, which extends the duration of the migration and adds cost, and if data that is not allowed to be moved, due to whatever regulations apply, is moved anyway, fines may be incurred as well. Without awareness of the data to be migrated, the migration tool or methodology you select is likely to be the wrong one, which once again leads to migration slowdowns, additional costs, and missed deadlines; in the worst cases it can even cause data loss, which is something we definitely don't want to happen. And finally, if we overlook security details in the data, we might accidentally grant unwanted access, causing breaches, or on the other hand revoke access, causing unwanted delays to the entire migration process.
So, taking all of these challenges, all of these points where a migration can fail, we came up with a six-step path: six items to be concerned with that, when followed, greatly increase the chances of a successful migration project. Knowing your data is the first and most vital phase of your migration process. As discussed earlier, this is the root of whether your migration will be successful, and everything that follows depends on it. So what does it mean to know your data? It means we begin by finding where all the relevant data is stored. That's not an easy task when operating at scale, where data is scattered across different storage systems, different databases, on premises, in the cloud, and so on. We also want to ensure we have full visibility across the entire estate where our data resides. Once the data has been located, we can start determining what needs to be moved and what can be moved. Moving everything we've discovered might be the easy path to start with, but we want to avoid the implications that creates in the long term. So identifying where the mission-critical data is, where personal and sensitive information is located, or really identifying any type of data based on whatever criteria matter to you, is important. Different types of data might require different treatment: some might need to move to special locations, and some might need to stay exactly where they are. This is another thing we need to pay attention to.
With that information in hand, we can start building our migration plan and migration timeline. The more accurate the insights we gather about our data, and the clearer and more readable they are, the higher the chances of a successful migration. Aligning your business is the next step. It's important to realize, if we are in IT, that migrating data is not just an IT thing. We'll probably be in charge of the migration, but there are business reasons that led to it, as well as all sorts of restrictions. To begin with, we want all the business leaders and all the relevant stakeholders involved and on board with the process. With the stakeholders on board and the insights we've gathered, we can determine our business goals, how the migration might affect business operations, and who needs to be informed when changes are made. The next step is to scope the shape of the project, and this is actually where most failures happen due to lack of knowledge about the data. With the insights gathered and the business goals determined in the previous steps, we can make a better estimate of time: the time the migration will take to complete, the time it will take to migrate the relevant workloads, and, of course, it allows us to better plan and set our deadlines. We will also have a clearer view of the costs the migration will incur and can set up a proper budget for it. The step that follows is getting your data ready, or what we also like to call rinse and cleanse. It's all about preparing and analyzing the data at this stage. Throughout all the phases of the migration project, as data is added or changed, we want automated mechanisms in place that constantly analyze our data and keep it rinsed and cleansed. And we want to be able to determine how that rinsing and cleansing is done: policies based on a wide range of criteria that automatically scan and classify our data, allowing us to separate the data set that is intended to be migrated from the rest, all without any impact on the business. Then, when it's time for the data to be moved, we can avoid moving sensitive data that isn't allowed to move, and we can exclude data that isn't relevant for the project, data that would slow down the transfer and could cause issues in the target environment. And one of the things we really like to do is identify all of that junk data: the duplicate data we probably have in our environment, and the inactive data that hasn't been accessed for two, three, four or more years. These are things we don't want in our migration data set.
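To make that last idea a bit more concrete, here is a minimal sketch of what such an automated rinse-and-cleanse policy boils down to once the data has been classified. It is purely illustrative: the attribute names (category, sensitivity, last_accessed, is_duplicate) are hypothetical stand-ins for whatever your classification tooling actually reports, and the thresholds are examples, not recommendations.

```python
# Illustrative sketch only: a toy "rinse and cleanse" policy over already-classified files.
# The attribute names and thresholds below are assumptions for the example.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class FileRecord:
    path: str
    category: str             # e.g. "finance", "hr", "personal"
    sensitivity: str          # e.g. "none", "personal", "sensitive-personal"
    last_accessed: datetime
    is_duplicate: bool

STALE_AFTER = timedelta(days=3 * 365)   # "stale" means whatever your business decides it means

def include_in_migration(f: FileRecord, now: datetime) -> bool:
    """Return True only if the file belongs in the migration data set."""
    if now - f.last_accessed > STALE_AFTER:
        return False                     # inactive data: archive or delete instead of migrating
    if f.is_duplicate:
        return False                     # keep one copy at the source, don't move the rest
    if f.category == "personal":
        return False                     # non-business data stays behind
    if f.sensitivity == "sensitive-personal":
        return False                     # e.g. data that regulation says must not leave its region
    return True

# Example: a recently used finance spreadsheet passes, so it stays in the migration set.
record = FileRecord("/shares/finance/q3.xlsx", "finance", "none",
                    datetime(2024, 1, 10), False)
print(include_in_migration(record, datetime(2024, 2, 1)))   # True
```

The point is simply that once the estate is classified, the migration set becomes the output of a handful of explicit, reviewable rules rather than guesswork.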
So now that our data set is ready (Ben will talk about Data Sense shortly and show how all of this can be done through it, and how it leads to a successful migration), it's time to move the data to its new environment. Migrating the data itself is one of the most critical and most challenging parts of almost any migration journey, and the success of the migration depends heavily on choosing the right migration solution, the right tool to do the heavy lifting of the data for you. When migrating the data there are many associated challenges to overcome and things to consider. How do you make the migration automated and efficient, so the process won't take longer than planned and we won't risk overrunning the budget or missing the deadline? How do you avoid downtime during the migration itself, so business operations are not affected and the business goals are met? How can you migrate from multiple data sources, when data is scattered across NFS servers, SMB servers, object storage, and so on? And, quite crucially, how can you guarantee that permissions, ACLs, metadata, and so on are migrated properly to the new environment? It's important to understand that not all of the tools available today can address these concerns, so choosing the right one is critical. The last thing I'll discuss before handing over to Ben is ensuring security: keeping your data protected, safely accessible, and under control, both during the migration and once the data lands in its new environment. This is another factor we need to take into consideration in our migration process. Data that exists in the original environment will have to be accessible at the destination, probably by the same applications, the same users, and the same teams, once the migration is completed. So it's important to determine at this phase how we're going to enable that: whether the permissions will be migrated as well, or whether this is something we'll configure once the data lands at its destination. It's also important to ensure that permissions aren't accidentally granted to others. We know that can happen, we've heard a lot of stories around it, and breaches can easily be the result, so careful attention is required here. And lastly, retaining Active Directory group membership should be considered as well, along with authentication methods and so on. To sum it up, there are a lot of challenges, a lot of points during the migration process where the migration could fail to some extent, and what leads to that in many cases, if you investigate and trace back the entire set of operations, is lack of visibility into your data. With that said, I'll hand it over to Ben, who will talk about Data Sense and walk us through a demo so we can see how we can help with your migration process.
>> Awesome.
>> Thank you, Odet. Indeed, our experience shows that understanding the data is the critical point that holds our customers back from being successful with a migration project, and once they utilize NetApp Data Sense, which is part of BlueXP, they're able to tackle many of the concerns and challenges you discussed up front. If you look at the challenges customers have, it's a little bit like this poor fellow trying to organize his office with the wrong solution: using a vacuum cleaner to create order here is simply the wrong tool. That's what we see many of our customers failing with in this process; they're not using the right solutions and the right processes to gain order and governance around their data. Before we talk specifically about successfully migrating to the cloud, I want to mention that Data Sense helps our customers not only with migrating data to the cloud but also with storage optimization on an ongoing basis. Like you said, rinse and cleanse; really it's rinse, cleanse, repeat. It's a repetitive action of sanitizing the data before a migration, during a migration, and post migration, to make sure the data stays properly sanitized. But not only that, security is also a key factor. Even if we manage to migrate everything properly: say, Odet, today you're in one division of the company, you're in HR, and you move to finance, but when you moved, your permissions remained open to HR, and suddenly you have access to files you do not need to have. That understanding of permissions, and always keeping to the minimum required permissions, is key to keeping the storage environment optimized. Another key aspect, both during migration and ongoing, is compliance. Imagine a customer that is an EU-based company and stores data about its customers from the EU. When moving to the cloud, regulation dictates that information about those customers must remain within a European region. While migrating data, how do you locate the files that contain personal information about an individual who resides in the EU? And how do you know where that file actually lands when it's migrated to the cloud, and that it remains within a region in the EU? All of these are challenges customers are faced with, and that's exactly where Data Sense comes in.
Okay. So, NetApp Data Sense, the star of the evening or the star of the day, depending on what part of the world you're in, is a solution that automatically maps, classifies, and categorizes data across any data source. As long as your file shares are accessible over SMB or NFS, you're good to go; you can use Data Sense with any storage vendor out there. We can also scan OneDrive, Google Drive, SharePoint, any of those applications, because these are also storage repositories today, as well as database servers and cloud storage. So imagine your entire data estate, stretching from file shares to OneDrive to Google Drive to cloud storage: we scan everything and give you a holistic view of the data you manage, and we can easily classify and categorize that data so you can take action in a very short time.
So, just to talk about our support matrix and what we can actually scan: any file storage, object storage, and databases, as you can see on the screen here. There's really no limit to what you can scan, whether it's SQL or NoSQL databases, and pretty much any file repository out there is scannable by Data Sense. Let's touch for a moment on how it actually works and why Data Sense is so unique in this area. First, deployment. Deploying Data Sense is super easy and simple; it can take anywhere from half an hour to maybe an hour or two. Really quick. It's agentless, so in order to access the data you do not have to install any agents: you just add a standard user to the file share, grant it access, and you can scan the data. We automatically upgrade and update, so the maintenance is very low; once it's connected, we push updates frequently and you get them automatically. We have a few deployment flavors. You can deploy it in the cloud, in your own VPC, which is under your control, so even when it's deployed in the cloud, none of your data is sent out to NetApp or any other party for analysis; everything remains within your environment. We have an on-prem installation flavor, which requires two VMs at a minimum, and we have a dark-site flavor: if you're a security company or a government agency that simply doesn't allow internet connectivity to your systems, we have a flavor for that too. So we really have all of those options to accommodate your needs.
As we mentioned, in order to scan you just need to grant access to the file shares, and Data Sense will automatically start scanning your data. The scanning is always incremental, which means the initial scan takes some time, because we have to crack open the files, analyze them, and close them; but once that's done, the scan is incremental. If a file in a specific share has been opened and modified, we rescan it to capture what has changed, and so on. That way we're very accurate, very nearly real time, and we put very little load on your environment (there's a rough sketch of this incremental idea just below). Now let's look at the insights. The insights we provide are out of the box: you do not have to ask Data Sense to do anything. Once you scan, you'll start seeing results almost immediately. You do not have to teach Data Sense anything or configure anything. However, we do offer the opportunity to customize Data Sense: you can customize categories, customize what you're searching for, and add your own custom regex if you'd like. So there is the out-of-the-box layer, and then an additional layer you can build on top of it that gives you additional depth. And to close the loop, we also allow you to take action from within Data Sense. For example, if you find stale files you want to delete, you can do that through Data Sense; if you find files you'd like to move to a different storage repository, that's also done from Data Sense. You do not have to leave the BlueXP dashboard and the cloud experience in order to do so. Everything happens within Data Sense.
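As a rough illustration of the incremental approach Ben describes (and only an illustration, not Data Sense's actual implementation), the core idea is to remember each file's modification time from the previous pass and only "crack open" files that have changed since then:

```python
# Illustrative sketch only: re-scan a share incrementally by modification time.
# This is a generic pattern, not Data Sense's internal code.
import os

last_seen: dict[str, float] = {}    # path -> mtime recorded on the previous pass

def classify(path: str) -> None:
    # Placeholder for the expensive step: extract text, look for identifiers, etc.
    print("classifying", path)

def scan_share(root: str) -> None:
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime
            if last_seen.get(path) == mtime:
                continue                # unchanged since the last pass, skip the expensive part
            classify(path)              # only changed (or new) files are opened again
            last_seen[path] = mtime

scan_share("/mnt/demo-share")           # first pass touches everything; later passes are cheap
```

The first pass is the costly one; every pass after that only pays for what actually changed, which is why the ongoing load stays low.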
>> Let's talk about some of the use cases we can address through Data Sense, given all of these capabilities it provides.
>> Yeah, that's a cool one.
>> Okay, so we talked about helping customers migrate to the cloud; that's the essence of our webinar today. Look at the left-hand side: we have the enterprise IT environment, which is on-prem. You have files on NFS, you have SMB, you have some SharePoint and OneDrive; that's where your data resides right now. These are the steps that take you to the cloud. First and foremost, you scan your data to understand what your data estate looks like, and like you mentioned, Odet, planning is the baseline for everything: getting the baseline of what you have. Imagine you have a migration project coming. Wouldn't it be great to know in advance what data you have, what files you have, what is stale, and what is non-business-related data? You would know that up front and be able to plan accordingly: create a budget, size your landing zone, and know how much data you need to move and how long it will take. Everything suddenly becomes clear, because you get all of that from Data Sense so easily and in a very short time. Now let's talk about what we typically see in the investigation when we migrate the data. When we look at the investigation process, I would advise taking it step by step. First of all, let's get rid of all the stale data. Stale data means different things to different customers, and Data Sense lets you search for stale data according to your own definition: for one customer it's files that haven't been modified or accessed for seven years, for others it's five, or three. It really doesn't matter, because Data Sense has that flexibility. In most cases we see that up to 50% of the data customers store is stale; they don't even need to store it, or at worst they could archive it, but 50% is stale. Another 20% are duplicate files. You all know how it goes: we receive an email with an attachment, we download it to our OneDrive share, and it just sits there. A lot of duplicate data. You may want to keep one copy, because it's important to have it, but you don't need all the rest. And then there's non-business-related data, personal stuff people download, and that's another 10% or so. So what we find is that nearly 80% of the data that resides on enterprise IT file shares today isn't even relevant for the migration project. How awesome would it be to dismiss 80% of the data up front and just deal with the 20% you actually have to move? You can do that within minutes in Data Sense. After you've found the 20% of data you really want to move, you can start digging deeper into it: automatically finding PII and sensitive data in those files and classifying them into categories, so you know where these files need to live. If they're finance-related files, they need a new home on the finance file shares in the cloud. That really lets you put your emphasis on the 20% that you actually need, and you can do it easily with Data Sense. Taking action is the next phase: you can delete your stale data or move it, say, to a bucket, delete the non-business-related data, and delete the duplicates.
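As an aside on the duplicates Ben mentions: a common, generic way to find them is to hash file contents and group paths that share a digest, then keep one copy per group. A minimal sketch of that idea (not a claim about how Data Sense does it internally) looks like this:

```python
# Illustrative sketch only: group files by a content hash to find duplicate candidates.
import hashlib
import os
from collections import defaultdict

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents in chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            digest.update(block)
    return digest.hexdigest()

def find_duplicates(root: str) -> dict[str, list[str]]:
    by_digest: defaultdict[str, list[str]] = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            by_digest[sha256_of(path)].append(path)
    # Only digests seen more than once matter: one copy stays, the rest are candidates to drop.
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}

for digest, paths in find_duplicates("/mnt/demo-share").items():
    print(f"{len(paths)} copies of {digest[:12]}...: keep {paths[0]}, review {paths[1:]}")
```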
You can take those actions directly, or you can create policies, because a migration project isn't just a flip of a switch; it takes time. So you create a policy through Data Sense that says, please identify non-business-related files, or please identify duplicates, and throughout the process you can make sure you're staying on top of the data movement. Then you create a clean copy to move your data. That means that after you've archived and moved everything and assembled the clean data set you want to move to the cloud, you create a clean copy that can then be migrated easily to the cloud using the native Cloud Sync solution from BlueXP, which can move your data from essentially any source to any target. Imagine you've searched with Data Sense and found, say, 10,000 files on a specific SMB share that you want to move to S3: just use Cloud Sync and it will do it quickly and efficiently. Within a very short time Cloud Sync migrates those files and puts them in their new home, for example on Azure. So that is, in essence, how we see the migration process. In a nutshell, a migration project becomes really easy and almost effortless with Data Sense.
>> Okay. Want to jump into the demo?
>> Well, yeah, let's do it.
>> All right. Let me hand it over and change the presenter. All right, we're good to go.
>> Okay. Hopefully everyone can see my screen.
>> Yes.
>> Okay. So this is the NetApp BlueXP dashboard, where Data Sense is one of the services, and we have a couple of dashboards, a couple of tabs, within Data Sense. The Governance and Compliance tabs are the dashboards where you get the high-level overview of the data. The Configuration tab is where you add the different data sources; for example, if you want to add file shares, OneDrive, or S3 buckets, you do it here. It's super easy. You can integrate with labeling and connect Active Directory, and this is also where you tell Data Sense what to scan, because connecting a data source with a user doesn't automatically start scanning. It connects all your volumes, and here in the Configuration tab you say, okay, please just map this specific volume. When we say map, we're looking only at the metadata of those files; when we say map and classify, that's when we crack open the file, into the Data Sense cache, and look for all the sensitive information, all the information contained within the file. I want to reiterate, for those who maybe joined late or missed it, that when we scan the data we do not send it anywhere outside of your environment. It stays in your VPC, the data doesn't get shipped out anywhere, nobody sees it, and it remains under your control. That being said, let's look at the Governance and Compliance dashboards, and after that we'll take a quick look at the Investigation tab, so I can show you how easy it is to slice and dice the data. You don't need to be a data analyst or anything of that sort; it's really simple, you can do it at your leisure, and you'll see how cool it is. So in this demo environment we have about 500 gigabytes of data, and in this case we found 102 gigabytes of stale data.
In this case, we've configured stale to mean last modified three years ago or more, but as you'll see in a moment, we can change that very easily. There are 17 gigabytes of non-business-related data and 92 gigabytes of duplicates, which is quite a lot. Imagine going into a migration project and right away seeing that nearly 50% of your data doesn't even need to be dealt with. Moving on, we look at the different data repositories we have and their sensitivity levels. For example, we have 140,000 items here with personal identifiers, and there are sensitive personal files as well. But look at the file shares: look how many files on the file shares contain sensitive personal data. You need to know about this when you're going into a migration project, because it could be really problematic. Moving on to what I'd call the sensitivity heat map: we have the number of files down here, restrictive versus permissive access along one axis, and the sensitivity level of the files along the other. Obviously, if there are files up here that are both permissive and contain sensitive information, that's where I would put my emphasis. But not only that: you can see right here that 77% of the files are open to the entire organization, which begs the question, is that really necessary? Maybe we could restrict it a bit. Okay, moving on to age of data. We have modified, accessed, created, and last accessed, so what you see here is a timeline and the number of files in each part of that timeline. We can see, for example, that we have quite a few files that haven't been modified in three years or more. Maybe we don't even need to migrate them; maybe we could just move them to an archive. How cool would that be, to dismiss all these files in advance? And there's another slice and dice by size. Now let's talk about file classification, because this is where the brains of Data Sense come in, and again this is done automatically, out of the box; you just need to connect your data sources and it happens for you. We scan all of your data sources and automatically create what I call virtual buckets according to the topic of each file. We place all of your files into these virtual buckets, for example a bucket that contains all of the files related to HR health, or all of the legal NDAs, no matter where they physically live. That's quite useful for our customers, because it already lets you start creating some order. We also look at personal identifiers and sensitive personal identifiers. Personal identifiers can be anything from credit card numbers, passport numbers, and driver's licenses to national IDs, and all of these are identified out of the box. But we also allow you to create what we call Data Fusion identifiers, identifiers that are unique to your organization. You can also create custom regex searches, and you can create custom categories for your files. So we provide the first layer of capabilities out of the box, and for the next layer you can easily, through the Classification Settings tab right here, create your own custom searches through regex or whatever you may need, to add that next layer of depth to your data search.
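To give a feel for what such a custom search amounts to, here is a tiny, self-contained example of a regex identifier with masked output, the way a classification report might display matches. The pattern (a made-up employee ID of the form EMP-123456) and the masking rule are assumptions for the example, not Data Sense syntax:

```python
# Illustrative sketch only: a custom regex identifier with anonymized (masked) display.
# The EMP-nnnnnn pattern is hypothetical; swap in whatever format matters to your organization.
import re

EMPLOYEE_ID = re.compile(r"\bEMP-\d{6}\b")

def mask(value: str, keep: int = 4) -> str:
    """Hide everything except the last few characters, e.g. EMP-123456 -> ******3456."""
    return "*" * (len(value) - keep) + value[-keep:]

text = "Contract signed by EMP-104233 and countersigned by EMP-887310."
for match in EMPLOYEE_ID.finditer(text):
    print("custom identifier found:", mask(match.group()))
# custom identifier found: ******4233
# custom identifier found: ******7310
```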
So those were the dashboards, which give us a high-level view of what we have. But let's look at the Investigation tab, because this is where Data Sense really shines. Up until now, customers who wanted to do this across their entire estate either had no way to do it or used tools that were very limited: tools that only looked at file shares, separate tools that looked at cloud storage, another tool that looked at databases, and none of them looked at PII and metadata together. Up until today there was no way to look at and investigate your data holistically. So let's take a few examples through the Investigation tab and try to understand how you can do this. We use this very simple but very powerful filtering capability that allows us to investigate the data. Let's make this interactive.
>> All right.
>> What do customers usually look for first when they want to migrate data? What parameters would you want to look for at the beginning?
>> Let's maybe start with sensitivity: personal data, sensitive information.
>> Sure. So here you can see the different filters, and we can look at files by sensitivity level: personal, non-sensitive, or sensitive personal. But we can go even deeper, which is what I'm going to do. Let's look at personal data, and let's look at files that contain, for example, credit card numbers. We want to find all of the files that contain credit card numbers. We had nearly half a million files in this demo environment, and when you click this credit card filter, we're immediately down to 449 files that contain credit card numbers. First of all, you may want to delete these or at least take a look, but right away it reduces the risk of migrating files with personal data in them. You can see here that all of these are on S3, which is fine, but you can also filter by storage repository. So you say, okay, I don't want to look at S3, I want to look at a specific...
>> SMB server?
>> Okay, yeah, a file share. Let's see if there's an SMB server here. Exactly. You can filter by working environment, or really slice and dice it in different ways: by repository, by working environment, and so on. In this demo environment, let's look at SharePoint, for example. One item on this SharePoint. So we went from half a million files down to one single file on this share that contains a credit card number. Finding that one file is literally finding a needle in the haystack. Okay, so we found this one file, this needle in the haystack; let's have a look at what we found. It's a docx file. We see all of the metadata about the file: when it was modified, when it was accessed, all of that. We also see that there are no open permissions, so at least that's okay, we're not sharing this. There are no duplicates, which is also a good sign.
But we see that there is some personal data in here. Obviously we were looking for the credit card, but we also found email addresses, so not only credit card numbers but email addresses too. I'd also like to show you that we anonymize the data displayed in the Data Sense search engine: credit card numbers and email addresses are shown masked. That's another layer of security you get when you use Data Sense. Not only does the data never leave your premises, your environment, and your control, but any personal data we do find is anonymized in the display. And wait a second, this file contains more: it contains sensitive personal information, not only credit cards. You found a good one; we found medical codes in this file. Have a look. I'm guessing this file is not a good idea to move somewhere less secure; maybe we should leave it on prem. What do you think?
>> Yeah, I think you're right.
>> Okay, so let's leave this file on prem; it contains too much information to move to the cloud. Now, without a solution like Data Sense, how long would it take you to find this data, if you could find it at all? That's a question for you, our audience and our viewers. This is the power of Data Sense: how easy it is to find this needle in the haystack. But let's look at another filter we have through this very simple filtering process. For example, let's look at files that have not been accessed for over three years; we're down to 497 files. Say my organization doesn't care about retaining files that haven't been accessed for over three years: we would just discard these files right away. We also give you the opportunity to create a custom search. For last accessed you can choose older than, or not accessed in the last, let's say, two years, and submit. Here are all the files that haven't been accessed in two years, right away. Now let's add another filter on top of that and look at all the duplicate files that have not been accessed in the last two years: we just check duplicates. 54,000 files have duplicates and have not been accessed in the last two years. That's roughly 10% of our data; remember, this demo environment had nearly half a million files. Why keep these? They haven't been accessed and they're duplicates, so just delete them. And that takes me to the next capability of Data Sense. Let's select all of these 54,000 files. The button is grayed out here because it would actually delete the files from the demo environment, but normally you could just click the delete button and they're gone. This is really easy: you can delete these files and simply remove them. But maybe you don't want to delete them right away; a best practice would be not to delete them immediately. Maybe move them to a quarantine share first, just before you delete them. You can move or copy these files easily through the NetApp native solutions within BlueXP, move them elsewhere, get them out of the migration project, and look at them at a later time to make sure they really can be deleted.
So you can use Cloud Sync to move these files to a different location and take care of them afterwards; you don't have to worry about them during your migration process. It's one less thing to worry about.
>> And you could use the same tools to perform the migration itself, right?
>> Exactly. You can use the same tools to clean up your environment, and once the environment is clean, later on do the migration as well through those solutions. That's the power of Data Sense: an easy way to research your data and then actually take action on it, with the full scope and breadth of the technologies. You don't have to exit Data Sense or BlueXP at all; you create all of these workflows through Data Sense itself. So we really give you the power to complete the migration end to end. That was just a short presentation of the power of Data Sense. I hope it was useful. Any questions from the crowd?
>> Yeah, there are tons of them. So let me do this, let me grab back the screen. All right. I saw some questions regarding the deck: yes, we will share the deck in PDF format; it will be on the website tomorrow along with the recording. One thing to note: within that deck you'll find a few resources that might come in handy if you want further information on Data Sense. There's a link to the Data Sense (BlueXP classification) web page, a link to a new ebook that was just released covering all the angles we've discussed, called "Why Data Migration Projects Fail," and a link to the technical documentation for anyone who wants to get started and learn how to deploy Data Sense, use the Investigation tab, configure custom PIIs, and so on. So let's do that. Ben, if you open your control panel, there's a section called Questions; if you click the arrow there, it pops the screen out and you can expand it.
>> Okay, you know what, let's take a look. So first of all, thank you everybody for sending in those questions. Let me just clear these. All right, let's look at this question here: how does Data Sense know that data is sensitive? What parameters does it check to flag data as sensitive?
>> Cool. So first of all, the way it works is that our NLP technology scans the documents and recognizes sensitive data. Sensitive data is defined by regulators: under GDPR and other regulations, certain data is deemed sensitive by the regulator, and that's how we treat it. Our technology knows how to quickly scan the data and flag it accordingly: personal data, which could be driver's licenses, IP addresses, or email addresses, and sensitive personal data, which again is data the regulations define as such. And by the way, if some data is sensitive to your specific organization, you can create your own custom searches to find that sensitive data as well.
>> Yeah, I see that a couple of people asked the same question. All right, I'll remove that one. Let's move on to the next question.
Will the search include special characters and accents, for instance Russian or Japanese or Chinese languages?
>> Yes, it does.
>> Are there any particular parameters?
>> There are, but we can refer you to our documentation for the details.
>> Okay. When you see duplicates, if they are located in different locations or sources, can you choose which duplicate to move or remove?
>> Yes, you can. That's a great question. We actually show all of the duplicates and their sources, and you can decide which duplicate you want to keep and which you would like to delete. And by the way, that really is the power of Data Sense, because we find duplicates across multiple environments...
>> Across the entire estate.
>> Exactly. Another question, and I think you touched on this in one of the slides: how will Data Sense help migrate other OEM data, like HP 3PAR, to NetApp on premises?
>> Okay, that's actually a great question, because it doesn't matter who the vendor is; it could be HP or anyone else. Once it's connected through SMB or NFS, we can scan, map, and analyze that data, and we can also migrate it with Cloud Sync from those data sources to the new location.
>> And if your source and destination systems are both NetApp, ONTAP-based systems, could you leverage SnapMirror?
>> Yes, exactly. If it's NetApp to NetApp, you could leverage SnapMirror. You could also just go NFS to NFS, but Cloud Sync would be more efficient; and NetApp to NetApp, SnapMirror works.
>> Okay, let's look at this one; I'm not sure I can see the whole question. As part of an on-prem to cloud migration, does Data Sense give some kind of guidance on where to move the data, for example move one data set to CVO and another data set to ANF or S3, and so on?
>> That really is up to you as a customer to decide for the files that Data Sense has classified. You could say, well, I'm a finance company and I need all my Excel sheets that I work on daily to be on ANF, but the HR documents, say the employee contracts that I just need to keep on file and not access every day, those could go to less expensive storage. It really depends on what your organization is, what you need, and where you need it to be; but the classification helps you make those decisions.
>> All right. Another question here: what is the input for Data Sense? Do we point Data Sense at our on-premises NetApp, or any other storage repository, and let it scan, or do we need to perform data collection before we ingest the data into Data Sense?
>> You do not need to do that. You do not need to ingest any data. The only thing you need to do, and I showed that at the beginning, is create a user on your system and let Data Sense access your data with that user. Basically, you enter the credentials, hand that user to Data Sense, and that's it.
>> All right. How long does it take to implement Data Sense in an on-premises environment without internet access?
>> It's pretty quick. The challenge with environments without internet access is that we can't access and look at them live, but that's really the only drawback. It's really quick.
Literally, you need two virtual machines: one virtual machine for BlueXP and a second virtual machine for Data Sense. Once those virtual machines are there, we'll send you the link, you download a package, deploy it, and that's it.
>> Do you need admin credentials?
>> No. Simple as that. No.
>> Okay. I see there are a few questions here around pricing: how Data Sense is purchased, whether it's a subscription, whether it's priced based on capacity, and so on.
>> Okay, let me take all the pricing questions and answer them together. Data Sense is priced by scanned capacity, so you purchase Data Sense for the amount of data you would like to scan, and there are a few options for buying it. The easiest option is to buy it through your preferred hyperscaler: you go into the Azure, AWS, or Google marketplace, purchase Data Sense through the hyperscaler's marketplace, and activate it there; super easy. And the beauty is that it's charged per capacity: the price is $50 per terabyte per month. When you subscribe through the marketplace, you use Data Sense for as long as you need it; if you don't need it anymore, just switch it off, like any other marketplace solution out there. Another option, if you prefer, is to buy it through your NetApp rep. So all the flavors are out there, but the pricing is $50 per scanned terabyte per month. Basically, it's straightforward.
>> I see that more questions keep flowing in.
>> Do you have more time?
>> Well, yeah, for our customers, always.
>> Okay. We only scheduled an hour for this event, so for anybody still here, thank you for joining, and we'll stay and try to address the questions that are coming in. Is there a free trial for Data Sense?
>> Yes.
>> So how does that work?
>> Super simple. Just search for NetApp Data Sense in your favorite search engine; it will take you to the Data Sense web page, and through that page you have one terabyte free of charge to try out Data Sense. It will take you to the page where you register with your hyperscaler and try it out. It's free.
>> All right, good to know. Due to confidentiality, there are certain files and folders that administrators or domain admins are not permitted to access. Will these be bypassed?
>> No, it won't bypass permissions: if the scanning user doesn't have access,
then obviously those files won't be scanned. By the way, a question that often comes up alongside this is encrypted files: what do we do with encrypted files? That's the other side of the same question. Obviously we can't open and read them, but we will categorize them as encrypted.
>> Yeah, there's actually a question here: does Data Sense decrypt files?
>> Exactly, I knew that question was coming, so I answered it right away: no, we don't. But at least you'll know about them. Remember, we're talking about migrations, and even if you can't read them, you still know, hey, here is the bunch of files that are encrypted and were not scanned and analyzed by Data Sense; maybe move them to a separate location so you can investigate them further. By the way, this is a sneak peek at a ransomware use case: what happens if the encryption rate in a certain file share suddenly goes up?
>> It's a great indication of ransomware.
>> But that's for the next episode. Next: can Data Sense be purchased as a data classification tool that partners can take to clients to analyze data?
>> Absolutely. We work with many partners that actually use Data Sense to benefit their customers. Absolutely, talk to us and we will help you set that up and use Data Sense with your clients.
>> There's a great question here from a service provider: "We are a service provider; how can Data Sense be offered to our end customers?"
>> It really depends on the mechanism you would like to use to sell it. You could buy Data Sense as a partner and resell it to your customers together with added-value services, such as your consulting services or your data migration services, or your customer could be the owner of Data Sense and you could assist them in the migration process. Many flavors are out there, and it's all about helping our customers, because we see here at NetApp how much our customers struggle to hit their migration goals, and any way we can help is important. Just talk to us and we'll help your customers achieve their goals. Okay, let me see. All right, there's a question here; I think you addressed this already, but it's good to repeat it.
>> Okay.
>> Can you define patterns or regex to match? We have sensitive data to migrate to Azure storage from on-premises NetApp, such as citizen IDs, account numbers, and driver's license numbers in specific formats in different countries.
>> Yes, the answer is yes. First of all, driver's licenses and IDs for the majority of countries are already out of the box in Data Sense; you don't have to configure them. In the event you do need to configure something, it's totally possible through the Classification Settings tab in Data Sense. Yes, it's there, we'll help you do it, and it's super easy and very powerful.
>> All right, let's see. Okay. Because users are able to create subfolders and save files with very long file names, over 256 characters, will these still be moved?
>> I'm not sure about the move part, if I understood the question correctly.
>> When we move files whose names are longer than 256 characters, does the file name persist, with however many characters it had, on the destination?
>> I've never encountered that, sorry.
>> Yeah, I'm not familiar with many issues around that from back in the day either.
>> But remember, the key aspect is finding the files with the problem in the first place.
>> Yeah.
>> Another question here: when the move schedule executes and experiences a failure, will it stop, or continue to the next file?
>> It will continue. It just skips the file and reports back, hey, this one failed.
>> The list of questions keeps on growing. Let me clear the ones we've answered; if we missed anything, we'll get back to you in the next few days.
>> Sure.
>> Hmm, something's not working here, I'm not able to delete these questions. Okay. I see a few questions about the recording: yes, the recording will be shared. You should receive an email, I guess by tomorrow, with a direct link to it. In any case, it's going to be available on the website, on bluexp.netapp.com under events, and it will include a link to the slide deck we've used.
>> Yeah, and there's a question about how long it takes: how long will it take to scan some number of terabytes of data? The answer depends on what you want to do and what your data consists of. Scanning 100 terabytes of MP3 files will take much less time than 100 terabytes of text files, so first of all it really depends on what the files contain. But let's say it's 100 terabytes of text. Remember, we talked about map versus map and classify: mapping the data takes a matter of a few hours. If you want to map and classify, our rule of thumb is 15 to 30 terabytes a day per collector. And remember, we mentioned that you can have several collectors; so if you want to scan 100 terabytes in about a day, I would advise having three or four collectors, three or four VMs or machines in the cloud. It really depends on the type of data being scanned and on how many collectors you deploy, so these are variables you control, and you can estimate up front how quick or slow it will be.
>> Okay, this is the last question on my panel. NetApp ONTAP clusters already provide cold data reports, the inactive data reporting feature. What's the difference with the cold data information Data Sense provides, which I think you showed on the Governance tab?
>> Cold data is only one aspect of what Data Sense does. Cold data on NetApp is one thing, but Data Sense shows cold data across your entire environment, which could be NetApp and non-NetApp storage, OneDrive, Google Drive, SharePoint, and cloud storage such as S3 buckets, which is something ONTAP alone can't do. Our power is being vendor agnostic, cross-platform, and able to combine multiple parameters: not only cold data, but, for example, cold data that contains personal information.
>> Cool. While you were talking, another question came in, so let's take that last one: will the load be distributed automatically between collectors deployed in the same working environment?
>> The load? Yes, exactly. It's a scale-out architecture, where you can deploy as many collectors as you wish; the load is distributed across them accordingly, and the scan time is shortened.
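A quick back-of-the-envelope version of the sizing guidance from the scan-time answer above, using the 15 to 30 terabytes per day per collector rule of thumb. The figures are the webinar's; the helper function is only for illustration:

```python
# Illustrative sizing sketch using the rule-of-thumb figures quoted above.
import math

def collectors_needed(total_tb: float, days: float, tb_per_day_per_collector: float) -> int:
    """How many collectors to map and classify total_tb within the given number of days."""
    return math.ceil(total_tb / (tb_per_day_per_collector * days))

# 100 TB of text-heavy data in roughly one day, at the faster end of the range:
print(collectors_needed(100, days=1, tb_per_day_per_collector=30))   # 4
# The same estate at the conservative end of the range, given two days instead:
print(collectors_needed(100, days=2, tb_per_day_per_collector=15))   # 4
```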
>> All right, my board is empty.
>> I'll clear mine. Okay.
>> All right. So thank you very much. Thanks, Ben, for being here and addressing everybody's questions.
>> Thank you very much for having me. It was a lot of fun. And we invite our customers to try out Data Sense; there was a question about a trial, and it's free, so why not?
>> Yeah. And feel free to reach out to us. I don't think we've shared our email addresses; let me put them in the chat. So what's yours?
>> Oh, it's ben.casten; Ben, then Casten.
>> All right, I got that right. So I've put the email addresses in the chat. Feel free to send us an email, and we'll go through the questions again tomorrow in case we missed anything, and we'll get back to you.
>> Yeah, and feel free to reach out if we missed something or if you have follow-up questions, and we can continue the conversation offline.
>> All right. So thank you very much, have a great day, and see you at our next webinar.
Learn why data migrations to the cloud fail and how you can overcome migration challenges using NetApp BlueXP classification, powered by Data Sense, for more successful cloud migrations.