Information lifecycle management, or ILM, is one of the most powerful features of StorageGRID. ILM rules define how, where, and for how long we store, protect, replicate, and tier data. Like any powerful feature, it requires careful planning to implement properly. Before we begin, let's look at some general guidance. First, understand your requirements. How durable and available does your data need to be? How long do you want to store your data, and where specifically do you want to store it? Understand your resources. How many data centers do you have, and what type of storage will you use, including external cloud storage? Whenever possible, keep it simple. Don't create more rules than are necessary to achieve your objectives, and avoid creating rules that move massive amounts of existing objects. One of the most powerful aspects of ILM is that it applies not only to new objects, but can also apply to existing objects. Finally, use caution when implementing ILM rules that will expire or delete data. Before you create an ILM rule, you'll need to satisfy a few prerequisites. First, you need to configure storage pools. Storage pools define the destination where ILM rules will store your objects. This also applies to cloud storage pools, which define a cloud storage tier that an ILM rule can tier data to. Optionally, if you're using S3 Object Lock, you'll need to enable this feature on the grid. Every ILM rule has three basic components: a set of filters that determine which objects the rule applies to, a time component that decides when the rule is applied, and finally the destination where we will store our objects. In this example, our ILM rule will apply if the object size is greater than or equal to 200 KB and the object has the metadata tag sass set to glacier. The object will be held in data center 1 with a 2+1 erasure coding scheme for 60 days, then tiered to our AWS Glacier cloud storage pool forever.
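The example rule above combines two filters: an object-size threshold and a user-metadata tag. As a minimal sketch of that filter logic (the function name and the plain dict representation of metadata are illustrative, not StorageGRID's internal implementation):

```python
# Illustrative sketch of the example ILM rule's filters:
# match when size >= 200 KB AND metadata tag "sass" == "glacier".

KB = 1024

def matches_rule(size_bytes: int, metadata: dict) -> bool:
    """Return True if an object matches both filters of the example rule."""
    return size_bytes >= 200 * KB and metadata.get("sass") == "glacier"

print(matches_rule(500 * KB, {"sass": "glacier"}))  # True: large and tagged
print(matches_rule(100 * KB, {"sass": "glacier"}))  # False: below 200 KB
print(matches_rule(500 * KB, {}))                   # False: tag missing
```

Both conditions must hold for the rule to fire; an object that fails either filter falls through to the next rule in the policy, and ultimately to the default rule.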
Now, let's take a look at StorageGRID and configure some ILM rules. For this demo, I'm using a three-site StorageGRID instance. The goals that you have for your grid could include things like, I want to store one copy of my data at every data center, or I want to be able to lose any one data center and still maintain availability to my data. With that in mind, we need to organize our storage resources. To do this, we create storage pools. StorageGRID automatically creates one storage pool for every physical data center, but you can also use storage pools to group data centers together. Storage pools also apply to cloud resources. If you're going to use external cloud storage like Azure or Amazon, you'll need to create a cloud storage pool for that resource. Next, if you're going to use S3 Object Lock, you'll need to enable this feature. Now let's take a look at our ILM rules. Every ILM policy needs a default ILM rule. This is the rule used when no more specific rule matches an ingested object. A default rule has no filters. This is the default Make 2 Copies rule. It has no filters and simply makes two copies of your objects on any available storage nodes. Note that if you have a multi-site grid, you should avoid the default Make 2 Copies rule, as it may store both copies on the same site. You may choose to create a more specific default rule for your StorageGRID instance. For example, because I have a three-site StorageGRID instance, I've decided to create a three-site replication rule. Since it is a default rule, it has no filters. It simply makes one copy of my objects at each of my data centers. Now, let's look at a more specific rule for an application. For this example, we'll use FabricPool. My goals for FabricPool are to use erasure coding for efficiency and to store my objects at the same data center as my ONTAP system for fast performance. I'm triggering this rule with a filter based on bucket name prefix and object size.
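The FabricPool goal above, erasure coding for efficiency, comes down to storage overhead: full replication stores N complete copies, while an erasure coding scheme stores data plus parity fragments. A quick sketch of that arithmetic (the helper names are illustrative):

```python
# Compare storage overhead: bytes stored per byte of object data.

def storage_overhead_replication(copies: int) -> float:
    """Full replication: every copy stores the whole object."""
    return float(copies)

def storage_overhead_ec(data_fragments: int, parity_fragments: int) -> float:
    """Erasure coding: total fragments divided by data fragments."""
    return (data_fragments + parity_fragments) / data_fragments

print(storage_overhead_replication(2))  # 2.0: two full copies
print(storage_overhead_ec(2, 1))        # 1.5: the 2+1 scheme
print(storage_overhead_ec(6, 3))        # 1.5: the 6+3 scheme
```

Both the 2+1 and 6+3 schemes store only 1.5 bytes per byte of data, versus 2.0 for two-copy replication, which is why erasure coding is attractive for large objects.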
Any bucket with the prefix fabricpool or fpdc1 will be stored on data center 1 with a 2+1 erasure coding profile. Now let's create a new rule as an example. Because I have a three-site system, I want to use geo-distributed erasure coding. Click the create button. Give the rule a name. Write a description to explain what the rule does. I'm going to use 6+3 erasure coding on data center 1, data center 2, and data center 3. Now, we can decide if we want to use basic filters. Basic filters give you the option to use tenant accounts and bucket names to filter rules. In this case, because we're using erasure coding, which is best used for large objects, I'm going to click advanced filtering, add an advanced filter, and add a filter on object size. If the object size is greater than or equal to 1 MB, then apply this rule. Additionally, I only want this rule to apply to newly ingested objects, so I'm going to add an additional filter based on ingest time. In this case, we're going to say if the ingest time is on or after today, then apply the rule. Optionally, you can also create an OR filter by clicking this button. If you define an additional filter here, you create an OR clause, meaning that if either this group or this group applies, then the rule will trigger. I'm going to keep it simple and stick with the original set of filters. With my filters defined, click continue. Now we decide where and how we will protect our data. I'm going to leave the reference time as ingest time and time period 1 from day zero to forever. Now I select erasure coding. Click this button to select my storage pool. I'm going to use my US data centers pool, which includes data centers 1, 2, and 3, and then the 6+3 erasure coding scheme. We can see on our retention diagram that we're going to store all of our data on the US data centers pool with a 6+3 erasure coding scheme. When everything looks good, click continue.
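Why does 6+3 count as geo-distributed for a three-site grid? Nine fragments spread evenly give three fragments per site, and the object can be rebuilt from any six, so losing an entire site is survivable. A simplified sketch of that check (assuming an even fragment spread, which is a simplification of real placement):

```python
def survives_site_loss(data_frags: int, parity_frags: int, sites: int) -> bool:
    """With fragments spread evenly across sites, can the object still be
    rebuilt after losing any one entire site? Rebuild needs at least
    `data_frags` surviving fragments."""
    total = data_frags + parity_frags
    assert total % sites == 0, "fragments must divide evenly across sites"
    per_site = total // sites
    return total - per_site >= data_frags

print(survives_site_loss(6, 3, 3))  # True: lose 3 of 9 fragments, 6 remain
print(survives_site_loss(6, 2, 2))  # False: lose 4 of 8 fragments, only 4 remain
```

This is why the scheme is matched to the site count: the parity fragment count must cover all the fragments stored at any single site.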
It's time to determine which data protection option we'll use when our objects are ingested. The default is balanced. Think of this as inline erasure coding: we immediately apply the data protection policy you've chosen. When my objects are ingested, StorageGRID will immediately use the 6+3 erasure coding profile across my data centers. If one of my data centers is not available, we'll fall back to a two-copy rule. This is the best choice for most objects. The next option is dual commit, where we make two copies and then apply the ILM rule later; think of this as post-process. The other option is strict. Only use strict when the final destination of your objects is more important than availability. With balanced set, click create. Notice that my rule is not used in an active policy. This means that at this point the rule is not being implemented. You must first add your rule to an ILM policy and activate it before it can take effect. A few more things to notice before we move on. Take note of the compliant column. Rules marked yes for compliant provide a sufficient level of data protection to be used with S3 Object Lock. Note that this column will only appear if you've enabled S3 Object Lock for the grid. Also notice that you cannot remove or edit a rule that's being used in an active policy. The next step is to add our rules to an ILM policy and activate it. Thank you for watching.
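The three ingest options above can be summarized as a small decision table. This is a simplified sketch of the behavior described in the walkthrough, not StorageGRID's actual ingest code; the function and return strings are illustrative:

```python
# Simplified sketch of the three ingest options described above.

def ingest_action(option: str, placement_possible: bool) -> str:
    """What happens at ingest, given whether the final ILM placement
    can be made right now."""
    if option == "balanced":
        # Inline: apply the ILM placement immediately, but fall back
        # to interim two-copy protection if placement isn't possible.
        return "ilm placement" if placement_possible else "two interim copies"
    if option == "dual commit":
        # Post-process: always make two interim copies, apply ILM later.
        return "two interim copies"
    if option == "strict":
        # Final destination matters more than availability:
        # fail the ingest rather than store objects elsewhere.
        return "ilm placement" if placement_possible else "reject ingest"
    raise ValueError(f"unknown ingest option: {option}")

print(ingest_action("balanced", False))  # two interim copies
print(ingest_action("strict", False))    # reject ingest
```

The contrast to notice: balanced and strict behave identically when placement succeeds, and differ only in how they handle a site or pool being unavailable.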
Information Lifecycle Management (ILM) is one of StorageGRID's most powerful features. ILM rules determine where data is stored, how long it is stored, and where it is tiered.