NetApp Tech OnTap
     

Can You Benefit from Integrated Data Protection?

Comprehensive data protection requires a lot more than just backup and restore capabilities. To protect the accessibility of critical business data, you need to consider:

  • High availability
  • Disaster recovery
  • Business continuance
  • Archive and compliance

You probably rely on a variety of solutions to deliver these functions. You may choose one vendor for high availability, a second for backup and restore, another for business continuance, and yet another for archiving. The results are high management complexity and significant expense.

An integrated data protection strategy mitigates these complexities by moving key data protection functions into storage, allowing you to target more aggressive recovery point objectives (RPOs) and recovery time objectives (RTOs) while reducing cost and minimizing management overhead.

This article explains what integrated data protection is and compares its process flow with that of traditional data protection in three areas: identifying new and changed data, moving data, and storing data. It also discusses the value of integrated data protection for critical applications.

What Is Integrated Data Protection?

The traditional approach to data protection is expensive and fails to take full advantage of storage system capabilities. Data protection applications just sit on top of your server operating system and copy data to secondary storage. These applications do not use special data capture and movement functions built into underlying storage, resulting in slow operations that constrain your RPO and RTO and potentially impact other activities.

Integrated data protection takes a more efficient approach, while leaving data in a format where it can be used for other purposes. With integrated data protection, high-availability (HA), backup, and compliance functions are embedded within the storage system. All functions work together, and you can use the same data set for multiple data protection purposes and requirements. This approach provides several advantages:

  • Key data protection functions are sourced from a single provider, simplifying implementation and ongoing management while increasing interoperability.
  • Full advantage is taken of storage capabilities to provide better performance and greater functionality.
  • Your data copies can be used for other tasks for faster return on investment.

Tracing the data protection flow makes these differences clear.

Comparing Data Protection Approaches

Almost any data protection operation—backup, replication, or archive—includes a set of common activities:

  • Identifying changed or new data
  • Moving data
  • Storing data

Identifying Changed or New Data
Before data movement can begin in a traditional data protection operation such as backup, all new or changed files must be identified. This typically requires a time-consuming “file system walk”: the timestamp on each file is compared to the time of the last backup to find the files that have changed, and a file catalog is built before any data actually moves. For very large file systems this process can take tens of minutes, or even hours, to complete.
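To make the cost of the file system walk concrete, here is a minimal Python sketch of the timestamp comparison described above. The function name find_changed_files and its parameters are illustrative assumptions, not part of any backup product.

```python
import os

def find_changed_files(root, last_backup_time):
    """Walk the tree under root and return paths modified after last_backup_time.

    last_backup_time is a POSIX timestamp (seconds since the epoch).
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Every file costs one metadata lookup, whether it changed or not.
            if os.stat(path).st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

Note that the work grows with the total number of files, not with the number of files that changed, which is why the walk is slow on very large file systems.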

The integrated data protection alternative is to leverage snapshot technology within the storage system. Snapshots avoid much of this time-consuming process by immediately capturing an index of pointers to the specific data blocks (not full files) that have changed since the previous snapshot, enabling data movement to start much faster.
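The idea of a pointer index can be illustrated with a toy model in Python. It is only a sketch: each snapshot is assumed to be a dictionary mapping logical block numbers to physical block IDs, and the name changed_blocks is hypothetical. It does not describe how any particular storage system implements snapshots.

```python
def changed_blocks(prev_snapshot, curr_snapshot):
    """Return logical block numbers whose pointers differ between two snapshots.

    Each snapshot is modeled as a dict: logical block number -> physical block ID.
    """
    return sorted(
        lbn for lbn, phys in curr_snapshot.items()
        if prev_snapshot.get(lbn) != phys
    )

prev_snap = {0: "p10", 1: "p11", 2: "p12"}
curr_snap = {0: "p10", 1: "p27", 2: "p12", 3: "p28"}  # block 1 rewritten, block 3 new
print(changed_blocks(prev_snap, curr_snap))  # [1, 3]
```

In this model the list of changed blocks falls out of the pointer maps directly, with no per-file timestamp checks.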

Moving Data
The ability to move data efficiently is closely tied to the ability to identify it. Traditional technologies simply identify a file that has changed and copy the entire file, even when only a single block has been altered, consuming large amounts of network bandwidth and secondary storage. (Target deduplication has become popular as a way to reduce the amount of storage that traditional methods consume, but it does nothing to improve RPO or RTO.)

Integrated data protection transfers only the pointer map and the changed blocks captured by the snapshot process, making it faster and more efficient. From a network bandwidth perspective, this method is especially helpful for companies that immediately replicate backups off-site (for example, when using a remote-to-core strategy) and need to minimize the overhead on expensive WAN connections. (An OC-3 delivering 19.4 MBps costs around $27,000/year.)
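A quick back-of-the-envelope calculation shows the difference on the OC-3 link mentioned above. This Python sketch uses the 19.4 MBps figure from the text; the file size, block size, and number of changed blocks are hypothetical values chosen for illustration, and protocol overhead and latency are ignored.

```python
OC3_MB_PER_SEC = 19.4  # link rate quoted in the text

def transfer_seconds(num_bytes, mb_per_sec=OC3_MB_PER_SEC):
    """Ideal transfer time, ignoring protocol overhead and latency."""
    return num_bytes / (mb_per_sec * 1_000_000)

file_size = 10 * 1_000_000_000   # hypothetical 10 GB file
changed = 4096 * 25              # hypothetical: 25 changed 4 KB blocks

print(f"whole file:     {transfer_seconds(file_size):.1f} s")
print(f"changed blocks: {transfer_seconds(changed):.4f} s")
```

Under these assumptions, copying the whole file takes roughly 515 seconds, while sending only the changed blocks takes a few milliseconds.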

Storing Data
When it comes to storing data, most solutions utilize a proprietary format that makes it difficult or impossible to use the data for any purpose other than recovery. When backups were always stored on tape, this did not matter that much because a backup application was required to interact with the tape changer and drives.

Integrated data protection should store data copies in an open file system format. This provides several advantages:

  • Users can easily locate and recover their own files without administrator intervention.
  • You can leverage your backup data for dev/test, data mining, disaster recovery, compliance, and so on.

With traditional data protection, the goal of a closed format is to limit the potential for backup data to be modified in a way that would make recovery impossible. Integrated data protection meets the same goal differently: it can guarantee that backup data is read-only and that a copy must be split off before any read/write operations take place.

For example, NetApp® FlexClone® technology enables secondary data copies to be used for other purposes without making a full copy of the data set. These writeable “thin clones” only consume additional storage space as changes are made, so they are highly space efficient, allowing you to get much more out of secondary storage used for disk-based backups or replication.
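The space efficiency of a thin clone can be sketched with a simple copy-on-write model in Python. This is a conceptual illustration only, not a description of how FlexClone is implemented; the ThinClone class and its methods are hypothetical.

```python
class ThinClone:
    """Writable view over a read-only parent; stores only blocks that differ."""

    def __init__(self, parent_blocks):
        self._parent = parent_blocks   # shared with the backup, never modified
        self._delta = {}               # blocks written through the clone

    def read(self, block_no):
        if block_no in self._delta:
            return self._delta[block_no]
        return self._parent[block_no]

    def write(self, block_no, data):
        # Writes land in the clone's own delta, leaving the parent untouched.
        self._delta[block_no] = data

    def extra_blocks(self):
        return len(self._delta)

backup = {0: b"alpha", 1: b"beta", 2: b"gamma"}   # read-only backup image
clone = ThinClone(backup)
clone.write(1, b"BETA")
print(clone.read(0), clone.read(1), clone.extra_blocks())  # b'alpha' b'BETA' 1
```

The clone consumes space for one block only, and the backup image it was created from remains unchanged.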

Stored data does not have to be data at rest. With integrated data protection, backup images are not simply stored and left on disk or tape to collect dust. They can be used for other business functions and to extend the data protection chain. Backup images can be replicated for disaster recovery and then locked for compliance, all without having to manage disparate applications or run multiple, resource-intensive processes on a server, which can impact performance of business applications.

Protecting and Recovering Application Data

Applications that run continuously introduce special requirements for data protection. Because most applications cache data in memory for performance reasons, you can’t simply copy an application’s on-disk data and be certain that it will be in a consistent, up-to-date state. For this reason, many commercial applications and databases provide a hot backup mode that allows you to create consistent backups or copies without halting the application. Because hot backup mode results in a performance penalty to the running application, you need to either run the operation when the application is lightly loaded (not always possible) or get the operation done very quickly.

Because traditional data protection solutions—whether they use tape or disk—typically take significant time to complete, they usually have to be run during off-peak hours.

With integrated data protection, in contrast, you can put the application in hot backup mode, create a snapshot, and return to normal operation in a matter of minutes. This offers significant advantages:

  • You can create many snapshots throughout the day to provide far more recovery points than would be possible with other solutions.
  • Once you have a consistent snapshot, you can retain it on primary storage for immediate recovery, copy it to secondary storage, or replicate it to another site for disaster recovery.
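The sequence of entering hot backup mode, creating a snapshot, and returning to normal operation can be sketched as follows in Python. DemoApp, DemoStorage, and every method name here are hypothetical stand-ins; real applications and storage systems expose their own interfaces for these steps.

```python
from contextlib import contextmanager

class DemoApp:
    """Hypothetical application with a hot backup mode."""
    def __init__(self):
        self.hot_backup = False
    def begin_hot_backup(self):
        self.hot_backup = True
    def end_hot_backup(self):
        self.hot_backup = False

class DemoStorage:
    """Hypothetical storage system that records snapshot names."""
    def __init__(self):
        self.snapshots = []
    def create_snapshot(self, name):
        self.snapshots.append(name)

@contextmanager
def hot_backup_mode(app):
    app.begin_hot_backup()
    try:
        yield
    finally:
        # Always leave hot backup mode, even if the snapshot fails.
        app.end_hot_backup()

def consistent_snapshot(app, storage, name):
    """Take one snapshot while the application is in hot backup mode."""
    with hot_backup_mode(app):
        storage.create_snapshot(name)
    return name
```

Because the application stays in hot backup mode only for the duration of the snapshot call, a routine like this could be scheduled many times a day.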

Recovering Applications
Traditional application-aware backups leave data in a consistent state such that the application can restart at the point the backup was made. When it becomes necessary to recover an application, such as a database, from traditional backups, you first restore the most recent backup, then you replay transaction logs until the database is up-to-date. The chance of making a mistake or running into other problems that cause delay can be high with this multistep process.

When you initiate an application recovery with integrated data protection, the application can be recovered automatically to the time of the failure. The software carries out all the steps needed to recover to a designated point in time without manual intervention, saving time and eliminating the possibility of user error.
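The restore-and-replay sequence that such software automates can be sketched in a few lines of Python. The data model is an assumption made for illustration: the backup is a dictionary, and the transaction log is a list of (timestamp, key, value) entries in timestamp order. The function name recover is hypothetical.

```python
def recover(backup_state, transaction_log, recover_to):
    """Restore a backup, then replay logged changes up to recover_to (inclusive).

    backup_state: dict of key -> value as of the backup.
    transaction_log: list of (timestamp, key, value), in timestamp order.
    """
    state = dict(backup_state)          # step 1: restore the most recent backup
    for ts, key, value in transaction_log:
        if ts > recover_to:
            break                       # stop at the requested point in time
        state[key] = value              # step 2: replay each logged change
    return state

backup = {"balance": 100}
log = [(1, "balance", 120), (2, "balance", 90), (3, "balance", 0)]
print(recover(backup, log, recover_to=2))  # {'balance': 90}
```

Wrapping both steps in one routine removes the manual handoff between restoring and replaying, which is where mistakes tend to occur.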

NetApp Solutions

NetApp offers a suite of integrated data protection solutions built on top of our space-efficient Snapshot™ technology. Using NetApp SnapVault® software, you can back up your Snapshot copies to secondary storage for longer-term online retention. Only changed blocks are transferred, for efficient use of network bandwidth. Storage on standard Windows®, Linux®, and UNIX® servers can also be included using Open Systems SnapVault.

For disaster recovery, NetApp provides efficient replication with SnapMirror®, which also replicates only changed blocks. For enhanced data protection within a data center or campus, NetApp MetroCluster provides continuous data availability with synchronous mirroring for the most critical applications. A companion article in this issue provides a case study of MetroCluster deployment.

Application backup is provided by the SnapManager® suite, which integrates with popular applications including Oracle®, SAP®, Microsoft® Exchange, SQL Server®, SharePoint®, and VMware®. These tools can put the application in hot backup mode, capture a consistent Snapshot copy, and then resume normal operation in seconds, serving as a foundation for both backup and disaster recovery. Specific storage management capabilities can be delegated directly to application administrators for improved efficiency.

For archival and compliance needs, NetApp SnapLock® allows you to convert any NetApp volume on primary or secondary storage into nonrewritable, nonerasable storage to prevent files from being altered or deleted until a set retention date. Figure 1 illustrates the full scope of NetApp integrated data protection.

Figure 1) The full range of NetApp integrated data protection solutions.
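The retention behavior described for archival and compliance, where files cannot be altered or deleted until a set retention date, can be modeled with a short Python sketch. This is a conceptual toy, not the SnapLock interface. The RetentionLockedStore class is hypothetical, and it assumes that deletion is permitted on or after the retention date.

```python
from datetime import date

class RetentionLockedStore:
    """Toy write-once store: committed files cannot change or be deleted early."""

    def __init__(self):
        self._files = {}  # name -> (data, retain_until)

    def commit(self, name, data, retain_until):
        if name in self._files:
            raise PermissionError(f"{name} is already committed and cannot be rewritten")
        self._files[name] = (data, retain_until)

    def read(self, name):
        return self._files[name][0]

    def delete(self, name, today):
        _data, retain_until = self._files[name]
        if today < retain_until:
            raise PermissionError(f"{name} is retained until {retain_until}")
        del self._files[name]
```

A caller would commit a file once with its retention date; any later attempt to rewrite it, or to delete it before that date, raises an error.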

Conclusion

By allowing you to:

  • Immediately identify new or changed data requiring backup or replication
  • Move data efficiently
  • Store data efficiently in an open format
  • Perform application-consistent backup for fast recovery

integrated data protection offers significant advantages over traditional data protection solutions. Almost any data center can benefit from this fast and fully integrated approach to data protection. Busy transaction-processing operations in particular can gain significant advantages from integrated data protection through the creation of more frequent recovery points and the ability to recover faster.

Integrated data protection also fits nicely with the current trend toward cloud computing. The ability to offer a consistent set of easy-to-manage and scalable data protection services to either your internal or external customers is a key part of a fully realized cloud infrastructure.

NetApp provides a full suite of integrated data protection capabilities that take the complexity and cost out of comprehensive data protection.

Joshua Konkle

David A. Chapa
Director, Backup and Recovery Solutions
NetApp

David has over 20 years of industry experience focusing specifically on data availability, disaster recovery, and business resumption practices. He is coauthor of Implementing Backup and Recovery: The Readiness Guide for the Enterprise and is recognized as an authority on backup and recovery, disaster recovery, and business resumption.

Nathan Moffitt
Senior Manager, Backup and Recovery Solutions
NetApp

Nathan has over 13 years of IT industry experience spanning server, storage, networking, and data protection technologies. In addition to designing and implementing solutions used worldwide by Fortune 500 companies, he is the author of a variety of articles on data protection and shared file systems.

 