Backup and recovery describes the process of creating and storing copies of data that can be used to protect organizations against data loss. This is sometimes referred to as operational recovery. Recovery from a backup typically involves restoring the data to the original location, or to an alternate location where it can be used in place of the lost or damaged data.
A proper backup copy is stored on a system or medium separate from the primary data, such as tape, to protect against data loss caused by primary hardware or software failure.
Why Backup and Recovery Is Important
The purpose of the backup is to create a copy of data that can be recovered in the event of a primary data failure. Primary data failures can be the result of hardware or software failure, data corruption, or a human-caused event, such as a malicious attack (virus or malware), or accidental deletion of data. Backup copies allow data to be restored from an earlier point in time to help the business recover from an unplanned event.
Storing the copy of the data on a separate medium is critical to protect against primary data loss or corruption. This additional medium can be as simple as an external drive or USB stick, or something more substantial, such as a disk storage system, cloud storage container, or tape drive. The alternate medium can be in the same location as the primary data or at a remote location. The possibility of weather-related events may justify having copies of data at remote locations.
For best results, backup copies are made on a consistent, regular basis to minimize the amount of data lost between backups. The more time that passes between backup copies, the greater the potential for data loss when recovering from a backup. Retaining multiple copies of data provides the insurance and flexibility to restore to a point in time not affected by data corruption or malicious attacks.
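To make the relationship between backup frequency and data loss concrete, here is a minimal Python sketch. The schedule, the timestamps, and the function name `data_loss_window` are illustrative assumptions, not part of any backup product.

```python
from datetime import datetime, timedelta

def data_loss_window(backup_times, failure_time):
    """Return how much time's worth of changes is lost when restoring
    from the most recent backup taken at or before the failure."""
    usable = [t for t in backup_times if t <= failure_time]
    if not usable:
        raise ValueError("no backup exists before the failure")
    return failure_time - max(usable)

# Hypothetical schedule: one backup every 6 hours (00:00, 06:00, 12:00, 18:00).
start = datetime(2024, 1, 1, 0, 0)
backups = [start + timedelta(hours=6 * i) for i in range(4)]

# Hypothetical failure at 17:30, shortly before the 18:00 backup would run.
failure = datetime(2024, 1, 1, 17, 30)

lost = data_loss_window(backups, failure)
print(lost)  # 5:30:00
```

With a 6-hour interval the worst case approaches 6 hours of lost changes. Halving the interval halves that worst case.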
Basic Concepts of Backup and Recovery
There are two general approaches to backup and recovery: streaming backup and array-based backup.
Traditional or Streaming Backup
Streaming backup is the more traditional approach to backup and recovery. Data is read or streamed from the application server or host through a backup server to a secondary medium or storage system. The backup server (or media server) ingests the data and performs additional actions to optimize the data before writing it to the secondary medium. These additional actions commonly include:
- Indexing the data for easy search and restore
- Data reduction (compression, deduplication, etc.)
- Encrypting the data to protect it during transit
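The indexing and data reduction steps listed above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular backup server works: the fixed chunk size, the in-memory `store` and `index` dictionaries, and the function names are all hypothetical. Encryption is left out because the Python standard library does not include a cipher.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size

def backup_stream(name, data, store, index):
    """Split data into chunks, keep each unique chunk once (compressed),
    and record the ordered chunk hashes under `name` in the index."""
    hashes = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:                   # deduplication
            store[digest] = zlib.compress(chunk)  # compression
        hashes.append(digest)
    index[name] = hashes                          # indexing for later restore

def restore_stream(name, store, index):
    """Rebuild the original bytes by looking up each chunk in order."""
    return b"".join(zlib.decompress(store[h]) for h in index[name])

store, index = {}, {}
payload = b"A" * 8192 + b"B" * 4096  # two identical "A" chunks, one "B" chunk

backup_stream("file1", payload, store, index)
backup_stream("file2", payload, store, index)  # same content from a second source

print(len(store))  # 2 unique chunks stored for 6 chunks ingested
```

Because both sources share one store, the second backup adds no new chunks. This is the idea behind the global deduplication benefit of a consolidated backup target.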
Traditional backup and recovery offers many benefits, including:
- The ability to consolidate and manage backup and recovery from multiple primary systems with a single backup interface and storage target
- The ability to reduce data footprint with global deduplication and compression
- Intelligent metadata management to improve data retrieval for recovery
- Application integration to improve the state of the data when restored
Traditional backup and recovery also has some drawbacks, including:
- Performance. Streaming backup places a burden on the application server, which must allocate CPU and memory to read and stream data from local or remote storage to process backup jobs. Even with hardware offload (storage snapshots), the application server must perform changed-block tracking to minimize the amount of data replicated for each backup job. With traditional streaming, restores must always come from secondary storage, resulting in longer recovery times.
- Complexity. Managing large backup environments can be complex. Dedicated backup teams are often hired to manage all the data sources that must be protected. In today’s hybrid cloud reality, keeping track of mobile or remote data sources can increase that complexity.
- Cost. Enterprise backup software can be very expensive, especially as data grows. In many cases, the cost of backup software can be greater than that of the underlying storage infrastructure. Deduplication backup appliances offer the benefit of excellent data reduction, but it comes at a premium, and the appliance may represent the highest storage capacity in the data center.
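The block-level change tracking mentioned under Performance can be illustrated with a short Python sketch. Real implementations usually record writes as they happen, in the hypervisor or a filter driver. As a simplified stand-in, the code below compares per-block hashes against the previous backup. The block size and function names are assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size, for illustration only

def block_hashes(data):
    """Hash each fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
            for i in range(0, len(data), BLOCK_SIZE)]

def changed_blocks(previous_hashes, data):
    """Return {block_number: block_bytes} for blocks that are new or
    differ from the previous backup."""
    changed = {}
    for n, h in enumerate(block_hashes(data)):
        if n >= len(previous_hashes) or previous_hashes[n] != h:
            changed[n] = data[n * BLOCK_SIZE:(n + 1) * BLOCK_SIZE]
    return changed

full = b"aaaabbbbccccdddd"
baseline = block_hashes(full)     # recorded at the time of the last backup

modified = b"aaaaXXXXccccdddd"    # only the second block was overwritten
delta = changed_blocks(baseline, modified)
print(delta)  # {1: b'XXXX'}
```

Only one of the four blocks has to be sent for the next backup job. Computing the hashes still costs CPU on the host, which is the burden the Performance point describes.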
Array-Based Backup and Recovery
Array-based backup and recovery built on storage snapshots offers an alternative approach to protecting data. There are several benefits to this approach:
- Performance. High-performance snapshots on primary storage create local recovery points with low impact to the production workload, enabling higher service levels with shorter backup windows and more frequent recovery points. Local snapshots offer rapid recovery when compared to streaming backup because the snapshot is already on the primary storage and doesn’t need to be retrieved from secondary media. Array-based data replication eliminates the requirement to stream data through the application server or virtual host.
- Efficiency. Data efficiencies are retained from primary storage to secondary storage.
- Data reuse. Data transferred to secondary storage is not in a proprietary format, making it easier and faster to use for other purposes like instant DR failover and test/dev workflows.
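The reason local snapshots are fast to create and fast to restore can be shown with a toy model in Python. This is a conceptual sketch only and does not describe how any specific storage array implements snapshots. The `Volume` class and the snapshot name are hypothetical.

```python
class Volume:
    """Toy volume: a block map that points at block contents."""

    def __init__(self):
        self.blocks = {}     # block number -> bytes
        self.snapshots = {}  # snapshot name -> frozen copy of the block map

    def write(self, n, data):
        # A new write replaces the pointer in the active map. The old
        # contents stay reachable from any snapshot that references them.
        self.blocks[n] = data

    def snapshot(self, name):
        # Copies the pointers, not the block data, so it is cheap.
        self.snapshots[name] = dict(self.blocks)

    def restore(self, name):
        # Recovery swaps the active map back. No data is streamed
        # from secondary media.
        self.blocks = dict(self.snapshots[name])

vol = Volume()
vol.write(0, b"good")
vol.snapshot("hourly.0")
vol.write(0, b"corrupt")  # damage occurs after the snapshot
vol.restore("hourly.0")
print(vol.blocks[0])  # b'good'
```

The snapshot lives on the same system as the primary data, so it does not protect against loss of that system. That is why the snapshot data is also replicated to secondary storage.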
Operational Recovery vs. Disaster Recovery
Many organizations use a backup copy of data to recover from a disaster. However, the terms backup and recovery and disaster recovery generally have different meanings.
Operational recovery. A backup copy is intended to help bring systems online by replacing the primary data with an earlier copy in the event of data loss or corruption. Backup data includes multiple points in time that can go back several years and can be kept as an archive. The older the data, the less likely it is to be used for a restore. However, many industries are subject to regulations that require retaining data for long periods of time. Backup or archive copies are well suited to address these requirements.
Disaster recovery. A disaster recovery solution is intended to help an organization rapidly recover from a hardware or site outage by failing over to another copy of data in a near-consistent state. In the event of hardware, software, or site failure, applications are brought online at a secondary site within a short period of time to maintain business operations. Once the primary location is back online, applications and data are typically restored to the primary location. Unlike backups, a disaster recovery solution is typically built on mirrored data, resulting in minimal data loss (minutes rather than hours or days).
Backup and recovery and disaster recovery aren’t mutually exclusive. In fact, best practices include both approaches. Some technology vendors may offer a unified approach, supporting both backup and disaster recovery. A single replication environment may include both mirroring of primary data and taking snapshots to preserve earlier points in time. This approach is cost effective, and it’s also practical. For instance, a virus that infects primary storage is likely to be replicated to a secondary mirror. Having point-in-time copies on the secondary storage allows failover to or recovery from a time when the data was in a clean state.
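Choosing a clean point in time on the secondary storage can be sketched as follows in Python. The snapshot schedule, the infection time, and the function name `latest_clean_point` are illustrative assumptions. In practice, determining when the data was last clean is the hard part and is taken as given here.

```python
from datetime import datetime

def latest_clean_point(snapshot_times, infected_at):
    """Return the most recent snapshot taken strictly before the
    infection, or None if every snapshot is suspect."""
    clean = [t for t in snapshot_times if t < infected_at]
    return max(clean) if clean else None

# Hypothetical snapshots kept on secondary storage every 6 hours.
snaps = [datetime(2024, 3, 1, h) for h in (0, 6, 12, 18)]

# Hypothetical time at which the virus reached primary storage.
infected = datetime(2024, 3, 1, 13, 15)

point = latest_clean_point(snaps, infected)
print(point)  # 2024-03-01 12:00:00
```

The live mirror and the 18:00 snapshot both contain the infected data, so they are skipped in favor of the 12:00 copy.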
What Does NetApp Offer in Terms of Backup and Recovery?
NetApp offers a portfolio of data protection solutions to help our customers protect their most critical data, on premises or in the cloud. Our flagship product, NetApp® ONTAP®, includes integrated data protection features to support high-speed, efficient backup and recovery, disaster recovery, and business continuity across clouds. In addition to our own software solutions, NetApp partners with leading backup software vendors to provide flexibility and versatility for diverse IT environments.