June 17, 2007
Lakshmi N. Bairavasundaram, Garth R. Goodson, Shankar Pasupathy, and Jiri Schindler.
This paper presents one of the largest disk-drive studies ever—examining 1.53 million disk drives in the field—for occurrences and characteristics of latent sector errors; it finds that such errors are non-independent events, depend on the type of disk drive, and have spatial and temporal locality.
The reliability measures in today’s disk drive-based storage systems focus predominantly on protecting against complete disk failures. Previous disk reliability studies have analyzed empirical data in an attempt to better understand and predict disk failure rates. Yet very little is known about the incidence of latent sector errors, i.e., errors that go undetected until the corresponding disk sectors are accessed.
Our study analyzes data collected from production storage systems over 32 months across 1.53 million disks (both nearline and enterprise class). We analyze factors that impact latent sector errors, observe trends, and explore their implications for the design of reliability mechanisms in storage systems. To the best of our knowledge, this is the first study of such large scale (our sample size is at least an order of magnitude larger than those of previously published studies) and the first one to focus specifically on latent sector errors and their implications for the design and reliability of storage systems.
Kenneth C. Sevcik Outstanding Student Paper Award
In Proceedings of the 2007 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS ’07)
The author’s version of the paper is attached to this posting. Please observe the following copyright: © ACM, 2007. This is the author’s version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in the Proceedings of the 2007 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS ’07), https://dl.acm.org/citation.cfm?doid=1254882.1254917.
The definitive version of the paper can be found at: https://dl.acm.org/citation.cfm?doid=1254882.1254917