October 26, 2011
Yanpei Chen, Kiran Srinivasan, Garth Goodson, and Randy Katz.
In this paper, we provide insights for the design of future storage systems, using a new methodology that applies an objective, multi-dimensional statistical technique to extract data access patterns from network storage system traces.
Enterprise storage systems are facing enormous challenges due to increasing growth and heterogeneity of the data stored. Designing future storage systems requires comprehensive insights that existing trace analysis methods are ill-equipped to supply. In this paper, we seek to provide such insights by using a new methodology that leverages an objective, multi-dimensional statistical technique to extract data access patterns from network storage system traces. We apply our method on two large-scale real-world production network storage system traces to obtain comprehensive access patterns and design insights at user, application, file, and directory levels. We derive simple, easily implementable, threshold-based design optimizations that enable efficient data placement and capacity optimization strategies for servers, consolidation policies for clients, and improved caching performance for both.
In Proceedings of the ACM Symposium on Operating Systems Principles 2011 (SOSP’11)
The author's version of the paper is attached to this posting. Please observe the following copyright: © ACM, 2011. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in the Proceedings of the ACM Symposium on Operating Systems Principles 2011 (SOSP '11), https://doi.acm.org/10.1145/2043556.2043562
The definitive version of the paper can be found at: https://doi.acm.org/10.1145/2043556.2043562