November 14, 2014
Rukma Talwadker and Kaladhar Voruganti
Historically, traces have been used by system designers for designing and testing their systems. However, traces are becoming very large and difficult to store and manage.
Thus, the area of creating models based on traces is gaining traction. Prior art in trace modeling has primarily dealt with modeling block traces, and file/NAS traces collected from virtualized clients, which are essentially block I/Os to the storage server. No prior art exists in modeling file traces. Modeling file traces is difficult because of the presence of meta-data operations and the statefulness of NFS operation semantics. In this paper we present an algorithm and a unified framework that models and replays NFS as well as SAN workloads. Typically, trace modeling is a resource-intensive process where multiple passes are made over the entire trace.
In this paper, in addition to being able to model the intricacies of the NFS protocol, we provide an algorithm that is efficient with respect to its resource consumption needs by using a Bloom filter-based sampling technique. We have verified our trace modeling algorithm on real customer traces and show that our modeling error is quite low.
In Proceedings of the USENIX Large Installation System Administration Conference (LISA’14).
A link to the paper can be found at: https://www.usenix.org/conference/lisa14/conference-program/presentation/talwadker