June 15, 2012
Atish Kathpal, Mandar Kulkarni, and Ajay Bakre.
In this paper, we develop cost metrics that allow us to compare storage vs. compute costs and suggest when a transcoding on-the-fly solution can be cost-effective.
Video content is unusual in its storage footprint. In a video distribution environment, a master video file needs to be transcoded into different resolutions, bitrates, codecs, and containers to enable distribution to a wide variety of devices and media players over different kinds of networks. Our experiments show that when 8 master videos are transcoded into the 376 most popular formats (derived from 8 resolutions and 6 containers), the transcoded versions occupy 8 times more storage than the master video. One major challenge in storing such content efficiently is that traditional de-duplication algorithms cannot detect significant duplication between any two versions. Transcoding on-the-fly is a technique in which a distribution copy is created only when requested by a user. This technique saves storage, but at the expense of the extra compute cost and latency that result from transcoding after a user request is received. In this paper, we develop cost metrics that allow us to compare storage vs. compute costs and suggest when a transcoding on-the-fly solution can be cost-effective. We also analyze how such a solution can be deployed in a practical storage system using access pattern information, or a variant of the ski-rental online algorithm when such information is not available.

In Proceedings of the USENIX Workshop on Hot Topics in Storage and File Systems 2012 (HotStorage ’12)
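To illustrate the kind of online decision the abstract refers to, here is a minimal sketch of the classic ski-rental break-even rule applied to a single (video, format) pair. It is not the variant developed in the paper, which this posting does not describe. The class name, field names, action labels, and cost values are all hypothetical, and costs are in arbitrary units.

```python
from dataclasses import dataclass


@dataclass
class SkiRentalPolicy:
    """Classic break-even rule for one (video, format) pair.

    Assumption: all names and costs here are illustrative, not from the paper.
    """
    transcode_cost: float  # compute cost of one on-the-fly transcode ("rent")
    storage_cost: float    # cost of keeping the transcoded copy ("buy")
    spent: float = 0.0     # transcode cost paid so far without storing
    stored: bool = False   # whether the transcoded copy is being kept

    def on_request(self) -> str:
        """Return the action taken for one user request."""
        if self.stored:
            return "serve_stored"
        if self.spent + self.transcode_cost >= self.storage_cost:
            # One more rental would reach the purchase price, so keep the copy.
            self.stored = True
            return "transcode_and_store"
        self.spent += self.transcode_cost
        return "transcode"


policy = SkiRentalPolicy(transcode_cost=1.0, storage_cost=3.0)
actions = [policy.on_request() for _ in range(4)]
print(actions)
```

With these assumed costs, the first two requests are transcoded and discarded, the third is transcoded and kept, and later requests are served from storage. The appeal of a rule of this kind is that it needs no knowledge of future access patterns.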
A copy of the paper is attached to this posting. Presentation slides and audio are available at https://www.usenix.org/conference/hotstorage12/workshop-program/presentation/kathpal