Sanjay Rao, Purdue University - September 2013


December 18, 2013


Sanjay Rao

Workload-aware data placement for multi-cloud architectures

Enterprises are gravitating towards hybrid multi-cloud architectures to reconcile conflicting requirements such as data privacy, high availability, low cost and latency, and elasticity. Supporting multi-cloud architectures poses many challenges for the storage system: where to replicate data given cost, availability, and latency constraints; how to ensure consistency across replicas; and how to adapt placement decisions to changes in application workloads. This research addresses these challenges. Expected contributions include (i) optimization frameworks for configuring replica placement that take cost, latency, and availability into account; (ii) frameworks that consider different application consistency models and choose placements accordingly; (iii) algorithms for adapting placement to workload seasonality and shifts; and (iv) a scalable service for locating data items. The focus will be on heterogeneous data placement policies, where placements may be chosen differently across data objects within an application (e.g., different SQL shards, or different key-spaces in key-value stores like Cassandra).
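
To make the idea of per-object (heterogeneous) replica placement concrete, the following is a minimal illustrative sketch, not the method proposed in this research. It assumes a toy model: every data center name, cost, failure probability, and latency figure is hypothetical; failures are assumed independent; reads go to the nearest replica; and the search is a brute-force enumeration rather than a scalable optimization framework. For each data object, it picks the cheapest replica set that meets a latency bound and an availability target, optionally restricted to an allowed set of sites (for example, for privacy reasons).

```python
from itertools import combinations

# Hypothetical data centers: storage cost ($ per GB-month) and an assumed
# independent failure probability. None of these figures come from the abstract.
DATACENTERS = {
    "private-dc": {"cost": 0.10, "fail": 0.02},
    "cloud-east": {"cost": 0.03, "fail": 0.01},
    "cloud-west": {"cost": 0.04, "fail": 0.01},
}

# Hypothetical latencies (ms) from each client region to each data center.
LATENCY_MS = {
    "east": {"private-dc": 20, "cloud-east": 10, "cloud-west": 80},
    "west": {"private-dc": 70, "cloud-east": 80, "cloud-west": 12},
}


def availability(replicas):
    """Probability that at least one replica is up (independent failures assumed)."""
    p_all_fail = 1.0
    for dc in replicas:
        p_all_fail *= DATACENTERS[dc]["fail"]
    return 1.0 - p_all_fail


def mean_read_latency(replicas, workload):
    """Workload-weighted latency, assuming each region reads its nearest replica."""
    total = sum(workload.values())
    weighted = sum(
        share * min(LATENCY_MS[region][dc] for dc in replicas)
        for region, share in workload.items()
    )
    return weighted / total


def storage_cost(replicas, size_gb):
    """Cost of storing one full copy of the object at every chosen site."""
    return size_gb * sum(DATACENTERS[dc]["cost"] for dc in replicas)


def choose_placement(size_gb, workload, max_latency_ms, min_availability, allowed=None):
    """Return the cheapest feasible replica set as a tuple, or None if none exists."""
    candidates = sorted(allowed or DATACENTERS)
    best = None
    for k in range(1, len(candidates) + 1):
        for replicas in combinations(candidates, k):
            if availability(replicas) < min_availability:
                continue
            if mean_read_latency(replicas, workload) > max_latency_ms:
                continue
            cost = storage_cost(replicas, size_gb)
            if best is None or cost < best[0]:
                best = (cost, replicas)
    return None if best is None else best[1]


if __name__ == "__main__":
    east_heavy = {"east": 0.9, "west": 0.1}
    # A public shard: any site is allowed, strict availability target.
    print(choose_placement(100, east_heavy, 25, 0.999))
    # A privacy-sensitive shard: restricted to the private data center.
    print(choose_placement(100, east_heavy, 30, 0.95, allowed={"private-dc"}))
```

Under these assumed numbers, the two objects in the usage example receive different placements within the same application, which is the sense of "heterogeneous" here. Adapting to workload shifts would, in this toy model, amount to re-running the selection with an updated `workload` mapping; consistency models and migration costs are deliberately left out.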