Stratos Idreos, Harvard - May 2016


May 20, 2016


Stratos Idreos

Self-designing Data Systems

The architecture of a data system is defined by the chosen data layouts, data access algorithms, data models and data flow methods. Different applications and hardware require a different architecture design for optimal performance (speed, energy). Yet, so far all data systems are static, operating within a single and narrowly defined design space (NoSQL, NewSQL, SQL) and hardware profile. Historically, a new data system architecture requires at least a decade to reach a stable design. However, hardware and applications evolve rapidly and continuously, leaving data-driven applications locked with sub-optimal systems or with systems that simply do not have the desired functionality or the right data model. Our goal in this project is to make it extremely easy to design and test a new data system in a matter of a few hours or days as opposed to several years. Given a data set, a query workload and a hardware profile, a self-designing data system evolves such that its architecture matches the properties of the environment. The whole system design is generated automatically and adaptively by being able to create numerous individual system components that can be combined to synthesize alternative full system designs. A self-designing system continuously performs automatic synthesis of components to evaluate new designs at the lowest levels of database architectures such the data layout, access methods and execution strategies. This research creates opportunities to bootstrap new applications, to automatically create systems tailored for specific scenarios, to minimize system footprint as well as to automatically adapt to new hardware.