Date
December 18, 2013
Author
Luis Ceze and Mark Oskin
Irregular applications are increasing in importance to the computing industry. For a long time they underpinned computations of interest to the national security arm of the federal government, but they now also support activities such as advertisement placement in social networks and the analysis of complex datasets in medicine and science. The defining characteristics of these applications are poor locality and massive available parallelism. Professor Ceze's effort is focused on making these applications perform well and scale on commodity hardware. The key idea is to use massive concurrency to mask memory latency instead of relying on locality, which these applications do not have. This idea underpinned the Cray XMT, a fully custom hardware system. This project will explore the same idea in software only, running on commodity processors and commodity networking hardware. Parallelism is also exploited to overcome the bandwidth limitations of transmitting small messages: with enough threads in flight, there are enough buffered requests to coalesce and dispatch as a unit. Preliminary work has produced a workable proof of concept of this approach. By emulating the XMT's key features in software, a commodity cluster was able to beat the XMT at its own game on some applications. What is needed next is additional effort to improve the underlying performance of this runtime and to make it accessible to a broad base of users. Efforts will now focus on the key performance component, the networking layer, and on the linchpin of programmer efficiency, high-level language support. In addition, a useful platform for experimenting with datasets in the petabyte range is needed, likely a cluster of machines with large numbers of solid-state disks.
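The message coalescing described above can be illustrated with a small sketch. This is not the project's runtime: it is a single-process Python illustration, and the names `Aggregator`, `send_fn`, and `max_batch` are hypothetical. It assumes that many concurrent tasks each issue small requests to remote nodes, and shows how buffering those requests per destination turns many small messages into a few larger ones.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Aggregator:
    """Buffers small per-destination requests and sends them in batches.

    Hypothetical illustration only; `send_fn` stands in for whatever
    network call would actually transmit a batch to node `dest`.
    """

    def __init__(self, send_fn: Callable[[int, List[bytes]], None],
                 max_batch: int = 64):
        self.send_fn = send_fn
        self.max_batch = max_batch
        self.buffers: Dict[int, List[bytes]] = defaultdict(list)

    def enqueue(self, dest: int, request: bytes) -> None:
        """Add one small request; dispatch the batch once it is full."""
        buf = self.buffers[dest]
        buf.append(request)
        if len(buf) >= self.max_batch:
            self.flush(dest)

    def flush(self, dest: int) -> None:
        """Send everything buffered for `dest` as a single unit."""
        batch = self.buffers.pop(dest, [])
        if batch:
            self.send_fn(dest, batch)

    def flush_all(self) -> None:
        """Send all partially filled buffers (e.g. on a timeout)."""
        for dest in list(self.buffers):
            self.flush(dest)


def demo() -> List[tuple]:
    """Issue nine small requests and record (destination, batch size)."""
    sent = []
    agg = Aggregator(lambda dest, batch: sent.append((dest, len(batch))),
                     max_batch=3)
    for _ in range(7):
        agg.enqueue(0, b"read")
    for _ in range(2):
        agg.enqueue(1, b"read")
    agg.flush_all()
    return sent
```

In this sketch, `demo()` issues nine requests but calls `send_fn` only four times. A real runtime would also need a flush policy for partially filled buffers; the explicit `flush_all` call here is a placeholder for that.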