Managing data and computation is at the heart of data center computing. Mismanagement of data and/or computation often leads to data loss, storage wasted on unneeded or redundant data, and laborious bookkeeping. A lack of proper management can also result in lost opportunities to reuse common computations or to calculate results incrementally.
Recent advances in distributed execution engines and high-level language support have simplified the development of distributed data-parallel applications. However, the separation of data and computation has limited the functionality of data centers and of data center computing.