Cloud computing has received significant attention lately as a means to process large data sets, yet people still prefer to manage data on their local desktop machines. While the cloud offers the ability to scale, the desktop offers numerous practical advantages, such as straightforward debugging of program logic and the availability of useful tools like spreadsheets, and in general it offers more convenience and autonomy than timeshared cloud environments. Hence, a standard practice for dealing with large data sets is to process them initially in the cloud and, as soon as sufficient data reduction has occurred, to migrate the data to the desktop for exploration and analysis.
Unfortunately, this practice involves a significant amount of labor: managing data and logic in both environments, staging them back and forth, dealing with bugs that arise in one environment but not the other, and dividing processing into appropriate cloud-side and desktop-side components.