It is increasingly common for households to own more than one computer and for these machines to share a home network. Each of these computers has a substantial amount of computing power, memory, and storage capacity. Current popular operating systems make it easy to use the resources of the local machine, but difficult to use the resources of other machines on the network. Hence, when a compute-intensive application is running, the CPUs of remote machines are typically idle and their memory is not accessible. Similarly, when an I/O-intensive application is running, the local disk may be 100% utilized while other storage devices sit idle.
Home computational needs are increasing. For example, video editing and transcoding are increasingly common and can take hours to complete, even on a high-end home computing system. On the other hand, most homes have multiple computers in the form of desktops, laptops, and home theater PCs (HTPCs), as well as non-traditional computing devices that contain common computing hardware, such as game consoles, mobile phones, and embedded devices (e.g., set-top boxes, routers, and other equipment).
In data center settings, distributed systems software makes it possible to spread data across multiple storage devices and to run computations in parallel across multiple machines. Data centers are often very homogeneous, meaning that each machine has a similar processor, amount of memory, network bandwidth, and other resources. Scheduling algorithms used in data centers are typically greedy and consider the quantity and availability of machines rather than differences between machines. Various companies provide data-parallel frameworks for spreading job execution across multiple computers in data centers, such as MICROSOFT™ Dryad, Google MapReduce, and Apache Hadoop (developed in large part at Yahoo!). Some of these also provide toolsets and programming languages to make parallel computing easier, such as MICROSOFT™ DryadLINQ.
Similar functionality is increasingly useful in a home or personal setting, but it is much more complex to provide there because of heterogeneity, connectivity, power management, and software version and update management. Assumptions that are valid in data centers often fail in home environments. For example, home computers may come and go as a user takes a laptop or other mobile device into and out of the house. Home computers may use a variety of network connection types, such as a Wi-Fi connection when a user is roaming around the house and a wired Ethernet connection when the user docks a laptop. Home computers may also go to sleep or run out of battery power. Finally, home computers vary widely in processor, graphics processing capability, memory capacity, disk space, disk speed, and so forth. These differences contribute to distributed computing being virtually non-existent in the home and in other small cluster settings.