In large-scale systems, a trend in software deployment is to centralize operating system (OS) software image management on a globally accessible file system with stateless compute nodes. The compute nodes are activated with the distributed application either by diskless booting protocols or by remote software installation to local storage.
One approach to software image management is to use a single image with global process dispatch to lightweight OS node computing environments. A second approach is to dedicate one image per compute node.
Neither of these two general approaches provides fault isolation between different instances of the distributed application while also being scalable and efficient. The single OS image approach, for example, provides no fault isolation between instances.
Moreover, a single OS image creates a network bottleneck that worsens as the system grows. Typically, clients send a large number of requests over the network to the master node. In some conventional systems, a client node must send a request over the network just to obtain a file name, even if no file data is needed at that time. In addition, in some conventional systems, if the server is down, the client is essentially unable to continue running the distributed application.
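The way this bottleneck worsens with system size can be illustrated with a minimal back-of-the-envelope sketch in Python. It is not a model of any particular system. It assumes, hypothetically, that every client node issues the same number of requests (including file name lookups) to a single master node, and that the master serves requests at a fixed rate. The function names and all numbers are illustrative assumptions.

```python
def master_load(num_clients: int, requests_per_client: int) -> int:
    """Total requests arriving at the single master node.

    Assumes every client issues the same number of requests (hypothetical).
    """
    return num_clients * requests_per_client


def drain_time_seconds(num_clients: int,
                       requests_per_client: int,
                       master_rate_per_sec: float) -> float:
    """Time for the master to serve all requests at an assumed fixed rate."""
    return master_load(num_clients, requests_per_client) / master_rate_per_sec


if __name__ == "__main__":
    # Illustrative values only: 500 requests per client, 10,000 requests/s at the master.
    for n in (10, 100, 1000):
        t = drain_time_seconds(n, requests_per_client=500, master_rate_per_sec=10_000)
        print(f"{n} clients: {t:.1f} s to serve all requests")
```

Under these assumptions the load on the master grows linearly with the number of clients, so the time needed to serve all clients grows with system size even though each client's own demand is unchanged.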