Some networks are arranged to quickly and reliably transfer large amounts of file data to a large number of machines simultaneously. Sometimes the machines involved in the transfer are not all known in advance, the exact time at which they will request the transfer is not known, and the file data requested may differ from machine to machine, although it is known that a fair number of machines will request the same (large) set of file data. Such networks may include software test labs, where a large number of test machines need to receive disk images from a main file server.
One method for transferring files in such a network is to have all the machines request the file data from a central file server. In such a centralized distribution structure, performance may drop markedly after a number of machines initiate requests and decline rapidly as more machines make requests. At some point, the requests may actually begin to time out and fail. The file server eventually becomes constrained by disk, CPU, or network bandwidth, and pushing the server past these limits results in poor performance and error conditions. Solutions that involve purchasing more file servers have the drawbacks of increased hardware costs, the need for data replication between the servers and for load balancing, and additional hardware to maintain.
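The way a single server's capacity limits the number of simultaneous transfers can be illustrated with a rough back-of-the-envelope model. The sketch below is an assumption made for illustration only: it treats network bandwidth as the sole bottleneck and supposes the server divides it evenly among clients, which a real server constrained by disk or CPU would not necessarily do. The function names and the figures in the usage note are hypothetical.

```python
def transfer_time_seconds(image_bytes, server_bytes_per_sec, clients):
    """Time for each client to finish when server bandwidth is split evenly.

    Assumption: every client requests an image of the same size at the
    same moment, and bandwidth is the only constraint.
    """
    if clients < 1:
        raise ValueError("clients must be at least 1")
    per_client_rate = server_bytes_per_sec / clients
    return image_bytes / per_client_rate


def max_clients_before_timeout(image_bytes, server_bytes_per_sec, timeout_seconds):
    """Largest client count whose transfers still finish within the timeout."""
    # time = image_bytes * clients / rate <= timeout
    # so clients <= timeout * rate / image_bytes
    return int(timeout_seconds * server_bytes_per_sec // image_bytes)
```

Under these assumptions, a 4 GB disk image served over a link delivering 125,000,000 bytes per second with a 600-second request timeout supports at most 18 simultaneous clients; a 19th client pushes every transfer past the timeout, which is the kind of abrupt failure described above.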
Another possible solution is to use broadcast or multicast network operations to distribute the file data. Broadcast may not be appropriate, because a number of distinct sets of file data must be distributed to a number of different sets of machines, which is outside the application of broadcast networking. Multicast networking may be a better fit to the problem, but it is limited by the lack of multicast file transfer software and by the fact that the set of machines to transfer to is not known at the beginning of the transfer.