Large-scale data processing systems frequently rely on banks of similar or identical machines to provide infrastructure services, rather than requiring each computer in the system to operate autonomously by including all of its own infrastructure. For example, data storage functionality is often consolidated into a server or data server array; this can facilitate efficient backup strategies and permit an arbitrary number of data processing systems to access and operate on a common data set. (An alternative to this arrangement might involve providing some of the overall data storage space required at each processing system, but this approach may complicate application design by requiring software to distinguish between data stored locally and data stored on other peer systems. Successful backup strategies may be more difficult to implement, and the failure of an individual system may impact the work of other systems that need access to the data stored on the failed system.)
Of course, shifting an infrastructure function such as data storage onto a bank of special-purpose machines gives rise to a different set of challenges, even as it alleviates some difficulties of managing a large-scale system. Important among these challenges is the task of configuring the bank of machines so that each operates as intended to provide the infrastructure function. Even special-purpose machines that only provide a limited range of services may have complex configuration requirements, particularly when the machines are installed at diverse geographical locations to provide redundancy and/or to take advantage of services or facilities with locality-dependent aspects.
Groups of servers that are to work together to provide a basic data processing service with redundancy and high availability may require consistent and coordinated, but not identical, configurations. As a simple example, consider two data storage servers (e.g., “fileservers”) that are to store data for client systems. These servers may need identical configurations to control access from remote clients, but they may need different network communication configurations because they are connected to a distributed data network through two different circuits.
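The two-fileserver example can be made concrete with a short sketch. It is illustrative only and does not represent any particular management system; the server names, setting keys, addresses, and the build_config helper are all hypothetical. It assumes each server's configuration can be expressed as a shared part (access control) combined with a server-specific part (network settings).

```python
from copy import deepcopy

# Hypothetical shared settings: access control is identical on every fileserver.
SHARED = {
    "access": {"allowed_clients": ["10.0.0.0/8"], "read_only": False},
}

# Hypothetical per-server settings: each server reaches the network
# through a different circuit, so its network settings differ.
PER_SERVER = {
    "fileserver-a": {"network": {"address": "192.0.2.10", "gateway": "192.0.2.1"}},
    "fileserver-b": {"network": {"address": "198.51.100.10", "gateway": "198.51.100.1"}},
}


def build_config(server_name):
    """Combine the shared settings with one server's own settings."""
    config = deepcopy(SHARED)  # copy so later edits never alter SHARED
    config.update(deepcopy(PER_SERVER[server_name]))
    return config


# One complete configuration per server.
configs = {name: build_config(name) for name in PER_SERVER}
```

Under these assumptions, both servers end up with the same access settings and different network settings, which is the "coordinated, but not identical" relationship described above.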
Current system management procedures generally take an ad hoc approach to managing similar systems. For example, an administrator may keep copies of various generic configurations in a library, and prepare a configuration for a new system based on the closest generic configuration. However, once a generic configuration is customized and deployed, there is often no way to update a common parameter in the configurations of all the servers in the group without connecting to each machine in turn and making the modification, which is a time-consuming and error-prone task.
Therefore, a better method of configuring and managing many similar (but not necessarily identical) data processing systems efficiently may be of value in this field.