1. Field of the Invention
The present invention relates to an improved data processing system and method for coordinating the operation of multiple computing devices, and in particular, to a method and system for initializing computers and other computing devices through a computer communications network.
2. Description of Related Art
A modern distributed computing environment in a moderate to large enterprise consists of many computing devices that communicate with each other through a computer communications network. There are two groups of devices, called clients and servers, which perform the computational tasks associated with the defined purposes, or missions, of the enterprise. Clients request centralized data processing services from servers when performing computational tasks. Servers supply those requested services. Generally, there are more clients than servers because servers are typically larger machines that can each service the requests of many clients.
Clients are usually operated by people who are end-users of the enterprise computing environment. Each end-user has a role in the enterprise which requires access to a subset of the computational tasks associated with the missions of the enterprise. End-users with different roles require access to different subsets of computational tasks. It is important that each end-user have rapid and easy access to the appropriate subset of computational tasks associated with that end-user's role. It is also important that each end-user not have access to computational tasks that are not associated with that end-user's role. By limiting end-user access in this way, end-users are prevented from causing inadvertent or deliberate damage to the enterprise computing environment.
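As an illustration only, the role-based restriction described above can be sketched as a lookup from each role to its permitted subset of computational tasks. The role names, task names, and the `may_perform` helper are hypothetical and are not part of the described system.

```python
from typing import Dict, Set

# Hypothetical mapping of end-user roles to their permitted task subsets.
ROLE_TASKS: Dict[str, Set[str]] = {
    "teller": {"view_account", "post_deposit"},
    "auditor": {"view_account", "export_ledger"},
}


def may_perform(role: str, task: str) -> bool:
    """Return True only if the task belongs to the subset assigned to the role.

    An unknown role has an empty subset, so access is denied by default.
    """
    return task in ROLE_TASKS.get(role, set())
```

Under these assumptions, a teller may post a deposit but may not export the ledger, and a role that has not been defined is granted nothing.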
Servers are operated by people who are administrators of the enterprise computing environment. Administrators have roles that assure that the enterprise computing environment is available to end-users with a minimum specified quality of service. The computational tasks associated with administrator roles are therefore associated with the availability of the enterprise computing environment and not necessarily directly associated with the missions of the enterprise. In addition to operating servers, administrators are responsible for the installation, configuration and maintenance of the entire enterprise computing environment, including servers, networks, and clients. An important responsibility of administrators is to define the software configuration of each client so that it matches the access requirements of the end-user who is operating the client.
The integration of clients and servers into distributed computing environments has provided benefits to enterprises by making data more available when and where it is needed. The productivity of end-users has been increased by significantly reducing manual handling and processing of data that is required to make the data useful to the enterprise. Moreover, client-server environments have made it possible to use this data as a tool to improve strategic decision making, and they have permitted enterprises to take advantage of the decreasing unit cost of computing by distributing data processing to newer devices.
The increasing complexity of distributed computing environments has also increased the costs of administering these environments. These increasing administrative costs offset the benefits described above. In fact, as the unit cost of computing devices has decreased, these administrative costs are responsible for an increasing proportion of the total cost of ownership of data processing resources. This has made these administrative costs a target for increasing cost-containment efforts by enterprises. End-user client devices contribute a significant and increasing share of these administrative costs because they are the most numerous, most functionally diverse, most physically scattered, and most vulnerable of the computing resources.
The concept of the server-managed client has been introduced as a means of controlling the administrative costs of these clients. The implementation of this concept permits administrators to define the client software environment using resources available on centrally located servers rather than having to physically visit and configure each client separately. These server resources include files that are stored on servers and that are copied through the network by clients. These transferred files include program files that contain the client software instructions that execute on the client and data files which define the enterprise computing environment for that client software. These server resources also include administrative software running on servers to automate the creation and management of client software environment definitions.
The implementation of server-managed clients is made possible with a remote boot process that is provided to the client. A boot process on a client is defined as a sequence of program instructions that begins automatically when the client is started or reset and completes when an end-user software environment is operational on the client. The initial instructions that are executed in a boot process are fixed in the nonvolatile memory of the hardware of the client. As the boot process progresses, program instructions are located on a source outside of the client's nonvolatile memory and copied into the client's volatile memory (also referred to as dynamic or random access memory). Client execution is then transferred from nonvolatile memory to these instructions in volatile memory. Those instructions in volatile memory continue the boot process by locating and copying additional program instructions and data into the client until the end-user software environment is operational.
In a remote boot process (also called a network boot process), some or all of the program instructions and data are copied to the client's volatile memory by requesting and receiving files from a specified server, called a boot server, over a network through the client's network interface device. This is distinguished from a local boot process, in which the source of the program instructions and data is a nonvolatile medium residing in a device that is attached to the client, such as a diskette, hard disk, or CD-ROM. A remote boot process allows end-user software environments to be located in a repository on a centrally located boot server instead of having to be transported on separate physical media to the location of every client.
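The staged nature of the boot process described above, and the fact that a remote boot differs from a local boot only in where the files come from, can be sketched as a small simulation. This is a simplified model under stated assumptions: the stage names, the `chain` table that names each stage's successor, and the `make_source` and `boot` helpers are all hypothetical. In a real client each stage would itself determine what to load next, and files would arrive over a network protocol.

```python
from typing import Callable, Dict, List, Optional

# A source maps a file name to its contents. A boot server and a locally
# attached disk are modeled the same way; only the location of the data differs.
Source = Callable[[str], bytes]


def make_source(files: Dict[str, bytes]) -> Source:
    """Build a source backed by an in-memory dictionary (stand-in for a server or disk)."""
    def fetch(name: str) -> bytes:
        return files[name]
    return fetch


def boot(source: Source, first_stage: str,
         chain: Dict[str, Optional[str]]) -> List[str]:
    """Load stages in order until the end-user environment is operational.

    `chain` is an assumed lookup giving the stage that follows each stage,
    or None once no further stage is needed.
    """
    volatile_memory: Dict[str, bytes] = {}
    executed: List[str] = []
    stage: Optional[str] = first_stage
    while stage is not None:
        # Copy the stage from the source into simulated volatile memory.
        volatile_memory[stage] = source(stage)
        # Execution is transferred to the stage that was just loaded.
        executed.append(stage)
        stage = chain[stage]
    return executed


# Hypothetical three-stage boot served from a boot server.
boot_server = make_source({"loader": b"L", "kernel": b"K", "desktop": b"D"})
chain = {"loader": "kernel", "kernel": "desktop", "desktop": None}
print(boot(boot_server, "loader", chain))  # ['loader', 'kernel', 'desktop']
```

Passing a source built from locally held files instead of `boot_server` models a local boot with no other change to the loop.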
The server-managed client concept has administrative benefits that go well beyond those associated with the initial deployment of a client. Updates, fixes, or changes to client operating systems and application programs can be applied to the client files where they are stored on the servers. Those changes can then be deployed to all clients automatically using the remote boot process with no administrator or end-user intervention required except to initiate the remote boot by restarting or resetting each client. By assuring the consistency of the client machine software environments in this manner, the incidence and impact of software-related problems are reduced, thereby reducing the cost and complexity of diagnosing and rectifying client-side problems.
Multiple client operating systems can be supported to meet application needs, end-user preferences, or hardware compatibility issues. Access to a client machine's local hard drive can be restricted to force all end-user generated data to be stored on a server, ensuring that such critical enterprise data is always available. End-user authentication and authorization processes can be centralized and simplified.
Separate classes of client desktop interfaces can be deployed for each class of end-user, or an administrator can have the ability to define customized desktop environments, including a set of specific authorized applications for each end-user in a domain. More dynamically, end-users can have “roaming” desktops. When an end-user logs on to a client machine, the end-user's desktop and applications are supplied from the server, giving the end-user the ability to log on to any client machine in the domain and see the same desktop and applications. This capability is particularly useful in environments in which end-users do not always work at an assigned workstation but move between workstations based on availability, such as call centers, banks, or airline departure gates.
The server-managed client architecture also increases the reliance of clients upon the boot server for their ability to operate. By extension, the mission-critical computational tasks of the enterprise are also more reliant upon having the boot server maintain a minimum quality of service. For instance, during failure recovery after a power failure or some other type of widespread system outage, a large number of clients will need to be remote booted almost simultaneously. In some environments, the distributed computing environment needs to assure that the clients can complete the remote boot process within a specified time constraint, thereby imposing both availability and performance constraints on the remote boot infrastructure. A fault-tolerant, performance-sensitive solution would ensure that the clients can complete the remote boot process with a minimum required quality of service over a wide range of operating conditions within the remote boot infrastructure.
Therefore, it would be advantageous to provide a method and system for a fault-tolerant remote boot solution that can dynamically respond to changes in the quality of service provided by each of multiple redundant boot servers. It would be particularly advantageous for the method and system to throttle off responses that could direct clients to boot servers that have failed or that currently have low or unacceptable qualities of service. It would also be advantageous to avoid severe client boot fault conditions that could arise from interrupting the remote boot of any client that had already been directed to a boot server.
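The desired behavior stated above can be illustrated with a minimal sketch. It is not the method of the invention; it only models the three stated goals under assumed names. The `BootServer` record, the numeric `qos` score between 0.0 and 1.0, the `min_qos` threshold, and the `Director` class are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BootServer:
    name: str
    alive: bool
    qos: float  # assumed quality-of-service score, 0.0 (worst) to 1.0 (best)


class Director:
    """Directs each client to one of several redundant boot servers."""

    def __init__(self, servers: List[BootServer], min_qos: float) -> None:
        self.servers = servers
        self.min_qos = min_qos
        self.assignments: Dict[str, str] = {}  # client id -> server name

    def direct(self, client_id: str) -> Optional[str]:
        # A client that was already directed keeps its server, so an
        # in-progress remote boot is never interrupted.
        if client_id in self.assignments:
            return self.assignments[client_id]
        # Withhold responses that would point at failed or low-quality servers.
        candidates = [s for s in self.servers
                      if s.alive and s.qos >= self.min_qos]
        if not candidates:
            return None  # no acceptable server at the moment
        best = max(candidates, key=lambda s: s.qos)
        self.assignments[client_id] = best.name
        return best.name
```

Because the eligibility filter is evaluated on every new request, a change in a server's `alive` flag or `qos` score affects only clients that have not yet been directed.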