1. Field of the Present Invention
The present invention generally relates to the field of data processing networks and more particularly to a network and method for reducing the time required to boot multiple client systems from a boot image server.
2. History of Related Art
Network “booting” is a well-known process by which an initial software image is loaded into a data processing system (referred to as a client system or simply a client) from a network server over the network. Historically, network booting was achieved using basic input/output system (BIOS) extensions located in special memory modules on network adapter cards. These special memory modules, informally referred to as “boot ROMs,” implemented protocols such as Remote Initial Program Load (RIPL). More recently, client systems are typically provided with firmware designed to load a pre-boot execution environment (PXE). PXE uses a combination of Dynamic Host Configuration Protocol (DHCP) and Trivial File Transfer Protocol (TFTP) to locate a server on the network, to assign an address to the client system requesting a boot image, and to provide the boot image to the client. In a typical PXE session, the client system initiates the protocol by broadcasting a DHCP “discover” request containing an extension that identifies the request as coming from a PXE-enabled client. Assuming that a DHCP server (or a Proxy DHCP server) implementing this extended protocol is available, that server sends the client a list of appropriate Boot Servers. The client then discovers a Boot Server of the type selected and receives the name of an executable file on the chosen Boot Server. The client then uses TFTP to download the executable file from the Boot Server in multiple 512-byte data packets carried over the User Datagram Protocol (UDP), with each packet acknowledged before the next is sent. Finally, the client initiates execution of the downloaded image.
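The block-by-block download described above can be illustrated with a minimal sketch. It assumes the standard TFTP packet layout (a 2-byte opcode followed by a 2-byte block number, with opcode 3 for DATA and opcode 4 for ACK) and performs no real network I/O. The function names `data_packets`, `ack_packet`, and `simulate_transfer` are hypothetical and chosen only for illustration.

```python
import struct

BLOCK_SIZE = 512  # data bytes per TFTP DATA packet, as stated in the text
OP_DATA = 3       # TFTP opcode for a DATA packet
OP_ACK = 4        # TFTP opcode for an ACK packet


def data_packets(image: bytes):
    """Yield TFTP-style DATA packets for `image`, numbered from block 1.

    A final packet carrying fewer than 512 data bytes (possibly zero)
    marks the end of the transfer. This sketch assumes the image fits
    in fewer than 65536 blocks, so the block number never wraps.
    """
    block = 1
    offset = 0
    while True:
        chunk = image[offset:offset + BLOCK_SIZE]
        yield struct.pack("!HH", OP_DATA, block) + chunk
        if len(chunk) < BLOCK_SIZE:
            return
        offset += BLOCK_SIZE
        block += 1


def ack_packet(block: int) -> bytes:
    """Build the acknowledgment for one DATA block."""
    return struct.pack("!HH", OP_ACK, block)


def simulate_transfer(image: bytes):
    """Simulate the lock-step exchange in memory (no sockets involved).

    Returns the reassembled image and the number of DATA/ACK round trips.
    """
    received = bytearray()
    round_trips = 0
    for packet in data_packets(image):
        opcode, block = struct.unpack("!HH", packet[:4])
        assert opcode == OP_DATA
        received += packet[4:]
        # The next block is produced only after this block is acknowledged.
        ack = ack_packet(block)
        assert struct.unpack("!HH", ack) == (OP_ACK, block)
        round_trips += 1
    return bytes(received), round_trips
```

Because every block costs one full round trip, the number of round trips grows linearly with the image size, which is why many concurrent downloads of a large image can take a long time.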
Depending upon the circumstances, a large number of client systems may issue simultaneous requests to the server for a boot image. In a server cluster configuration, for example, multiple network servers are installed in a single chassis or rack. All servers in the rack typically share some resources such as power. When the rack is powered on, all servers in the rack are powered on more or less simultaneously, resulting in multiple substantially simultaneous requests for boot images. (It should be noted that these network servers are the client systems for purposes of requesting and obtaining a boot image over the network.) In an IBM xSeries Server Blade implementation, as an example, as many as 84 server blades may be installed in a single rack. If none of these blades has a local, persistent mass storage device, also referred to as a direct access storage device (DASD), all 84 servers may request a boot image at approximately the same time when power is restored. Because the TFTP download described above proceeds in lock step, with each data packet acknowledged before the next packet is sent, those familiar with data processing networks and network protocols will appreciate that the described boot scenario could saturate the network bandwidth, resulting in a slow boot process. The problem could be further exacerbated in a power outage scenario in which an entire room of racks is powered on simultaneously following a blackout.
Server clusters and similar networked configurations are frequently used as data centers that are crucial for conducting business. It is therefore desirable to bring such networked systems up in the shortest possible time. One measurement of the efficiency of the image downloading process is referred to as the Average Download Time (ADT). As its name implies, the ADT represents the average amount of time that systems wait to download their images from the boot server. ADT is influenced largely by the size of the image, the available network bandwidth, and the number of clients accessing the server at once. The ADT is computed by summing the duration of each client's download session and dividing by the number of clients.
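The ADT computation just described reduces to an arithmetic mean. The following sketch shows it under the assumption that each client's session duration has already been measured in seconds. The function name `average_download_time` and the sample durations are hypothetical.

```python
def average_download_time(session_durations):
    """Return the ADT: the sum of per-client session durations
    divided by the number of clients."""
    if not session_durations:
        raise ValueError("at least one client session is required")
    return sum(session_durations) / len(session_durations)


# Hypothetical durations, in seconds, for three clients.
durations = [12.0, 15.0, 18.0]
print(average_download_time(durations))  # 15.0
```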
Boot servers must decide the order in which to handle packet requests from the client systems. Ideally, a boot server attempts to fully utilize the available network bandwidth. Historically, requests for data packets have been scheduled according to a first-in, first-out (FIFO) algorithm or in round-robin fashion. Unfortunately, it is well known in the industry that FIFO and round-robin selection schemes are not optimally efficient. Accordingly, it would be highly desirable to implement a network method and system that employs a strategy for booting client systems on the network in an optimal fashion.
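To make the effect of request ordering concrete, the following toy model compares the two historical policies. It is only a sketch under simplifying assumptions that are not taken from the text: all clients request their images at time zero, the server sends exactly one packet per time unit, and FIFO is modeled as serving each client to completion in arrival order. All function names are hypothetical.

```python
from collections import deque


def fifo_finish_times(packet_counts):
    """Serve each client to completion in arrival order.

    Returns the time at which each client receives its last packet.
    """
    finish = []
    clock = 0
    for count in packet_counts:
        clock += count
        finish.append(clock)
    return finish


def round_robin_finish_times(packet_counts):
    """Send one packet per client per turn, cycling through the clients
    that still have packets outstanding."""
    remaining = list(packet_counts)
    finish = [0] * len(remaining)
    queue = deque(i for i, n in enumerate(remaining) if n > 0)
    clock = 0
    while queue:
        client = queue.popleft()
        clock += 1
        remaining[client] -= 1
        if remaining[client] == 0:
            finish[client] = clock
        else:
            queue.append(client)
    return finish


def average(values):
    """Arithmetic mean, used here as the ADT of a set of finish times."""
    return sum(values) / len(values)


# Hypothetical workload: three clients needing 6, 2, and 4 packets.
workload = [6, 2, 4]
print(average(fifo_finish_times(workload)))
print(average(round_robin_finish_times(workload)))
```

In this toy model the last client finishes at the same time under either policy, yet the two policies produce different ADT values for the same workload, which illustrates why the choice of ordering matters.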