Computer systems have become enormous in size compared to systems of merely a decade ago. It is not uncommon to have hundreds, if not thousands, of work stations or users that communicate over a network. This inevitably implies a distributed architecture in which certain computing functions, such as mass data storage, printing, processing, telecommunications and other data processing services, are dispersed among physically distinct computing elements connected to the network at locations remote from the work stations. Thus, for example, instead of having a local device, which is directly connected to or forms part of a particular work station, to perform a function such as file storage, the system includes a remote storage device which may be shared by all work stations. This not only provides a cost-effective approach to system design, but also facilitates sharing of data files by a number of users.
Problems arise, however, as the number of work stations increases, in the mechanisms used to control communications among the work stations and the remote devices over the network. Typically, the number of potential users which the remote device can serve is severely limited. This limitation can be more fully appreciated by examining the way in which communications between a work station and a remote device take place. For this purpose, a remote mass storage device will be utilized as an example of the broad class of remote devices which can be attached to the distributed network.
Generally, on a networked system, a remote mass storage device provides shared data storage space or centralized file storage which is accessible to a user connected to the network. The user, sometimes referred to as a client, may be a terminal, a work station, an applications process or another computer. Typically, a file server is provided as an interface between the mass storage device and the network. A network may have many types of servers; each server carries out certain functions on behalf of a client relating to the type of shared resource, such as the printers, telecommunications links, and so forth, that may be connected to it. A file server, for example, controls one or more mass storage devices and coordinates client requests for access to data files stored on them. When a client requests a specific file located on a remote mass storage device, the server receives the request, identifies which mass storage device contains the file, passes the request to the device, and transfers the retrieved file to the client. For each client seeking access to files on the mass storage device, the server performs this same basic set of functions.
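The request-handling sequence just described can be illustrated with a minimal Python sketch. The names used here (`FileServer`, `locate`, `handle_request`) and the in-memory dictionaries standing in for mass storage devices are hypothetical and chosen only for illustration; they are not drawn from any particular system.

```python
class FileServer:
    """Hypothetical file server fronting one or more mass storage devices."""

    def __init__(self, devices):
        # devices maps a device name to a dict of {file name: file contents}.
        self.devices = devices

    def locate(self, file_name):
        # Identify which mass storage device contains the requested file.
        for device_name, files in self.devices.items():
            if file_name in files:
                return device_name
        raise FileNotFoundError(file_name)

    def handle_request(self, client_id, file_name):
        # Receive the request, pass it to the owning device, and
        # return the retrieved file to the requesting client.
        device_name = self.locate(file_name)
        data = self.devices[device_name][file_name]
        return {"client": client_id, "device": device_name, "data": data}


server = FileServer({
    "disk0": {"report.txt": b"quarterly figures"},
    "disk1": {"notes.txt": b"meeting notes"},
})
reply = server.handle_request("client-17", "notes.txt")
```

The same `handle_request` path is followed for every client, which mirrors the statement that the server performs one basic set of functions per request.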
In a distributed system, since the file server and its remote device are neither directly nor physically connected to a client, the client communicates with the server over the network by establishing a virtual circuit, effectively allowing it to communicate with the device. The virtual circuit, also known as a logical link, is essentially a channel over the network dedicated to handling only the communications between the client and the specific remote device. The virtual circuit effectively makes the remote device appear to the client as though it were a local device physically connected directly to the client and not shared with other clients. Indeed, as long as the virtual circuit exists, the client has access to the remote device as though it were a local device. Accordingly, the client can open and close files on the remote device as well as read, write, and seek on any file located on the remote device.
Typically, a client establishes a virtual circuit with a server by an interchange of messages that essentially divides into two phases. In the first phase, the client identifies and locates the server and the remote device: the client transmits a message to the server over the network that identifies a local name for the remote device, that is, the name by which the remote device is generally known in the system. In the second phase, the client and server create the virtual circuit. Since the execution time required to establish the virtual circuit can be large in comparison to the time required to transmit a single message over the virtual circuit, typically the virtual circuit is maintained on a permanent basis; that is, after the virtual circuit is established, it remains until the client explicitly terminates it by means of another interchange of messages.
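The two-phase exchange can be sketched as follows. This is a simplified illustration under stated assumptions: message passing is modeled as direct method calls, and the names `Server`, `locate`, `connect`, `disconnect` and `establish` are hypothetical rather than part of any real protocol.

```python
import itertools


class Server:
    """Hypothetical server that keeps a circuit until explicitly told to drop it."""

    def __init__(self, name, devices):
        self.name = name
        self.devices = set(devices)
        self.circuits = {}               # circuit id -> (client name, device name)
        self._ids = itertools.count(1)

    def locate(self, device_name):
        # Phase 1: report whether this server controls the named device.
        return device_name in self.devices

    def connect(self, client_name, device_name):
        # Phase 2: create the virtual circuit and remember it.
        circuit_id = next(self._ids)
        self.circuits[circuit_id] = (client_name, device_name)
        return circuit_id

    def disconnect(self, circuit_id):
        # Explicit termination by a further interchange of messages.
        del self.circuits[circuit_id]


def establish(client_name, device_name, servers):
    """Run both phases and return the chosen server and the new circuit id."""
    for server in servers:
        if server.locate(device_name):                                 # phase 1
            return server, server.connect(client_name, device_name)   # phase 2
    raise LookupError(device_name)


servers = [Server("fs1", ["disk0"]), Server("fs2", ["disk1"])]
server, circuit_id = establish("client-17", "disk1", servers)
```

Because setup is the expensive part, the circuit entry stays in `server.circuits` across any number of later messages and disappears only when `disconnect` is called.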
While the virtual circuit is being established, both the client and the server create data structures on their respective sides of the connection. Each data structure contains the identity and location of the device on the other end of the connection as well as information relating to the circuit and rules for communicating over the circuit. The data structures are necessary to maintain the virtual circuit and remain stored in client or server memory for as long as the virtual circuit exists. For each virtual circuit over which a client and a file server communicate there is an associated set of data structures, one on each end of the connection. The process of maintaining the data structure on the server side of the connection is commonly known as "maintaining state." A server which maintains state can, and often does, provide protection against interference.
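As a rough illustration of the paired data structures, the sketch below models one record per end of the connection. The field names (`peer_name`, `peer_address`, `circuit_id`, `rules`) and the example rule `max_message_bytes` are assumptions made for this example; the text specifies only the kinds of information held, not a layout.

```python
from dataclasses import dataclass, field


@dataclass
class CircuitState:
    peer_name: str       # identity of the party at the other end
    peer_address: str    # location of that party on the network
    circuit_id: int      # information relating to the circuit itself
    rules: dict = field(default_factory=dict)  # rules for communicating


def make_circuit_pair(circuit_id, client, server, rules):
    """Build the two records, one per end; client and server are (name, address)."""
    client_side = CircuitState(server[0], server[1], circuit_id, dict(rules))
    server_side = CircuitState(client[0], client[1], circuit_id, dict(rules))
    return client_side, server_side


client_side, server_side = make_circuit_pair(
    circuit_id=1,
    client=("client-17", "node-42"),
    server=("fs1", "node-3"),
    rules={"max_message_bytes": 512},
)
```

Each record describes the opposite end, so the server-side record is the per-client memory the server must hold for as long as the circuit exists.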
After establishing a virtual circuit, the client gains access to a file on the remote device by opening a session over the circuit. As part of opening a session, both the client and the server create session data structures on their respective sides of the virtual circuit. Typically the session structure on the client side identifies the file, the file operation, the name of the remote device and the virtual circuit, as well as the identity and other information about the client process which has requested the file access. On the server side of the circuit, the session structure typically identifies the client, the file, the status of the file and the number of file operation requests outstanding. For each file being accessed, there is an associated set of session structures, one on each side of the circuit. In addition, both the client and the server add information to their respective virtual circuit data structures identifying the session data structures associated with the sessions that are using the virtual circuit.
As each session is concluded, both sides destroy the associated session structures. When all of the sessions over a particular virtual circuit are closed, only the original data structures for the virtual circuit remain. The virtual circuit effectively remains intact, awaiting the opening of another session over it. Thus, as noted above, the server maintains state to assure the continued existence of the virtual circuit. An advantage of maintaining state is that the client can quickly and easily access the server whenever it needs to open another file. Since the virtual circuit data structures remain even after the client has ended all sessions over the circuit, the remote device continues to function as though it were a local device and is immediately available to handle the next client session. The time-consuming step of having to re-establish the virtual circuit is eliminated. From the client's viewpoint, this improves efficiency and reduces processing delays associated with having to re-establish the connection with the server.
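The session lifecycle on the server side can be sketched as below. The class and field names (`ServerSession`, `ServerCircuit`, `open_session`, `close_session`) are hypothetical; only the kinds of information recorded follow the description above.

```python
from dataclasses import dataclass, field


@dataclass
class ServerSession:
    client: str                     # the client using the file
    file_name: str                  # the file being accessed
    file_status: str = "open"       # status of the file
    outstanding_requests: int = 0   # file operation requests not yet completed


@dataclass
class ServerCircuit:
    client: str
    sessions: dict = field(default_factory=dict)  # session id -> ServerSession
    next_session_id: int = 1

    def open_session(self, file_name):
        # Create a session structure and link it to the circuit structure.
        session_id = self.next_session_id
        self.next_session_id += 1
        self.sessions[session_id] = ServerSession(self.client, file_name)
        return session_id

    def close_session(self, session_id):
        # Destroy the session structure; the circuit structure is untouched.
        del self.sessions[session_id]


circuit = ServerCircuit("client-17")
first = circuit.open_session("report.txt")
second = circuit.open_session("notes.txt")
circuit.close_session(first)
circuit.close_session(second)
# circuit.sessions is now empty, yet the circuit object still exists and
# can accept another open_session call without any new setup.
```

Keeping `ServerCircuit` alive after its last session closes is what saves the client the setup cost, and it is also the memory the server continues to hold.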
For this advantage, however, the distributed system pays a significant price. Maintaining state requires a commitment of memory on the server to preserve the relevant data structures. In a distributed system which has thousands of clients, a server may not have enough memory to preserve the virtual circuit data structures for all potential users of the mass storage devices under the server's control. Thus, some clients may not be able to access files through a server until other clients on the system terminate virtual circuits facilitating communications with the server. In short, some clients may be effectively blocked from accessing the mass storage devices.
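The blocking effect can be made concrete with a small sketch. The memory size, the per-circuit cost and the names `StatefulServer`, `connect` and `disconnect` are illustrative assumptions only; real figures depend on the system.

```python
class StatefulServer:
    """Hypothetical server whose memory bounds the number of live circuits."""

    def __init__(self, memory_bytes, bytes_per_circuit):
        # Number of circuit data structures that fit in server memory.
        self.capacity = memory_bytes // bytes_per_circuit
        self.circuits = set()

    def connect(self, client):
        if client in self.circuits:
            return True            # circuit already exists
        if len(self.circuits) >= self.capacity:
            return False           # no memory left: the client is blocked
        self.circuits.add(client)
        return True

    def disconnect(self, client):
        # Terminating a circuit frees its memory for another client.
        self.circuits.discard(client)


server = StatefulServer(memory_bytes=4096, bytes_per_circuit=1024)
results = [server.connect(f"client-{i}") for i in range(6)]
# results == [True, True, True, True, False, False]

server.disconnect("client-0")
retry = server.connect("client-5")   # True once a circuit has been released
```

With room for four circuits, the fifth and sixth clients are refused even if the first four are idle, and they gain access only after an existing circuit is terminated.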