Computer networks allow large-scale integration of workstations, or even "personal," "mini," or mainframe computers, each containing a central processing unit, keyboard, memory, and the like sufficient to operate as a stand-alone processor, thereby overcoming the disadvantages of these types of computers as compared to centralized processors coupled with remote but "dumb" terminals. Networks allow resource sharing, which in turn allows users to work together more efficiently. The operator of computer A can use the databases located in the memory of computer B; the operator of computer B can use the printer, or even the central processing unit, of computer A. Furthermore, networks provide improved configuration control over stand-alone systems, each of which is subject to reconfiguration by its individual user.
The network itself consists of both hardware and software. The hardware includes the actual communication links such as electrical connections, cables, modem interfaces, buses, and the like between the individual computers; the software includes the programs required to communicate data via the hardware.
The network software is required because memory allocated for information in the address space of one computer in the network may not always match the address space of a second computer. Accordingly, the information to be sent across the network must be "packaged" as a data structure which will have meaning to the receiving computer.
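By way of illustration only, the following sketch shows one simple form of such packaging: a 32-bit number is written into a buffer most significant byte first, so that the receiving computer can rebuild the same value whatever its own memory layout. The names `pack_u32` and `unpack_u32` are hypothetical and do not come from any particular network library.

```c
#include <stdint.h>

/* Hypothetical helper: store v in buf as four bytes, most significant
 * byte first, independent of the sending computer's byte order. */
void pack_u32(unsigned char buf[4], uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Hypothetical helper: rebuild the value on the receiving computer. */
uint32_t unpack_u32(const unsigned char buf[4])
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

The sender would call `pack_u32` before transmission and the receiver would call `unpack_u32` on the bytes it receives.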
The network software is also required because the communication channels between the computers on the network spread the information out through time, while the address space in computer memory spreads the information through space. Accordingly, various error checking schemes must be incorporated to assure that whatever information is sent over the network's communication channels is correctly received by the receiving computer.
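The text above does not name a particular error checking scheme. As one hedged example, the sketch below uses a Fletcher-16 checksum, chosen here only for brevity: the sender transmits the checksum with the data, and the receiver recomputes it and compares. The function `message_intact` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Fletcher-16 checksum over len bytes of data. */
uint16_t fletcher16(const unsigned char *data, size_t len)
{
    uint16_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < len; i++) {
        sum1 = (uint16_t)((sum1 + data[i]) % 255);
        sum2 = (uint16_t)((sum2 + sum1) % 255);
    }
    return (uint16_t)((sum2 << 8) | sum1);
}

/* Hypothetical receiver-side check: returns 1 if the recomputed
 * checksum matches the one that accompanied the data, else 0. */
int message_intact(const unsigned char *data, size_t len,
                   uint16_t sent_checksum)
{
    return fletcher16(data, len) == sent_checksum;
}
```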
The following lexicon of network terms has been established by those skilled in the art. Data elements are the internal representation of information stored by computers. A "data element" is a computer representation of a number, a character, or a "pointer." Each data element occupies between one and eight bytes of computer memory. A "pointer" is a special type of data element: a computer number that contains the beginning memory address of further data elements or data structures. A "data structure" is a collection of data elements or pointers that describes some higher level data object. Data structures that contain pointers have a hierarchical form that may contain multiple levels of sub-data structures or elements. An "argument" is a data element or data structure passed to a procedure as input or output. A "procedure," also called a "subroutine" or "function," is a modular piece of a larger computer program. An "RPC call" is a function which makes a procedure call in a remote process as though the call were made locally. "RPC Programming" is further defined below.
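The terms of this lexicon can be related to a high-level language as in the following sketch. The structure `employee` and the procedure `chain_length` are hypothetical examples, and the byte sizes noted in the comments are typical rather than guaranteed.

```c
#include <stddef.h>

/* A data structure: a collection of data elements and pointers. */
struct employee {
    int   id;                  /* data element: a number (typically 4 bytes) */
    char  grade;               /* data element: a character (1 byte)         */
    char *name;                /* pointer: beginning address of a string     */
    struct employee *manager;  /* pointer to a further data structure,
                                  giving the hierarchical form               */
};

/* A procedure (subroutine, function); e is its argument. It follows
 * the manager pointers and counts the structures in the chain. */
int chain_length(const struct employee *e)
{
    int n = 0;
    while (e != NULL) {
        n++;
        e = e->manager;
    }
    return n;
}
```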
"Distributed systems" is terminology generally referring to the network software which enables access and transfer of resources on the network automatically. Hence, for example, although one resource truly exists on the memory of computer A, a distributed system allows the operator of computer B to operate computer B as if the resource existed on her computer. Distributed systems are preferred in network applications because the operators of the various personal computers need not be skilled in the networking art. Distributed systems, however, which optimally remain "transparent" to the user and yet allow many different kinds of equipment to participate within the network, present difficult design challenges.
It is known to use subsystems for writing the distributed system software. A subsystem is a set of routines located in a library that provides specific functions on which to build a network application. A platform may consist of "remote procedure call" (RPC) routines and "external data representation" (XDR) routines located in a library.
RPC programming provides an abstraction of making a procedure or subroutine call across a network into a server process. Collections or subsystems of procedure calls are partitioned into a server process in order that they can exist as a network service. This technology is called "client-server computing". The advantage of RPC programming is that if implemented correctly, the network itself is completely invisible to the individual application: if one were to examine the source code of a given application, the source code required by the network would be invisible.
This abstraction is created by conventional RPC compilers to the point where the application can be run, either linked together as a monolith or partitioned into pieces and run distributed in a client-server model. An RPC compiler generates the source code to stub the existing application procedure, package and pass the appropriate procedure argument into a high-level language data structure to be sent to a server process, generate XDR code to encode and decode data to and from the network, unpack arguments and make the real procedure call, and then package and return arguments into the high-level language data structure to be sent back to a client process.
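The division of labor described above can be sketched in a single process, under the assumption that a direct function call stands in for the network round trip. All names here (`add`, `add_remote`, `server_dispatch`) are hypothetical, and the code that a real RPC compiler generates is considerably more elaborate.

```c
#include <stdint.h>

/* The real application procedure, which would live in the server process. */
static uint32_t add(uint32_t a, uint32_t b) { return a + b; }

/* Hypothetical encode/decode helpers: 4 bytes, most significant first. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Server stub: unpack the arguments, make the real procedure call,
 * and package the return value for the client. */
static void server_dispatch(const unsigned char request[8],
                            unsigned char reply[4])
{
    uint32_t a = get_u32(request);
    uint32_t b = get_u32(request + 4);
    put_u32(reply, add(a, b));
}

/* Client stub: takes the same arguments as add(), so the calling
 * application need not know that the work is done elsewhere. */
uint32_t add_remote(uint32_t a, uint32_t b)
{
    unsigned char request[8], reply[4];
    put_u32(request, a);
    put_u32(request + 4, b);
    server_dispatch(request, reply);  /* assumption: stands in for the network */
    return get_u32(reply);
}
```

An application that calls `add_remote(2, 3)` receives 5 exactly as if it had called `add` directly, which is the abstraction the stubs are meant to preserve.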
The XDR routine recursively descends through a data structure while encoding data to, and decoding data from, the network. XDR is also used to create the data structures for which memory is allocated during the decode phase of data transfer.
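A recursive descent of this kind can be sketched for a simple linked chain. The wire layout assumed here, a 4-byte flag before each node (1 for "a node follows," 0 for "end of chain"), is an illustrative choice, and the names `node`, `put_u32` and `encode_list` are hypothetical rather than taken from an XDR library.

```c
#include <stddef.h>
#include <stdint.h>

struct node {
    uint32_t     value;
    struct node *next;
};

/* Hypothetical helper: write v at buf[off..off+3], most significant
 * byte first, and return the offset just past it. */
static size_t put_u32(unsigned char *buf, size_t off, uint32_t v)
{
    buf[off]     = (unsigned char)(v >> 24);
    buf[off + 1] = (unsigned char)(v >> 16);
    buf[off + 2] = (unsigned char)(v >> 8);
    buf[off + 3] = (unsigned char)v;
    return off + 4;
}

/* Recursively descend the chain, writing real data in place of each
 * pointer. The caller must supply at least 8 bytes per node plus 4.
 * Returns the offset just past the encoded chain. */
size_t encode_list(const struct node *n, unsigned char *buf, size_t off)
{
    if (n == NULL)
        return put_u32(buf, off, 0);        /* flag 0: end of chain   */
    off = put_u32(buf, off, 1);             /* flag 1: a node follows */
    off = put_u32(buf, off, n->value);
    return encode_list(n->next, buf, off);  /* descend one level      */
}
```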
During the decode phase, however, XDR allocates memory whenever it decodes a pointer from the network. XDR decodes data from the network once on the server side when input arguments are passed to the remote procedure, and once on the client side when the return arguments are passed back to the client application.
One serious disadvantage of this type of network programming, however, is "memory leak." Pointers reside in an address space and contain only information denoting the address of other information. Since a pointer passed across the network would be meaningless to the receiving computer, the client application passes a pointer to a subroutine, and the subroutine chases pointers through a chain of data elements linked by pointers so that real data is passed. In order to accomplish this task, XDR must allocate memory when it decodes a pointer.
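The allocation that accompanies each decoded pointer can be sketched as follows. The sketch assumes a well-formed buffer in which each node is preceded by a 4-byte flag (1 for "a node follows," 0 for "end of chain"); the names `decode_list` and `nodes_allocated` are hypothetical, and the counter exists only so that the allocations can be observed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct node {
    uint32_t     value;
    struct node *next;
};

/* Hypothetical helper: read 4 bytes, most significant first. */
static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Illustration only: number of nodes allocated by the decoder so far. */
size_t nodes_allocated = 0;

/* Rebuild the chain in the receiver's own address space. Memory is
 * allocated once for every pointer decoded, and ownership of all of
 * it passes to the caller. No bounds checking is done (assumption:
 * the buffer is well formed). */
struct node *decode_list(const unsigned char *buf, size_t *off)
{
    uint32_t present = get_u32(buf + *off);
    *off += 4;
    if (!present)
        return NULL;                      /* end of chain */

    struct node *n = malloc(sizeof *n);   /* allocation per pointer */
    if (n == NULL)
        return NULL;                      /* out of memory: chain is cut short */
    nodes_allocated++;

    n->value = get_u32(buf + *off);
    *off += 4;
    n->next = decode_list(buf, off);      /* may allocate again */
    return n;
}
```

Nothing in the decoder records who is responsible for releasing the nodes, which is where the leak described above originates.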
When more than one level of pointer is used, for example, a pointer to a pointer or a pointer to a structure containing other pointers and strings, the possibility exists that the memory space taken up by these structures will never be freed. Therefore, those skilled in the networking art must include routines within their network application both to copy all the relevant data elements back into the client application and to free the memory spaces allocated for those data elements. If this memory is not freed, the available memory space will decrease rapidly, affecting memory allocation and processing time. In fact, the common complaint that the computer is "acting up" after running many programs over long periods of time is often attributable to memory leakage.
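The difference between an incomplete cleanup and the copy-and-free routine described above can be sketched with a two-level structure. Everything named here is hypothetical: `make_reply` stands in for a decoder, and `tracked_alloc`, `tracked_free` and `blocks_outstanding` are bookkeeping added only so that the leak can be observed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping: count blocks allocated but not yet freed. */
static long live_blocks = 0;

static void *tracked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        live_blocks++;
    return p;
}

static void tracked_free(void *p)
{
    if (p != NULL) {
        live_blocks--;
        free(p);
    }
}

long blocks_outstanding(void) { return live_blocks; }

struct reply {
    char *name;   /* second level: separately allocated string */
};

/* Stands in for a decoder that allocates at both levels. */
struct reply *make_reply(const char *name)
{
    struct reply *r = tracked_alloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->name = tracked_alloc(strlen(name) + 1);
    if (r->name == NULL) {
        tracked_free(r);
        return NULL;
    }
    strcpy(r->name, name);
    return r;
}

/* Incomplete cleanup: frees the top level only, so the string leaks. */
void release_shallow(struct reply *r) { tracked_free(r); }

/* Complete cleanup: copy the data back to the caller's buffer, then
 * free both levels. Longer names are truncated to fit out_size. */
void copy_and_release(struct reply *r, char *out, size_t out_size)
{
    if (r == NULL || out_size == 0)
        return;
    strncpy(out, r->name, out_size - 1);
    out[out_size - 1] = '\0';
    tracked_free(r->name);
    tracked_free(r);
}
```

After `release_shallow`, `blocks_outstanding()` remains one higher than before the reply was made; after `copy_and_release`, it returns to its earlier value.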
Manual modification of each network application, however, destroys the abstraction that the network is invisible. Moreover, not all programmers and computer users are sufficiently skilled to effectuate such modifications. Unfortunately, since the number of possible permutations of pointer-linked data chains is virtually unlimited, no unified scheme has been developed to effectively copy all the required data and also free memory space no longer needed.
On the other hand, automatic freeing of memory allocated for data elements containing pointers may be accomplished by "chasing" the pointers and then recursively freeing each data element. This scheme, however, is not always desirable since data at the end of a pointer chain may need to be preserved in some cases. Automatically freeing this memory may result in a loss of valuable, needed information.
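The trade-off can be made concrete with a short sketch. `free_chain` chases the pointers and frees every element; `free_chain_until` is a hypothetical variant, included only to show what preserving part of a chain would require of the caller, namely knowing in advance which element must survive.

```c
#include <stdlib.h>

struct node {
    int          value;
    struct node *next;
};

/* Chase the pointers to the end of the chain, then free each element
 * on the way back. Returns the number of elements freed. Any data the
 * application still needed from the chain is lost. */
size_t free_chain(struct node *n)
{
    if (n == NULL)
        return 0;
    size_t freed = free_chain(n->next);
    free(n);
    return freed + 1;
}

/* Hypothetical variant: stop at the element `keep`, leaving it and
 * everything after it allocated. The caller becomes responsible for
 * freeing the preserved portion later. */
size_t free_chain_until(struct node *n, const struct node *keep)
{
    if (n == NULL || n == keep)
        return 0;
    size_t freed = free_chain_until(n->next, keep);
    free(n);
    return freed + 1;
}
```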