Computer systems typically have operating systems that support multitasking. A multitasking operating system allows multiple tasks (processes) to be executing concurrently. For example, a database server process may execute concurrently with many client processes, which request services of the database server process. A client process (client) may request services by issuing a remote procedure call (RPC). A remote procedure call allows a server process (server) to invoke a server procedure on behalf of the client. To issue a remote procedure call, the client packages the procedure name and the actual in-parameters for the procedure into an interprocess communications message and sends the message to the server. The server receives the message, unpackages the procedure name and any actual in-parameters, and invokes the named procedure, passing it the unpackaged in-parameters. When the procedure completes, the server packages any out-parameters into a message and sends the message to the client. The client receives the message and unpackages the out-parameters. The process of packaging parameters is known as marshalling, and the process of unpackaging parameters is known as unmarshalling.
Parameters may be marshalled by storing a copy of the value of the actual parameter in a message. For certain types of parameters, the marshalling may involve more than simply storing a copy of the value. For example, a floating point value may need to be converted from the format of one computer system to the format of another computer system when the processes reside on different computer systems.
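The marshalling and unmarshalling steps described above can be sketched as follows. This is a minimal illustration only: the flat byte-buffer message, the big-endian integer encoding, the length-prefixed string, and the names Message, MarshalInt32, MarshalString, MakeRequest, HandleRequest, and the server procedure Add are all assumptions made for the example and do not come from any particular RPC system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical message format: a flat byte buffer.
using Message = std::vector<std::uint8_t>;

// Marshal a 32-bit integer in a fixed (big-endian) byte order so that the
// encoding does not depend on the native format of the sending system.
void MarshalInt32(Message& msg, std::int32_t value) {
    auto u = static_cast<std::uint32_t>(value);
    for (int shift = 24; shift >= 0; shift -= 8)
        msg.push_back(static_cast<std::uint8_t>((u >> shift) & 0xFF));
}

// Unmarshal a 32-bit integer starting at pos and advance pos past it.
std::int32_t UnmarshalInt32(const Message& msg, std::size_t& pos) {
    assert(pos + 4 <= msg.size());
    std::uint32_t u = 0;
    for (int i = 0; i < 4; ++i) u = (u << 8) | msg[pos++];
    return static_cast<std::int32_t>(u);
}

// Marshal a string as a length followed by its bytes.
void MarshalString(Message& msg, const std::string& s) {
    MarshalInt32(msg, static_cast<std::int32_t>(s.size()));
    msg.insert(msg.end(), s.begin(), s.end());
}

std::string UnmarshalString(const Message& msg, std::size_t& pos) {
    std::int32_t len = UnmarshalInt32(msg, pos);
    assert(len >= 0 && pos + static_cast<std::size_t>(len) <= msg.size());
    auto first = msg.begin() + static_cast<std::ptrdiff_t>(pos);
    std::string s(first, first + len);
    pos += static_cast<std::size_t>(len);
    return s;
}

// Hypothetical server procedure.
std::int32_t Add(std::int32_t a, std::int32_t b) { return a + b; }

// Client side: package the procedure name and the actual in-parameters.
Message MakeRequest(const std::string& procedure, std::int32_t a, std::int32_t b) {
    Message request;
    MarshalString(request, procedure);
    MarshalInt32(request, a);
    MarshalInt32(request, b);
    return request;
}

// Server side: unpackage the name and in-parameters, invoke the named
// procedure, and package the out-parameter into a reply message.
Message HandleRequest(const Message& request) {
    std::size_t pos = 0;
    std::string procedure = UnmarshalString(request, pos);
    std::int32_t a = UnmarshalInt32(request, pos);
    std::int32_t b = UnmarshalInt32(request, pos);
    Message reply;
    if (procedure == "Add") MarshalInt32(reply, Add(a, b));
    return reply;  // left empty when the procedure name is unknown
}
```

In this sketch the client would call MakeRequest, send the resulting bytes through whatever interprocess communications mechanism is available, and apply UnmarshalInt32 to the reply. Note that each parameter is passed by copying its value, which is the approach whose disadvantages are discussed next.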
The copying of the value of an actual parameter has a couple of disadvantages. First, when a copy is passed, changes to the original value are not reflected in the copy. For example, if a parameter representing a time of day value is passed from a client to a server by copying, then the copy that the server receives is not updated as the client updates its time of day value. Second, with certain types of parameters, it may be impractical to make a copy of the value. For example, the overhead of copying a large array may be unacceptable. As discussed in the following, it may also be impractical to make a copy of an object when marshalling the object because the object may be large and include various functions.
The use of object-oriented programming techniques has many advantages over prior techniques. Thus, the use of object-oriented techniques is increasing. However, the inability to efficiently marshal and unmarshal objects when invoking remote procedures limits these advantages. A description of how objects are typically marshalled and unmarshalled will help to explain these limits.
FIG. 1 is a block diagram illustrating typical data structures used to represent an object. An object is composed of instance data (data members) and member functions, which implement the behavior of the object. The data structures used to represent an object comprise instance data structure 101, virtual function table 102, and the function members 103, 104, 105. The instance data structure 101 contains a pointer to the virtual function table 102 and contains data members. The virtual function table 102 contains an entry for each virtual function member defined for the object. Each entry contains a reference to the code that implements the corresponding function member. The layout of this sample object conforms to the model defined in U.S. Pat. No. 5,297,284, entitled "A Method for Implementing Virtual Functions and Virtual Bases in a Compiler for an Object Oriented Programming Language," which is hereby incorporated by reference. In the following, an object will be described as an instance of a class as defined by the C++ programming language. One skilled in the art would appreciate that objects can be defined using other programming languages.
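The relationship between the instance data structure, the virtual function table, and the function members can be approximated by hand, as in the following sketch. The names VTable, Instance, CounterGet, CounterAdd, MakeCounter, and CallGet are hypothetical, and the layout a particular compiler actually generates may differ from this illustration.

```cpp
#include <cassert>

struct Instance;  // forward declaration

// Virtual function table: one entry per virtual function member, each
// entry referring to the code that implements that function member.
struct VTable {
    int (*get)(const Instance*);
    void (*add)(Instance*, int);
};

// Instance data structure: a pointer to the virtual function table,
// followed by the data members.
struct Instance {
    const VTable* vtbl;
    int value;
};

// Function members; the object is passed explicitly as the first argument.
int CounterGet(const Instance* self) { return self->value; }
void CounterAdd(Instance* self, int n) { self->value += n; }

// A single table is shared by every instance of the class.
const VTable kCounterVTable = {&CounterGet, &CounterAdd};

Instance MakeCounter(int start) { return Instance{&kCounterVTable, start}; }

// Invoking a virtual function member is an indirect call through the table.
int CallGet(const Instance* obj) {
    assert(obj != nullptr && obj->vtbl != nullptr);
    return obj->vtbl->get(obj);
}
```

Because the table entries are addresses of code in the process that created the object, copying only the instance data to another process would leave the pointer to the table meaningless there.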
If an object in a server process is to be copied and passed to a client process during a remote procedure call, then not only the data members must be copied, but also the function members must be accessible to the client process. To access the copied object, the client process would need to load each function member into its own process space. This loading can be time consuming. Moreover, the copying of an object may be intractable because a function member loaded in the server process space may need to access data or other functions in the server process space.
An advantage of using object-oriented techniques is that these techniques can be used to facilitate the creation of compound documents. A compound document is a document that contains objects generated by various computer programs. (Typically, only the data members of the object and the class type are stored in a compound document.) For example, a word processing document that contains a spreadsheet object generated by a spreadsheet program is a compound document. A word processing program allows a user to embed a spreadsheet object (e.g., a cell) within a word processing document. To allow this embedding, the word processing program would be compiled using the class definition of the object to be embedded to access function members of the embedded object. Thus, the word processing program would need to be compiled using the class definition of each class of objects that can be embedded in a word processing document. To embed an object of a new class into a word processing document, the word processing program would need to be recompiled with the new class definition. Thus, only objects of classes selected by the developer of the word processing program can be embedded. Furthermore, new classes can only be supported with a new release of the word processing program.
To allow objects of an arbitrary class to be embedded into compound documents, interfaces (abstract classes) are defined through which an object can be accessed without the need for the word processing program to have access to the class definitions at compile time. An abstract class is a class in which at least one virtual function member has no implementation (i.e., is pure). An interface is an abstract class with no data members and whose virtual functions are all pure.
The following class definition is an example definition of an interface. In this example, for simplicity of explanation, rather than allowing any class of object to be embedded in its documents, a word processing program allows spreadsheet objects to be embedded. Any spreadsheet object that provides this interface can be embedded, regardless of how the object is implemented. Moreover, any spreadsheet object, whether implemented before or after the word processing program is compiled, can be embedded.
______________________________________
class ISpreadSheet {
    virtual void File( ) = 0;
    virtual void Edit( ) = 0;
    virtual void Formula( ) = 0;
    virtual void Format( ) = 0;
    virtual void GetCell (string RC, cell *pCell) = 0;
    virtual void Data( ) = 0;
    . . .
}
______________________________________
The developer of a spreadsheet program would need to provide an implementation of the interface to allow the spreadsheet objects to be embedded in a word processing document. When the word processing program embeds a spreadsheet object, the program needs access to the code that implements the interface for the spreadsheet object. To access the code, each implementation is given a unique class identifier. For example, a spreadsheet object developed by Microsoft Corporation may have a class identifier of "MSSpreadsheet," while a spreadsheet object developed by another corporation may have a class identifier of "LTSSpreadsheet." A persistent registry in each computer system is maintained that maps each class identifier to the code that implements the class. Typically, when a spreadsheet program is installed on a computer system, the persistent registry is updated to reflect the availability of that class of spreadsheet objects. So long as a spreadsheet developer implements each function member defined by the interface and the persistent registry is maintained, the word processing program can embed the developer's spreadsheet objects into a word processing document.
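The mapping from class identifier to implementing code can be pictured with the following sketch. It is an in-memory stand-in only: a real registry would be persistent and would typically locate code to be loaded rather than hold a callable. The names Registry, Creator, Register, Create, MSSpreadsheetImpl, InstallMSSpreadsheet, and the Name member are assumptions introduced for the example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Trimmed-down interface; Name() is a hypothetical member for illustration.
class ISpreadSheet {
public:
    virtual ~ISpreadSheet() = default;
    virtual std::string Name() const = 0;
};

// One developer's implementation of the interface.
class MSSpreadsheetImpl : public ISpreadSheet {
public:
    std::string Name() const override { return "MSSpreadsheet"; }
};

// Stand-in for "the code that implements the class".
using Creator = std::function<std::unique_ptr<ISpreadSheet>()>;

class Registry {
public:
    // Performed when a spreadsheet program is installed.
    void Register(const std::string& classId, Creator creator) {
        assert(creator != nullptr);
        table_[classId] = std::move(creator);
    }

    // Performed when an object of the class is to be embedded.
    // Returns a null pointer when the class identifier is not registered.
    std::unique_ptr<ISpreadSheet> Create(const std::string& classId) const {
        auto it = table_.find(classId);
        if (it == table_.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Creator> table_;
};

// Hypothetical installation step for one class of spreadsheet objects.
void InstallMSSpreadsheet(Registry& registry) {
    registry.Register("MSSpreadsheet", [] {
        return std::unique_ptr<ISpreadSheet>(new MSSpreadsheetImpl);
    });
}
```

The word processing program in this sketch depends only on ISpreadSheet and on the class identifier string, so a class registered after the program was compiled can still be created.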
Various spreadsheet developers may wish, however, to implement only certain function members. For example, a spreadsheet developer may not want to implement database support, but may want to support all other function members. To allow a spreadsheet developer to support only some of the function members, while still allowing the objects to be embedded, multiple interfaces for spreadsheet objects are defined. For example, the interfaces IDatabase and IBasic may be defined for a spreadsheet object as follows.
______________________________________
class IDatabase {
    virtual void Data( ) = 0;
}

class IBasic {
    virtual void File( ) = 0;
    virtual void Edit( ) = 0;
    virtual void Formula( ) = 0;
    virtual void Format( ) = 0;
    virtual void GetCell (string RC, cell *pCell) = 0;
    . . .
}
______________________________________
Each spreadsheet developer would implement the IBasic interface and, optionally, the IDatabase interface.
At run time, the word processing program would need to determine whether a spreadsheet object to be embedded supports the IDatabase interface. To make this determination, another interface is defined (that every spreadsheet object implements) with a function member that indicates which interfaces are implemented for the object. This interface is known as IUnknown and is defined by the following.
__________________________________________________________________________
class IUnknown {
    virtual boolean QueryInterface (iidInterface, pInterface) = 0;
    virtual boolean AddRef( ) = 0;
    virtual boolean Release( ) = 0;
}
__________________________________________________________________________
The IUnknown interface defines the function member (method) QueryInterface. The method QueryInterface is passed an interface identifier (e.g., "IDatabase") and returns, through the parameter pInterface, a pointer to the implementation of the identified interface for the object for which the method is invoked. If the object does not support the interface, then the method returns false. The methods AddRef and Release provide reference counting of the interface.
The IDatabase interface and IBasic interface inherit the IUnknown interface. Inheritance is a well-known object-oriented technique by which a class definition can incorporate the data and function members of previously-defined classes. The following definitions illustrate the use of the IUnknown interface.
______________________________________
class IDatabase : IUnknown {
    virtual void Data( ) = 0;
}

class IBasic : IUnknown {
    virtual void File( ) = 0;
    virtual void Edit( ) = 0;
    virtual void Formula( ) = 0;
    virtual void Format( ) = 0;
    virtual void GetCell (string RC, cell *pCell) = 0;
    . . .
}
______________________________________
FIG. 2 is a block diagram illustrating a sample data structure of a spreadsheet object. The spreadsheet object comprises interface data structure 201, IBasic interface data structure 202, IDatabase interface data structure 205, and methods 208 through 212. The interface data structure 201 contains a pointer to each interface implemented and may contain data members of the implementation. The IBasic interface data structure 202 contains instance data structure 203 and virtual function table 204. Each entry in the virtual function table 204 points to a method defined for the IBasic interface. The IDatabase interface data structure 205 contains instance data structure 206 and virtual function table 207. Each entry in the virtual function table 207 contains a pointer to a method defined in the IDatabase interface. Since the IBasic and IDatabase interfaces inherit the IUnknown interface, each virtual function table 204 and 207 contains a pointer to the method QueryInterface 208. In the following, an object data structure is represented by the shape 213 labeled with an interface through which the object may be accessed.
The following pseudocode illustrates how a word processing program determines whether a spreadsheet object supports the IDatabase interface.

    if (pIBasic->QueryInterface("IDatabase", &pIDatabase))
        /* IDatabase supported */
    else
        /* IDatabase not supported */
The pointer pIBasic is a pointer to the IBasic interface of the object. If the object supports the IDatabase interface, the method QueryInterface sets the pointer pIDatabase to point to the IDatabase data structure and returns true as its value.
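A compilable approximation of this query is sketched below. Several details are assumptions made for the example rather than taken from the definitions above: interface identifiers are plain strings, the interface pointer is returned through a void** out-parameter, AddRef and Release return the resulting reference count instead of a boolean, each object starts with a count of one and deletes itself when the count reaches zero, and the interfaces are trimmed to one function member each. The class names Spreadsheet and BasicSpreadsheet are hypothetical.

```cpp
#include <cassert>
#include <string>

class IUnknown {
public:
    virtual ~IUnknown() = default;
    virtual bool QueryInterface(const std::string& iid, void** ppInterface) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

class IBasic : public IUnknown {
public:
    virtual void File() = 0;
};

class IDatabase : public IUnknown {
public:
    virtual void Data() = 0;
};

// A spreadsheet object that implements both interfaces.
class Spreadsheet : public IBasic, public IDatabase {
public:
    bool QueryInterface(const std::string& iid, void** ppInterface) override {
        if (iid == "IBasic") {
            *ppInterface = static_cast<IBasic*>(this);
        } else if (iid == "IDatabase") {
            *ppInterface = static_cast<IDatabase*>(this);
        } else {
            *ppInterface = nullptr;
            return false;
        }
        AddRef();  // the caller now holds one more reference
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        assert(refs_ > 0);
        unsigned long remaining = --refs_;
        if (remaining == 0) delete this;
        return remaining;
    }
    void File() override {}
    void Data() override {}

private:
    unsigned long refs_ = 1;  // the creator holds the first reference
};

// A spreadsheet object whose developer chose not to implement IDatabase.
class BasicSpreadsheet : public IBasic {
public:
    bool QueryInterface(const std::string& iid, void** ppInterface) override {
        if (iid != "IBasic") {
            *ppInterface = nullptr;
            return false;
        }
        *ppInterface = static_cast<IBasic*>(this);
        AddRef();
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        assert(refs_ > 0);
        unsigned long remaining = --refs_;
        if (remaining == 0) delete this;
        return remaining;
    }
    void File() override {}

private:
    unsigned long refs_ = 1;
};
```

A caller holding an IBasic pointer would pass the address of a void* variable to QueryInterface and, on a true result, convert that variable to IDatabase* with static_cast. Each successful query should eventually be balanced by a call to Release.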
Normally, an object can be instantiated (an instance of the object created in memory) by a variable declaration or by the "new" operator. However, both techniques of instantiation need the class definition at compile time. A different technique is needed to allow a word processing program to instantiate a spreadsheet object at run time. One technique provides an interface called IClassFactory, which is defined in the following.
______________________________________
class IClassFactory : IUnknown {
    virtual void CreateInstance (iidInterface, &pInterface) = 0;
}
______________________________________
The method CreateInstance instantiates an object and returns a pointer pInterface to the interface of the object designated by argument iidInterface.
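Run-time instantiation through a class factory might look like the following sketch. To keep it short, IUnknown is reduced to QueryInterface and reference counting is omitted, so the caller simply deletes the object; interface identifiers are strings and the pointer is returned through a void** out-parameter. These simplifications, the CellCount member, and the names Spreadsheet, SpreadsheetFactory, and CreateBasic are assumptions made for the example.

```cpp
#include <cassert>
#include <string>

class IUnknown {
public:
    virtual ~IUnknown() = default;
    virtual bool QueryInterface(const std::string& iid, void** ppInterface) = 0;
};

class IBasic : public IUnknown {
public:
    virtual int CellCount() const = 0;  // hypothetical member
};

class IClassFactory : public IUnknown {
public:
    virtual void CreateInstance(const std::string& iid, void** ppInterface) = 0;
};

// Spreadsheet developer's code: the concrete class and its factory.
class Spreadsheet : public IBasic {
public:
    bool QueryInterface(const std::string& iid, void** ppInterface) override {
        if (iid == "IBasic") {
            *ppInterface = static_cast<IBasic*>(this);
            return true;
        }
        *ppInterface = nullptr;
        return false;
    }
    int CellCount() const override { return 0; }
};

class SpreadsheetFactory : public IClassFactory {
public:
    bool QueryInterface(const std::string& iid, void** ppInterface) override {
        if (iid == "IClassFactory") {
            *ppInterface = static_cast<IClassFactory*>(this);
            return true;
        }
        *ppInterface = nullptr;
        return false;
    }
    // Instantiates an object and returns the requested interface; the
    // out-parameter is left null when the interface is not supported.
    void CreateInstance(const std::string& iid, void** ppInterface) override {
        assert(ppInterface != nullptr);
        Spreadsheet* object = new Spreadsheet;
        if (!object->QueryInterface(iid, ppInterface)) delete object;
    }
};

// Word processing program's code: it refers only to the interfaces, never
// to the Spreadsheet class, so it needs no class definition at compile time.
IBasic* CreateBasic(IClassFactory& factory) {
    void* p = nullptr;
    factory.CreateInstance("IBasic", &p);
    return static_cast<IBasic*>(p);
}
```

In a fuller arrangement the factory itself would be located through the class identifier recorded in the persistent registry, so that the word processing program never names the developer's classes at all.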
Although the above-described interfaces can be used to facilitate embedding objects in a compound document, an efficient technique is needed for allowing pointers to objects (interfaces) to be passed as parameters in a remote procedure call. The passing of pointers avoids the overhead of copying objects and allows the receiving process to see changes that the sending process makes to the object.