Current document processing computer systems allow a user to prepare compound documents. A compound document is a document that contains information in various formats. For example, a compound document may contain data in text format, chart format, numerical format, etc. FIG. 1 is an example of a compound document. In this example, the compound document 101 is generated as a report for a certain manufacturing project. The compound document 101 contains scheduling data 102, which is presented in chart format; budgeting data 103, which is presented in spreadsheet format; and explanatory data 104, which is presented in text format. In typical prior systems, a user generates the scheduling data 102 using a project management computer program and the budgeting data 103 using a spreadsheet computer program. After this data has been generated, the user creates the compound document 101, enters the explanatory data 104, and incorporates the scheduling data 102 and budgeting data 103 using a word processing computer program.
FIG. 2 shows a method for incorporating the scheduling data, budgeting data, and explanatory data into the compound document. A user generates scheduling data using the project management program 201 and then stores the data in the clipboard 203. The user also generates budgeting data using the spreadsheet program 204 and then stores the data in the clipboard 203. The clipboard 203 is an area of storage (disk or memory) that is typically accessible by any program and is used to transfer data between programs. The project management program 201 and the spreadsheet program 204 typically store the data into the clipboard in a presentation format. A presentation format is a format in which the data is easily displayed on an output device. For example, the presentation format may be a bitmap that can be displayed with a standard bitmap block transfer operation (BitBlt). The storing of data into a clipboard is referred to as “copying” to the clipboard.
After data has been copied to the clipboard 203, the user starts up the word processing program 206 to create the compound document 101. The user enters the explanatory data 104 and specifies the locations in the compound document 101 to which the scheduling data and budgeting data that are in the clipboard 203 are to be copied. The copying of data from a clipboard to a document is referred to as “pasting” from the clipboard. The word processing program 206 then copies the scheduling data 102 and the budgeting data 103 from the clipboard 203 into the compound document 101 at the specified locations. Data that is copied from the clipboard into a compound document is referred to as “embedded” data. The word processing program 206 treats the embedded data as simple bitmaps that it displays with a BitBlt operation when rendering the compound document 101 on an output device. In some prior systems, a clipboard may only be able to store data for one copy command at a time. In such a system, the scheduling data can be copied to the clipboard and then pasted into the compound document. Then, the budgeting data can be copied to the clipboard and then pasted into the compound document.
Since word processors typically process only text data, users of the word processing program can move or delete embedded data, but cannot modify embedded data, unless the data is in text format. Thus, if a user wants to modify, for example, the budgeting data 103 that is in the compound document 101, the user must start up the spreadsheet program 204, load in the budgeting data 103 from a file, make the modifications, copy the modifications to the clipboard 203, start up the word processing program 206, load in the compound document 101, and paste the modified clipboard data into the compound document 101.
Some prior systems store links to the data to be included in the compound document rather than actually embedding the data. When a word processing program pastes the data from a clipboard into a compound document, a link is stored in the compound document. The link points to the data (typically residing in a file) to be included. These prior systems typically provide links to data in a format that the word processing program recognizes or treats as presentation format. For example, when the word processing program 206 is directed by a user to paste the scheduling data and budgeting data into the compound document by linking, rather than embedding, the names of files in which the scheduling data and budgeting data reside in presentation format are inserted into the document. Several compound documents can contain links to the same data to allow one copy of the data to be shared by several compound documents.
A link is conceptually a path name to the data. Some prior systems store two-level links. A two-level link identifies both a file and an area within the file. For example, the two-level link “\BUDGET.XLS\R2C2:R7C4” identifies a spreadsheet file “\BUDGET.XLS” and the range of cells “R2C2:R7C4.” The use of two-level links limits the source of the links to data that is nested one level within a file. If a file contains multiple spreadsheets, then a two-level link could identify the file and a spreadsheet, but could not identify a range within the spreadsheet. It would be desirable to have a method and system of supporting links to an arbitrary level.
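The notion of a link nested to an arbitrary level can be illustrated with a minimal sketch that splits a backslash-delimited link into its component levels. The helper name split_link and the three-level example path in the usage note are assumptions made only for illustration; they are not part of any prior system described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split a backslash-delimited link into its levels.
// Empty components (such as the one before the leading backslash) are skipped.
std::vector<std::string> split_link(const std::string& link) {
    std::vector<std::string> levels;
    std::string current;
    for (char ch : link) {
        if (ch == '\\') {
            if (!current.empty()) levels.push_back(current);
            current.clear();
        } else {
            current += ch;
        }
    }
    if (!current.empty()) levels.push_back(current);
    // Postcondition: no stored level is empty.
    assert(levels.empty() || !levels.back().empty());
    return levels;
}
```

Under these assumptions, the two-level link "\BUDGET.XLS\R2C2:R7C4" yields the levels BUDGET.XLS and R2C2:R7C4, while a hypothetical three-level link such as "\BUDGET.XLS\SHEET1\R2C2:R7C4" would yield three levels without any change to the helper.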
Since the present invention is described below using object-oriented programming, an overview of well-known object-oriented programming techniques is provided. Two common characteristics of object-oriented programming languages are support for data encapsulation and data type inheritance. Data encapsulation refers to the binding of functions and data. Inheritance refers to the ability to declare a data type in terms of other data types.
In the C++ language, object-oriented techniques are supported through the use of classes. A class is a user-defined type. A class declaration describes the data members and function members of the class. For example, the following declaration defines data members and a function member of a class named CIRCLE.
    class CIRCLE
    {
    public:
        int x, y;
        int radius;
        void draw();
    };
Variables x and y specify the center location of a circle and variable radius specifies the radius of the circle. These variables are referred to as data members of the class CIRCLE. The function draw is a user-defined function that draws the circle of the specified radius at the specified location. The function draw is referred to as a function member of class CIRCLE. The data members and function members of a class are bound together in that the function operates on an instance of the class. An instance of a class is also called an object of the class.
In the syntax of C++, the following statement declares the objects a and b to be of type class CIRCLE.
    CIRCLE a, b;
This declaration causes the allocation of memory for the objects a and b. The following statements assign data to the data members of objects a and b.
    a.x = 2;
    a.y = 2;
    a.radius = 1;
    b.x = 4;
    b.y = 5;
    b.radius = 2;
The following statements are used to draw the circles defined by objects a and b.
    a.draw();
    b.draw();
A derived class is a class that inherits the characteristics—data members and function members—of its base classes. For example, the following derived class CIRCLE_FILL inherits the characteristics of the base class CIRCLE.
    class CIRCLE_FILL : CIRCLE
    {
    public:
        int pattern;
        void fill();
    };
This declaration specifies that class CIRCLE_FILL includes all the data and function members that are in class CIRCLE in addition to those data and function members introduced in the declaration of class CIRCLE_FILL, that is, data member pattern and function member fill. In this example, class CIRCLE_FILL has data members x, y, radius, and pattern and function members draw and fill. Class CIRCLE_FILL is said to “inherit” the characteristics of class CIRCLE. A class that inherits the characteristics of another class is a derived class (e.g., CIRCLE_FILL). A class that does not inherit the characteristics of another class is a primary (root) class (e.g., CIRCLE). A class whose characteristics are inherited by another class is a base class (e.g., CIRCLE is a base class of CIRCLE_FILL). A derived class may inherit the characteristics of several classes, that is, a derived class may have several base classes. This is referred to as multiple inheritance.
A derived class may specify that a base class is to be inherited virtually. Virtual inheritance of a base class means that only one instance of the virtual base class exists in the derived class. The following is an example of a derived class with two nonvirtual base classes.
    class CIRCLE_1 : CIRCLE { . . . };
    class CIRCLE_2 : CIRCLE { . . . };
    class PATTERN : CIRCLE_1, CIRCLE_2 { . . . };
In this declaration, class PATTERN inherits class CIRCLE twice nonvirtually through classes CIRCLE_1 and CIRCLE_2. There are two instances of class CIRCLE in class PATTERN.
The following is an example of a derived class with two virtual base classes.
    class CIRCLE_1 : virtual CIRCLE { . . . };
    class CIRCLE_2 : virtual CIRCLE { . . . };
    class PATTERN : CIRCLE_1, CIRCLE_2 { . . . };
The derived class PATTERN inherits class CIRCLE twice virtually through classes CIRCLE_1 and CIRCLE_2. Since the class CIRCLE is virtually inherited twice, there is only one object of class CIRCLE in the derived class PATTERN. One skilled in the art would appreciate that virtual inheritance can be very useful when the class derivation is more complex.
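The difference between the two forms of inheritance can be observed by comparing the addresses of the CIRCLE instances reached through each inheritance path. The following is a minimal sketch; the names VCIRCLE_1, VCIRCLE_2, and VPATTERN and the helper functions are introduced only so that both variants can coexist in one listing, and struct is used so that the inheritance is public.

```cpp
#include <cassert>

struct CIRCLE { int x = 0, y = 0, radius = 0; };

// Nonvirtual inheritance: PATTERN contains two CIRCLE instances.
struct CIRCLE_1 : CIRCLE {};
struct CIRCLE_2 : CIRCLE {};
struct PATTERN : CIRCLE_1, CIRCLE_2 {};

// Virtual inheritance: VPATTERN contains a single shared CIRCLE instance.
struct VCIRCLE_1 : virtual CIRCLE {};
struct VCIRCLE_2 : virtual CIRCLE {};
struct VPATTERN : VCIRCLE_1, VCIRCLE_2 {};

// True when the two inheritance paths reach distinct CIRCLE instances.
bool has_two_circles(PATTERN& p) {
    CIRCLE* via1 = static_cast<CIRCLE_1*>(&p);
    CIRCLE* via2 = static_cast<CIRCLE_2*>(&p);
    return via1 != via2;
}

// True when both inheritance paths reach the same CIRCLE instance.
bool has_one_circle(VPATTERN& p) {
    CIRCLE* via1 = static_cast<VCIRCLE_1*>(&p);
    CIRCLE* via2 = static_cast<VCIRCLE_2*>(&p);
    return via1 == via2;
}

// Hypothetical self-check that exercises both helpers.
void check_instance_counts() {
    PATTERN p;
    VPATTERN v;
    assert(has_two_circles(p));
    assert(has_one_circle(v));
}
```

In the nonvirtual case a reference such as p.x is ambiguous and must be qualified (for example, p.CIRCLE_1::x), whereas in the virtual case v.x names the single shared data member.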
A class may also specify whether its function members are virtual. Declaring that a function member is virtual means that the function can be overridden by a function of the same name and type in a derived class. In the following example, the function draw is declared to be virtual in classes CIRCLE and CIRCLE_FILL.
    class CIRCLE
    {
    public:
        int x, y;
        int radius;
        virtual void draw();
    };
    class CIRCLE_FILL : public CIRCLE
    {
    public:
        int pattern;
        virtual void draw();
    };
The C++ language provides a pointer data type. A pointer holds values that are addresses of objects in memory. Through a pointer, an object can be referenced. The following statements declare variable c_ptr to be a pointer to an object of type class CIRCLE and set variable c_ptr to hold the address of object c.
    CIRCLE *c_ptr;
    c_ptr = &c;
Continuing with the example, the following statements declare object a to be of type class CIRCLE and object b to be of type class CIRCLE_FILL.
    CIRCLE a;
    CIRCLE_FILL b;
The following statement refers to the function draw as defined in class CIRCLE.
    a.draw();
Whereas, the following statement refers to the function draw defined in class CIRCLE_FILL.
    b.draw();
Moreover, the following statements type cast object b to an object of type class CIRCLE and invoke the function draw that is defined in class CIRCLE_FILL.
    CIRCLE *c_ptr;
    c_ptr = &b;
    c_ptr->draw();    // CIRCLE_FILL::draw()
Thus, the virtual function that is called is function CIRCLE_FILL::draw.
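The virtual function behavior described above can be sketched in a self-contained form. In this sketch, which is only illustrative, the function draw returns a description of the function that ran instead of drawing, so that the result can be checked; that return type, the virtual destructor, and the helper draw_through_base are assumptions added for the illustration.

```cpp
#include <cassert>
#include <string>

class CIRCLE {
public:
    int x = 0, y = 0;
    int radius = 0;
    virtual ~CIRCLE() = default;
    // Returns the name of the function that ran (assumption for testability).
    virtual std::string draw() const { return "CIRCLE::draw"; }
};

class CIRCLE_FILL : public CIRCLE {
public:
    int pattern = 0;
    // Overrides CIRCLE::draw.
    std::string draw() const override { return "CIRCLE_FILL::draw"; }
};

// Hypothetical helper: invokes draw through a pointer to the base class.
// The function that runs is selected by the class of the object pointed to.
std::string draw_through_base(const CIRCLE* c_ptr) {
    assert(c_ptr != nullptr);
    return c_ptr->draw();
}
```

Passing the address of a CIRCLE_FILL object to draw_through_base selects CIRCLE_FILL::draw even though the parameter is declared as a pointer to CIRCLE.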
FIG. 3 is a block diagram illustrating typical data structures used to represent an object. An object is composed of instance data (data members) and member functions, which implement the behavior of the object. The data structures used to represent an object comprise instance data structure 301, virtual function table 302, and the function members 303, 304, 305. The instance data structure 301 contains a pointer to the virtual function table 302 and contains data members. The virtual function table 302 contains an entry for each virtual function member defined for the object. Each entry contains a reference to the code that implements the corresponding function member. The layout of this sample object conforms to the model defined in U.S. patent application Ser. No. 07/682,537, entitled “A Method for Implementing Virtual Functions and Virtual Bases in a Compiler for an Object Oriented Programming Language,” which is hereby incorporated by reference. In the following, an object will be described as an instance of a class as defined by the C++ programming language. One skilled in the art would appreciate that objects can be defined using other programming languages.
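The arrangement of instance data structure, virtual function table, and function members can be approximated by hand with ordinary structures and function pointers. The following is a rough sketch only; it does not reproduce the layout of the referenced application or of any particular compiler, and the names Instance, VTable, MemberFn, get_radius, get_diameter, and call_virtual are hypothetical.

```cpp
#include <cassert>

struct Instance;                        // stands in for instance data structure 301
using MemberFn = int (*)(Instance*);    // assumed common signature for every entry

// Stands in for virtual function table 302: one entry per virtual function member.
struct VTable {
    MemberFn entries[2];
};

struct Instance {
    const VTable* vptr;   // pointer to the virtual function table
    int radius;           // a data member
};

// Function members; each receives the object on which it operates.
int get_radius(Instance* self)   { return self->radius; }
int get_diameter(Instance* self) { return 2 * self->radius; }

// One table is shared by every object of the same class.
const VTable kCircleTable = {{ get_radius, get_diameter }};

// Invoke the function member stored in the given table slot.
int call_virtual(Instance* obj, int slot) {
    assert(slot >= 0 && slot < 2);
    return obj->vptr->entries[slot](obj);
}
```

A caller that holds only a pointer to an Instance can thus invoke its function members without knowing which functions the table refers to.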
An advantage of using object-oriented techniques is that these techniques can be used to facilitate the sharing of objects. In particular, object-oriented techniques facilitate the creation of compound documents. A compound document (as described above) is a document that contains objects generated by various computer programs. (Typically, only the data members of the object and the class type are stored in a compound document.) For example, a word processing document that contains a spreadsheet object generated by a spreadsheet program is a compound document. A word processing program allows a user to embed a spreadsheet object (e.g., a cell) within a word processing document. To allow this embedding, the word processing program is compiled using the class definition of the object to be embedded to access function members of the embedded object. Thus, the word processing program would need to be compiled using the class definition of each class of objects that can be embedded in a word processing document. To embed an object of a new class into a word processing document, the word processing program would need to be recompiled with the new class definition. Thus, only objects of classes selected by the developer of the word processing program can be embedded. Furthermore, new classes can only be supported with a new release of the word processing program.
To allow objects of an arbitrary class to be embedded into compound documents, interfaces are defined through which an object can be accessed without the need for the word processing program to have access to the class definitions at compile time. An abstract class is a class in which at least one virtual function member has no implementation (that is, the function member is pure). An interface is an abstract class that has no data members and whose virtual functions are all pure.
The following class definition is an example definition of an interface. In this example, for simplicity of explanation, rather than allowing any class of object to be embedded in its documents, a word processing program allows spreadsheet objects to be embedded. Any spreadsheet object that provides this interface can be embedded, regardless of how the object is implemented. Moreover, any spreadsheet object, whether implemented before or after the word processing program is compiled, can be embedded.
    class ISpreadSheet
    {
    public:
        virtual void File() = 0;
        virtual void Edit() = 0;
        virtual void Formula() = 0;
        virtual void Format() = 0;
        virtual void GetCell(string RC, cell *pCell) = 0;
        virtual void Data() = 0;
    };
The developer of a spreadsheet program would need to provide an implementation of the interface to allow the spreadsheet objects to be embedded in a word processing document. When the word processing program embeds a spreadsheet object, the program needs access to the code that implements the interface for the spreadsheet object. To access the code, each implementation is given a unique class identifier.
For example, a spreadsheet object developed by Microsoft Corporation may have a class identifier of “MSSpreadsheet,” while a spreadsheet object developed by another corporation may have a class identifier of “LTSSpreadsheet.” A persistent registry in each computer system is maintained that maps each class identifier to the code that implements the class. Typically, when a spreadsheet program is installed on a computer system, the persistent registry is updated to reflect the availability of that class of spreadsheet objects. So long as a spreadsheet developer implements each function member defined by the interface and the persistent registry is maintained, the word processing program can embed the developer's spreadsheet objects into a word processing document.
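The mapping from class identifier to implementing code can be sketched with an in-memory table. This is only an illustration: the registry described above is persistent, whereas this sketch uses a std::map that lasts only for the life of the program, and the names Registry, ISheet, DemoSheet, Register, Create, and Vendor are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, much-reduced interface standing in for a spreadsheet interface.
struct ISheet {
    virtual ~ISheet() = default;
    virtual std::string Vendor() const = 0;
};

// Hypothetical implementation supplied by a spreadsheet developer.
class DemoSheet : public ISheet {
public:
    explicit DemoSheet(std::string vendor) : vendor_(std::move(vendor)) {}
    std::string Vendor() const override { return vendor_; }
private:
    std::string vendor_;
};

// In-memory stand-in for the registry: class identifier -> code that creates the object.
class Registry {
public:
    using Factory = std::function<std::unique_ptr<ISheet>()>;

    // Called when a spreadsheet program is installed.
    void Register(const std::string& class_id, Factory factory) {
        assert(factory != nullptr);
        table_[class_id] = std::move(factory);
    }

    // Returns a new object of the identified class, or nullptr if the class is unknown.
    std::unique_ptr<ISheet> Create(const std::string& class_id) const {
        auto it = table_.find(class_id);
        if (it == table_.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> table_;
};
```

A word processing program written against such a table needs only the class identifier and the interface; it is never compiled against DemoSheet itself.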
Various spreadsheet developers may wish, however, to implement only certain function members. For example, a spreadsheet developer may not want to implement database support, but may want to support all other function members. To allow a spreadsheet developer to support only some of the function members, while still allowing the objects to be embedded, multiple interfaces for spreadsheet objects are defined. For example, the interfaces IDatabase and IBasic may be defined for a spreadsheet object as follows.
    class IBasic
    {
    public:
        virtual void File() = 0;
        virtual void Edit() = 0;
        virtual void Formula() = 0;
        virtual void Format() = 0;
        virtual void GetCell(string RC, cell *pCell) = 0;
    };
    class IDatabase
    {
    public:
        virtual void Data() = 0;
    };
Each spreadsheet developer would implement the IBasic interface and, optionally, the IDatabase interface.
At run time, the word processing program would need to determine whether a spreadsheet object to be embedded supports the IDatabase interface. To make this determination, another interface is defined (that every spreadsheet object implements) with a function member that indicates which interfaces are implemented for the object. This interface is named IUnknown (and referred to as the unknown interface or the object management interface) and is defined as follows.
    class IUnknown
    {
    public:
        virtual HRESULT QueryInterface(REFIID iid, void **ppv) = 0;
        virtual ULONG AddRef() = 0;
        virtual ULONG Release() = 0;
    };
The IUnknown interface defines the function member (method) QueryInterface. The method QueryInterface is passed an interface identifier (e.g., “IDatabase”) in parameter iid (of type REFIID) and returns, in parameter ppv, a pointer to the implementation of the identified interface for the object for which the method is invoked. If the object does not support the interface, then the method returns false. (The type HRESULT indicates a predefined status, and the type ULONG indicates an unsigned long integer.)
CODE TABLE 1
    HRESULT XX::QueryInterface(REFIID iid, void **ppv)
    {
        ret = TRUE;
        switch (iid) {
        case IID_IBasic:
            *ppv = *pIBasic;
            break;
        case IID_IDatabase:
            *ppv = *pIDatabase;
            break;
        case IID_IUnknown:
            *ppv = this;
            break;
        default:
            ret = FALSE;
        }
        if (ret == TRUE) { AddRef(); }
        return ret;
    }
Code Table 1 contains C++ pseudocode for a typical implementation of the method QueryInterface for class XX, which inherits the class IUnknown. If the spreadsheet object supports the IDatabase interface, then the method QueryInterface includes the appropriate case label within the switch statement. The variables pIBasic and pIDatabase point to a pointer to the virtual function tables of the IBasic and IDatabase interfaces, respectively. The method QueryInterface invokes the method AddRef (described below) to increment a reference count for the object of class XX when a pointer to an interface is returned.
CODE TABLE 2
    void XX::AddRef() { refcount++; }
    void XX::Release() { if (--refcount == 0) delete this; }
The interface IUnknown also defines the methods AddRef and Release, which are used to implement reference counting. Whenever a new reference to an interface is created, the method AddRef is invoked to increment a reference count of the object. Whenever a reference is no longer needed, the method Release is invoked to decrement the reference count of the object and, when the reference count goes to zero, to deallocate the object. Code Table 2 contains C++ pseudocode for a typical implementation of the methods AddRef and Release for class XX, which inherits the class IUnknown.
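The interplay of QueryInterface, AddRef, and Release can be sketched as a compilable listing. The sketch makes several simplifying assumptions that are not taken from the description above: interface identifiers are an enum named IID rather than REFIID values, QueryInterface returns bool rather than HRESULT, class XX inherits both interfaces directly, a virtual destructor is added, the reference count starts at one for the creator, and the constructor takes a flag pointer solely so that deallocation can be observed.

```cpp
#include <cassert>

// Assumed stand-in for REFIID values; IID_IOther is an unsupported interface.
enum IID { IID_IUnknown, IID_IBasic, IID_IDatabase, IID_IOther };

struct IUnknown {
    virtual ~IUnknown() = default;
    virtual bool QueryInterface(IID iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};
struct IBasic : IUnknown    { virtual void Edit() = 0; };
struct IDatabase : IUnknown { virtual void Data() = 0; };

class XX final : public IBasic, public IDatabase {
public:
    explicit XX(bool* destroyed) : destroyed_(destroyed) {}
    ~XX() override { if (destroyed_) *destroyed_ = true; }

    bool QueryInterface(IID iid, void** ppv) override {
        assert(ppv != nullptr);
        switch (iid) {
        case IID_IUnknown:  // IUnknown is reachable through either base; pick IBasic.
            *ppv = static_cast<IUnknown*>(static_cast<IBasic*>(this)); break;
        case IID_IBasic:    *ppv = static_cast<IBasic*>(this);    break;
        case IID_IDatabase: *ppv = static_cast<IDatabase*>(this); break;
        default:            *ppv = nullptr; return false;
        }
        AddRef();           // a new reference has been handed out
        return true;
    }
    unsigned long AddRef() override { return ++refcount_; }
    unsigned long Release() override {
        unsigned long remaining = --refcount_;
        if (remaining == 0) delete this;   // last reference released
        return remaining;
    }
    void Edit() override {}
    void Data() override {}

private:
    unsigned long refcount_ = 1;  // assumption: the creator holds the first reference
    bool* destroyed_;             // set to true by the destructor (for observation only)
};
```

Each successful call to QueryInterface must eventually be matched by a call to Release on the returned pointer; the object is deallocated only when the creator's reference has also been released.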
The IDatabase interface and IBasic interface inherit the IUnknown interface. The following definitions illustrate the use of the IUnknown interface.
    class IDatabase : public IUnknown
    {
    public:
        virtual void Data() = 0;
    };
    class IBasic : public IUnknown
    {
    public:
        virtual void File() = 0;
        virtual void Edit() = 0;
        virtual void Formula() = 0;
        virtual void Format() = 0;
        virtual void GetCell(string RC, cell *pCell) = 0;
    };
FIG. 4 is a block diagram illustrating a sample data structure of a spreadsheet object. The spreadsheet object comprises object data structure 401, IBasic interface data structure 403, IDatabase interface data structure 404, the virtual function tables 402, 405, 406, and methods 407 through 421. The object data structure 401 contains a pointer to the virtual function table 402 and pointers to the IBasic and IDatabase interfaces. Each entry in the virtual function table 402 contains a pointer to a method of the IUnknown interface. The IBasic interface data structure 403 contains a pointer to the virtual function table 405. Each entry in the virtual function table 405 contains a pointer to a method of the IBasic interface. The IDatabase interface data structure 404 contains a pointer to the virtual function table 406. Each entry in the virtual function table 406 contains a pointer to a method of the IDatabase interface. Since the IBasic and IDatabase interfaces inherit the IUnknown interface, each of the virtual function tables 405 and 406 contains pointers to the methods QueryInterface, AddRef, and Release. In the following, an object data structure is represented by the shape 422 labeled with the interfaces through which the object may be accessed.
The following pseudocode illustrates how a word processing program determines whether a spreadsheet object supports the IDatabase interface.
    if (pIBasic->QueryInterface("IDatabase", &pIDatabase) == S_OK)
        // IDatabase supported
    else
        // IDatabase not supported
The pointer pIBasic is a pointer to the IBasic interface of the object. If the object supports the IDatabase interface, the method QueryInterface sets the pointer pIDatabase to point to the IDatabase data structure and returns the value S_OK.
Normally, an object can be instantiated (an instance of the object created in memory) by a variable declaration or by the “new” operator. However, both techniques of instantiation need the class definition at compile time. A different technique is needed to allow a word processing program to instantiate a spreadsheet object at run time. One technique provides a global function CreateInstanceXX, which is declared as follows.
    static void CreateInstanceXX(REFIID iid, void **ppv);
The function CreateInstanceXX (known as a class factory) instantiates an object of class XX and returns, in parameter ppv, a pointer to the interface of the object designated by parameter iid.
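A class factory of this kind can be sketched as follows. The sketch is illustrative only and departs from the declaration above in ways that are assumptions: interface identifiers are an enum named IID, the function returns bool to report whether the requested interface is supported, the interfaces are reduced to one function member each, and those function members return a string so that the result can be checked.

```cpp
#include <cassert>
#include <string>

// Assumed stand-in for REFIID values; IID_IOther is an unsupported interface.
enum IID { IID_IBasic, IID_IDatabase, IID_IOther };

// Reduced interfaces (return types are assumptions made for testability).
struct IBasic {
    virtual ~IBasic() = default;
    virtual std::string Edit() = 0;
};
struct IDatabase {
    virtual ~IDatabase() = default;
    virtual std::string Data() = 0;
};

// The class definition is visible only to the factory, not to its callers.
class XX : public IBasic, public IDatabase {
public:
    std::string Edit() override { return "edit"; }
    std::string Data() override { return "data"; }
};

// Instantiates an object of class XX and returns, in ppv, a pointer to the
// interface designated by iid. Returns false if the interface is unsupported.
bool CreateInstanceXX(IID iid, void** ppv) {
    assert(ppv != nullptr);
    XX* obj = new XX;
    switch (iid) {
    case IID_IBasic:    *ppv = static_cast<IBasic*>(obj);    return true;
    case IID_IDatabase: *ppv = static_cast<IDatabase*>(obj); return true;
    default:            break;
    }
    delete obj;         // nothing was handed out, so release the object
    *ppv = nullptr;
    return false;
}
```

The caller converts the returned pointer to the interface type it requested and never names class XX, so a new implementation can be supplied without recompiling the caller.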