1. Field of the Invention
This invention relates to the field of data structure initialization techniques. Specifically, this invention is a method, apparatus, system and computer program product for efficiently invoking a programmed operation at the first active use of a data structure or class object. The programmed operation can be used, without limitation, to initialize the data structure or static variables in the class object.
2. Background
A computer's central processing unit (CPU) is designed to execute computer instructions that perform operations on data-values. These data-values are generally stored in variables in memory and in the CPU's registers. Most general purpose computers have specialized instructions for accessing variables having different lengths. For example, a byte-oriented memory-access mode will access a single byte of memory at any byte-addressable address in memory. Other non-byte access-modes are used to access two-, four-, eight-, or sixteen-byte variables, or variables of other sizes. Each of the non-byte access-modes requires that the variable be aligned on an even memory address. The computer raises a fault condition (a misaligned memory access fault) if a non-byte access-mode is attempted at an odd byte address. This fault condition causes an exception in the computer system and invokes a trap.
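By way of illustration only, the alignment rule described above can be modeled in software. The following Java sketch is not part of any hardware description: the class `SimulatedMemory`, its methods, the 64-byte memory size, the big-endian byte order and the `MisalignedAccessFault` exception are all hypothetical names and choices, and the exception merely stands in for the fault that real hardware would raise.

```java
// Hypothetical software model of the alignment rule; real hardware raises the fault itself.
class SimulatedMemory {
    private final byte[] bytes = new byte[64];   // arbitrary size chosen for the sketch

    // Stands in for the misaligned memory access fault.
    static class MisalignedAccessFault extends RuntimeException {
        MisalignedAccessFault(int address) {
            super("misaligned access at address " + address);
        }
    }

    // Byte-oriented access mode: any byte address is allowed.
    byte readByte(int address) {
        return bytes[address];
    }

    void writeByte(int address, byte value) {
        bytes[address] = value;
    }

    // Two-byte access mode: an odd address raises the simulated fault.
    // Big-endian byte order is an assumption made only for this sketch.
    short readShort(int address) {
        if ((address & 1) != 0) {
            throw new MisalignedAccessFault(address);
        }
        return (short) (((bytes[address] & 0xFF) << 8) | (bytes[address + 1] & 0xFF));
    }
}
```

In this model a two-byte read at address 2 succeeds, whereas a two-byte read at address 3 throws the simulated fault even though a one-byte read at address 3 is permitted.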
Data structures are used to organize information stored in a computer system. Programmed-routines, in a procedural programming paradigm, access the data structures to perform operations dependent on, and/or to modify the contents of, the data structures. These data structures are initialized after they are allocated. The order in which these data structures are initialized is important especially when the initial state of one data structure depends on the state of another data structure. One skilled in the art will understand that the data structure may be pre-initialized by the allocation process to set data fields within the data structure to some default value (usually zero). This pre-initialization is different from the initialization required to set particular data structure elements to initial non-default values.
Object-oriented programming (OOP) languages encapsulate an object's data (generally contained in the object as a data structure) with associated OOP methods for operating on that object's data. Usually, OOP objects are instantiated in a heap memory area and are based on classes that reference the programmed methods for each OOP object. Instantiated OOP objects are accessed through pointers and contain data (in instance variables) specific to that particular instantiated OOP object. Conceptually, an OOP object contains object-related information (such as the number of instance variables in the object), the instance variables, and addresses of programmed routines (OOP methods) that access and/or manipulate the contents of the instance variables in the object. However, because objects often share programmed routines and object-related information, this shared information is usually extracted into a class. Thus, the instantiated object simply contains its own instance variables and a pointer to its class.
The invention applies to both data structures and OOP objects (such as class objects).
Smalltalk, Java and C++ are examples of OOP languages. Smalltalk was developed in the Learning Research Group at Xerox's Palo Alto Research Center (PARC) in the early 1970s. C++ was developed by Bjarne Stroustrup at AT&T Bell Laboratories in 1983 as an extension of C. Java is an OOP language with elements from C and C++ and includes highly tuned libraries for the internet environment. It was developed at Sun Microsystems and released in 1995.
Further information about OOP concepts may be found in Not Just Java by Peter van der Linden, © Sun Microsystems Press/Prentice Hall PTR Corp., Upper Saddle River, N.J., (1997), ISBN 0-13-864638-4, pages 136-149, which is incorporated herein by reference.
Some OOP languages (such as the Java programming language) allow class variables. A class variable is a single variable shared by all objects instantiated from the class. Static class variables must be initialized prior to their use by any of the instantiated objects. The Java programming language specification requires static class variables to be initialized at the first active use of the class. Further information about initialization of Java classes may be found in The Java™ Language Specification by Gosling, Joy, and Steele, © Sun Microsystems, Inc., Addison-Wesley, ISBN 0-201-63451-1, pages 223-227, which is incorporated herein by reference.
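By way of illustration only, the first-active-use requirement can be observed with a short Java program. The class names `InitLog`, `Counter` and `FirstUseDemo`, the variable `shared` and the value 42 are hypothetical and serve only to make the order of events visible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event log used only to observe the order of events.
class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

// Hypothetical class with one static class variable.
class Counter {
    // A single variable shared by every instance of Counter.
    static int shared;

    // Static initializer: runs once, when Counter is first actively used.
    static {
        InitLog.EVENTS.add("Counter initialized");
        shared = 42;
    }
}

class FirstUseDemo {
    static List<String> run() {
        InitLog.EVENTS.add("before first use");
        int value = Counter.shared;      // first active use triggers initialization
        InitLog.EVENTS.add("read " + value);
        int again = Counter.shared;      // later uses do not re-run the initializer
        InitLog.EVENTS.add("read " + again);
        return InitLog.EVENTS;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

When `main` is run once, the log records "before first use" ahead of "Counter initialized", showing that the static initializer is deferred until the first read of `Counter.shared` and is executed only once.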
FIG. 1A illustrates a class object data structure, indicated by general reference character 100, used as an OOP class object. The class object data structure 100 includes a `status` field 101 that contains, among other information, the initialization state of the class object. The class object data structure 100 also includes a `static class variable` field 103 used to store the contents of the static class variable. The class object data structure 100 also includes a `class method pointer` field 105 that contains an access mechanism to the class's methods such as an array of pointers to these methods.
FIG. 1B illustrates a prior art `data structure access` process, indicated by general reference character 120, used to initialize a data structure at its first active use. The prior art process 120 initiates at a `start` terminal 121 and continues to a `data structure initialized` decision procedure 123 that checks the `status` field 101 of the data structure to determine whether the data structure has been initialized. If the data structure has not been initialized, the prior art process 120 continues to an `initialize data structure` procedure 125 that performs the initialization and modifies the contents of the `status` field 101 to indicate that the data structure has been initialized. Once the `initialize data structure` procedure 125 completes, or if the `data structure initialized` decision procedure 123 determined that the data structure has already been initialized, the prior art process 120 continues to an `access data structure` procedure 127 that performs an active use of the data structure. The prior art process 120 completes through an `end` terminal 129.
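By way of illustration only, the prior art process 120 can be sketched in Java as follows. The class `ClassObject` is a simplified, hypothetical stand-in for the class object data structure 100 (the `class method pointer` field 105 is omitted), and the counter `checksPerformed` and the initial value 7 are assumptions added only to make the cost of the approach observable.

```java
// Hypothetical, simplified model of the class object data structure 100 of FIG. 1A.
class ClassObject {
    boolean initialized;   // stands in for the `status` field 101
    int staticVariable;    // stands in for the `static class variable` field 103
}

class CheckedAccess {
    // Counts how often the initialization check is executed (illustration only).
    static int checksPerformed = 0;

    static int read(ClassObject obj) {
        checksPerformed++;               // decision procedure 123 runs on every access
        if (!obj.initialized) {
            obj.staticVariable = 7;      // procedure 125, with a hypothetical initial value
            obj.initialized = true;      // record the new state in the status field
        }
        return obj.staticVariable;       // procedure 127: the active use itself
    }
}
```

In this sketch, three calls to `CheckedAccess.read` on the same object perform three checks, although only the first call can find the data structure uninitialized.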
The major disadvantage of this prior art approach is that every access to a data structure element in the data structure requires that the computer check whether the data structure has been initialized. Compiler optimization technology exists to optimize out redundant checks by exploiting the fact that only successful checks reach other checks within a routine. In addition, computationally expensive inter-routine analysis can be used to optimize out redundant checks across routine boundaries assuming that the execution sequence can be determined. However, these compiler optimization techniques are computationally very expensive and as a result are often not used.
FIG. 1C illustrates a prior art `adaptive optimization` process, indicated by general reference character 150, for self modifying the executing program code to optimize access to the data structure elements. Using this method, the compiler generates code to invoke an access check routine instead of computer operations to directly access the data structure element. As each access check is encountered, the call-site used to invoke the access check routine is modified to overwrite the invocation of the access check routine with instructions for directly accessing the data structure element. In addition, the access check routine initializes the data structure the first time the access check routine is called on the data structure.
The optimization process 150 initiates at a `start` terminal 151 and continues to an `invoke access check routine` procedure 153. The `invoke access check routine` procedure 153 occurs from the call site at the point where the program would normally access the data structure element. The access check routine, at a `first data structure access` decision procedure 155, then evaluates the `status` field 101 to determine whether the class object data structure 100 has been initialized. If the data structure has not been initialized, the access check routine performs the required initialization at an `initialize data structure` procedure 157. Next, the optimization process 150 continues to a `patch runtime call site` procedure 159. The `patch runtime call site` procedure 159 modifies the computer instructions at the call site to replace the invocation instructions for the access check routine with instructions that actually access the data structure elements in the class object data structure 100 and thus optimize subsequent processing. Generally the `patch runtime call site` procedure 159 also patches other call sites that access the data structure so that only one invocation of the `invoke access check routine` procedure 153 is needed for each data structure.
However, if the class object data structure 100 has already been initialized (or when the `patch runtime call site` procedure 159 completes) the optimization process 150 continues to an `access data structure` procedure 161 (defined by the inserted instructions) that accesses the desired data structure element. Then the optimization process 150 completes through an `end` terminal 163.
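By way of illustration only, the control flow of the optimization process 150 can be imitated in Java. The sketch does not modify machine instructions: a replaceable function reference stands in for the instructions at a call site, and reassigning that reference stands in for the `patch runtime call site` procedure 159. The names `PatchedClassObject`, `CallSite`, `checkRoutineCalls` and the initial value 7 are hypothetical. The sketch models a single call site and patches it on every trip through the access check routine, which is one reading of FIG. 1C.

```java
import java.util.function.IntSupplier;

// Hypothetical, simplified model of the class object data structure 100.
class PatchedClassObject {
    boolean initialized;   // stands in for the `status` field 101
    int staticVariable;    // stands in for the `static class variable` field 103
}

// Hypothetical model of one call site whose "instructions" can be replaced.
class CallSite {
    private final PatchedClassObject target;
    int checkRoutineCalls = 0;   // illustration only: counts check routine invocations
    private IntSupplier code;    // the code currently installed at this call site

    CallSite(PatchedClassObject target) {
        this.target = target;
        // The compiler initially emits a call to the access check routine.
        this.code = this::accessCheckRoutine;
    }

    private int accessCheckRoutine() {
        checkRoutineCalls++;
        if (!target.initialized) {              // decision procedure 155
            target.staticVariable = 7;          // procedure 157, hypothetical initial value
            target.initialized = true;
        }
        code = () -> target.staticVariable;     // procedure 159: replace the call with direct access
        return code.getAsInt();                 // procedure 161: perform the access
    }

    // Executes whatever code is currently installed at the call site.
    int execute() {
        return code.getAsInt();
    }
}
```

In this sketch, repeated calls to `execute` on one `CallSite` enter the access check routine only once; every later call runs the direct access that replaced it.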
The major disadvantage with this prior art method is that the executable program code is self modifying. This solution is not acceptable in many environments. Self modifying code is also very difficult to debug and maintain.
It would be advantageous to provide a technique for first-active-use initialization of data structures that neither self-modifies executing code nor requires special case compiler optimizations, and that is more efficient than the prior art techniques. Such an inventive technique would improve the performance of computers that use the technique.