The present invention relates to a technique for storing pointers in a data structure of a program, and in particular, to a technique for reducing the overhead of compressing and expanding pointers and for reducing the size of the data structure.
To easily handle large-sized data, including programs, some recent microprocessors support a 64-bit address space.
By expanding the number of address bits from 32 to 64, the maximum memory space which a program can refer to increases from four gigabytes (GB; 2^32 bytes, about 4.3 × 10^9 bytes) to 16 exabytes (EB; 2^64 bytes, about 1.8 × 10^19 bytes). In many cases, however, general users require a memory space of less than several gigabytes, so the 32-bit address space is sufficient for most computer systems today.
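The sizes quoted above can be checked with a short calculation. The following is only an illustrative sketch; the helper name `addressable_bytes` and the constants `GIB` and `EIB` are hypothetical and use binary units (2^30 and 2^60 bytes).

```python
# Illustrative sketch: size of the address space reachable with a given
# number of address bits. All names here are hypothetical.

GIB = 1 << 30  # one gigabyte in binary units (2^30 bytes)
EIB = 1 << 60  # one exabyte in binary units (2^60 bytes)


def addressable_bytes(address_bits: int) -> int:
    """Return the number of distinct byte addresses for the given width."""
    return 1 << address_bits


space_32 = addressable_bytes(32)  # 4 * GIB
space_64 = addressable_bytes(64)  # 16 * EIB
```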
A pointer designating a 64-bit address is larger than a pointer designating a 32-bit address. That is, for one and the same data structure handled by a program, the proportion of memory occupied by pointers is larger on a 64-bit processor than on a 32-bit processor.
In object-oriented programs such as Java programs, pointers frequently appear in data structures called “objects”. Therefore, the expansion of the pointers exerts a considerable influence upon such programs. Since the memory is finite, a case may occur in which part of the data structure including pointers cannot be stored in the memory.
Compared with conventional programming languages such as FORTRAN, which tend to use simple data structures repeatedly, languages such as Java use the cache memory less efficiently. The utilization efficiency of the cache memory is described next.
To refer to the main memory, a general microprocessor requires a long period of time corresponding to several hundred cycles. To avoid this disadvantage, a cache memory, which is a small memory that can be accessed at high speed, is disposed between the main memory and the registers. The processor has a function to keep a copy of accessed data in the cache. If the cache contains the data to be referred to, the processor can refer to the data in the cache at high speed. However, the capacity of the cache is finite. If the size of each data item increases, the number of data items storable in the cache becomes smaller.
To cope with this difficulty, a method for programs written in the Java language has been proposed in which the pointer representations in objects allocated in the Java heap memory are compressed. The method is described in the article “A. Adl-Tabatabai, J. Bharadwaj, M. Cierniak, M. Eng, J. Fang, B. Lewis, B. Murphy, and J. Stichnoth, Improving 64-Bit Java IPF Performance by Compressing Heap References, In Proceedings of the International Symposium on Code Generation and Optimization, 2004”. According to the method described in the article, a pointer designating an object allocated in the Java heap memory is expressed using a base address indicating the first address of the heap memory and an offset value relative to the base address.
FIG. 3 shows a layout of such a memory space. In a memory space 301 using 64-bit addresses, a heap memory 302 occupies successive addresses beginning at a base address 304 indicated by a variable “base”. An object 303 is placed in the heap memory 302. The base address 304 is fixed. Assume that the size of the heap memory is limited to four gigabytes, which corresponds to the range representable by 32 bits. When a pointer indicating an object in the heap memory is specified by the offset 305 relative to the base address 304, the pointer can be expressed in a form in which the address is compressed to 32 bits.
In this connection, if the object allocation addresses are multiples of eight, a larger heap memory area can be handled, for example, by treating the stored offset value as a count of eight-byte units. In this way, the size of a data structure including pointers can be reduced by compressing the number of bits allocated to each pointer from 64 to 32. This increases the number of data items storable in the cache memory, which in turn improves the cache hit ratio.
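The base-and-offset representation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the cited article: the base address value `BASE`, the helper names `compress`, `expand`, and `heap_reach`, and the `shift` parameter (3 for objects assumed to be aligned to eight bytes) are all hypothetical. Handling of the null value is deliberately left out here.

```python
# Minimal sketch of base-relative pointer compression (null not handled).
# BASE, compress, expand, heap_reach and shift are hypothetical names.

BASE = 0x0000_7F00_0000_0000  # assumed fixed first address of the heap
COMPRESSED_BITS = 32          # width of a compressed pointer


def compress(address: int, base: int = BASE, shift: int = 0) -> int:
    """Convert a 64-bit address into a 32-bit offset (optionally scaled)."""
    offset = address - base
    if offset < 0 or offset % (1 << shift) != 0:
        raise ValueError("address is below the base or misaligned")
    compressed = offset >> shift
    if compressed >= 1 << COMPRESSED_BITS:
        raise ValueError("address is outside the compressible heap range")
    return compressed


def expand(compressed: int, base: int = BASE, shift: int = 0) -> int:
    """Convert a 32-bit (optionally scaled) offset back into an address."""
    return base + (compressed << shift)


def heap_reach(shift: int = 0) -> int:
    """Return the heap size in bytes addressable by a compressed pointer."""
    return (1 << COMPRESSED_BITS) << shift
```

With `shift=0` the reachable heap is four gigabytes; with `shift=3`, which assumes every object address is a multiple of eight, the same 32 bits reach 32 gigabytes.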
To actually refer to data in an object, a pointer value in an uncompressed format is required. Therefore, when a computer system refers to an object indicated by a pointer held in the compressed format, it is necessary to convert the pointer from the compressed format into the uncompressed format. Conversely, to store a pointer value that is held in the uncompressed format, the pointer must be converted from the uncompressed format into the compressed format.
The conversion from the compressed format into the uncompressed format is carried out by adding the base address to the pointer value in the compressed format. The conversion from the uncompressed format into the compressed format is carried out by subtracting the base address from the address of the object in the uncompressed format. However, attention must be given to a particular value called “null”, which most programming languages including Java provide as a pointer value not specifying any valid data. As shown in FIG. 3, the value of a null 306 is generally set to “0”. When the address of an object 303 is expressed by an offset 305 relative to a base address 304, the null value cannot be expressed without modification, since simply subtracting the base address from “0” does not yield a valid offset. It is therefore necessary to treat the situation in which the pointer value is “null” as a special case.
FIGS. 4A and 4B show conventional examples of pointer compression and expansion sequences. That is, FIGS. 4A and 4B show pseudocode representing load and store processing sequences for pointers in the compressed format, in which the null value described above is taken into account. To load a pointer in the compressed format, a check is made to determine whether or not the loaded value is “0” (the value of the null). If the value is “0”, the pointer value is corrected (401). Similarly, in the store sequence using a pointer in the compressed format, a check is made on the value to be stored. If the value is “0”, the pointer value is corrected (402).
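The load and store sequences with the null check can be sketched as follows. The figures themselves are not reproduced here, so this is only one plausible reading of the described behavior: it assumes that the compressed value “0” is reserved for null, and therefore that no object is placed exactly at the base address. The names `BASE`, `NULL`, `load_pointer`, and `store_pointer` are hypothetical.

```python
# Sketch of null-aware load/store for compressed pointers.
# Assumption: compressed value 0 is reserved for null, so no object may
# live exactly at the base address. All names are hypothetical.

BASE = 0x0000_7F00_0000_0000  # assumed fixed first address of the heap
NULL = 0                      # uncompressed representation of null


def load_pointer(compressed_field: int, base: int = BASE) -> int:
    """Expand a compressed field value into an uncompressed pointer."""
    if compressed_field == 0:
        # Null check on load: do not add the base address to null.
        return NULL
    return base + compressed_field


def store_pointer(pointer: int, base: int = BASE) -> int:
    """Compress an uncompressed pointer into the value stored in a field."""
    if pointer == NULL:
        # Null check on store: do not subtract the base address from null.
        return 0
    offset = pointer - base
    if not 0 < offset < (1 << 32):
        raise ValueError("pointer is outside the compressible heap range")
    return offset
```

Each load and each store carries an extra comparison and branch for the null case, which is the overhead the text attributes to the conventional sequences.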
FIGS. 5A and 5B show an example of the effect of pointer compression in the prior art.
A numeral 501 indicates an example of an object definition as a target of pointer compression. It is assumed in the object definition that “left”, “right”, “name”, and “value” are pointers indicating respective objects. It is also assumed that the header, which stores items such as an attribute of the object, occupies two words. The word counts below are consistent with a word of 32 bits, so that a 64-bit pointer occupies two words. In a situation in which a 64-bit processor employs pointers in the uncompressed format, the target object occupies a memory area of ten words as indicated by a numeral 502. On the other hand, if the pointers are expressed in the compressed format, the object occupies an area of six words as indicated by a numeral 503. That is, four words of memory are saved.
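The word counts in this example can be reproduced with a short calculation. The sketch below assumes a word of 32 bits, which is the word size under which a two-word header and four pointer fields add up to ten and six words; the helper name `object_words` and the constant `WORD_BITS` are hypothetical.

```python
# Sketch: object size in words for the example with a two-word header and
# four pointer fields ("left", "right", "name", "value").
# Assumption: one word is 32 bits. All names are hypothetical.

WORD_BITS = 32


def object_words(header_words: int, pointer_fields: int, pointer_bits: int) -> int:
    """Return the object size in words for the given pointer width."""
    words_per_pointer = pointer_bits // WORD_BITS
    return header_words + pointer_fields * words_per_pointer


uncompressed_words = object_words(2, 4, 64)  # 64-bit pointers
compressed_words = object_words(2, 4, 32)    # pointers compressed to 32 bits
saved_words = uncompressed_words - compressed_words
```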