The present invention relates to a compiler, and more particularly to a Java (Trademark of Sun Microsystems Corp.) JIT (Just In Time) compiler.
Currently, Java is positioned for use not only as a common language of network computing, but also as a standard, object-oriented and platform-independent language. A program written in Java is compiled to produce a program in bytecode that can be executed by a Java virtual machine. This affords to a program written in Java the advantage that it can be run by any computer that supports Java, regardless of the CPU employed (multi-platform capability).
However, the execution of the bytecode by the virtual machine provides a performance that is inferior to that provided by the direct execution of a code that is written in a machine language. In general, therefore, while a Java program is executed, a JIT compiler converts the bytecode into a machine language code (a code that is hereinafter referred to as a JITed code), and the JITed code is executed instead of the bytecode. The minimum unit of compilation is a sub-routine called a method. By converting frequently executed code into machine language code, the performance characteristics of the machine language code are exhibited, while at the same time the multi-platform characteristic of the Java bytecode is retained.
Since the JITed code of a method is suited to the CPU that executes the program, it is the equivalent of the optimal code generated by a C compiler, for instance. In CPUs currently in use, when a sub-routine is called, the sub-routine forms an area in a stack, which is called a frame, for the storage of local variables that are used by the sub-routine. FIG. 1 shows a stack after a sub-routine A has called a sub-routine B and the sub-routine B has called a sub-routine C. The stack has been extended upward, the individual areas constituting the frames of the sub-routines, while a stack pointer SP points to the topmost address in frame C. When the sub-routine C then calls a sub-routine D, a frame of the sub-routine D is formed at the current position pointed at by the stack pointer SP. The stack pointer SP then points to the topmost address in the newly formed frame. When returning from a certain sub-routine to the previous sub-routine, the frame for the certain sub-routine is removed. Thus, when returning from the sub-routine D to the sub-routine C, the frame of the sub-routine D is removed before the return, and the address pointed at by the stack pointer SP reverts to the one pointed at when the sub-routine C called the sub-routine D. In Java, a CPU resource is assigned for each unit of execution, called a thread, and for each thread there is an inherent stack called a thread stack. The JITed code forms the previously mentioned frame (hereinafter referred to as a JITed frame) in the thread stack.
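The frame behavior described for FIG. 1 can be illustrated with a minimal sketch. The class name ThreadStack and its methods are hypothetical and model only the order in which frames are formed and removed; they do not represent real memory addresses.

```python
class ThreadStack:
    """Toy model of a thread stack whose frames are formed on call and removed on return."""

    def __init__(self):
        self.frames = []  # frames[-1] is the topmost (most recently formed) frame

    def call(self, name):
        # Calling a sub-routine forms a new frame on top of the current one.
        self.frames.append(name)

    def ret(self):
        # Returning removes the callee's frame, so SP reverts to the caller's frame.
        return self.frames.pop()

    @property
    def sp(self):
        # Index of the topmost frame, or None when the stack is empty.
        return len(self.frames) - 1 if self.frames else None


stack = ThreadStack()
for sub in ("A", "B", "C"):
    stack.call(sub)
sp_before = stack.sp   # SP points into frame C
stack.call("D")        # C calls D: frame D is formed above frame C
sp_in_d = stack.sp
stack.ret()            # D returns: frame D is removed
```

After the return, the stack pointer in this model is back at the value it had when C called D.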
JITed frames are not the only ones found in the thread stack. JITed code may call a variety of service routines that a Java virtual machine provides, and some service routines may activate other, new Java virtual machines. FIG. 2 shows the state where frames of JITed code and frames of code other than JITed code coexist in a thread stack. The frames identified by alphabetical characters are JITed frames.
The use of memory by a JIT compiler will now be explained. A JIT compiler uses memory for the storage of JITed code or for a compiling work area. However, the memory that is available is not infinite. In particular, a computer having no hard disk has no virtual memory by the secondary storage; such a computer has only real memory, and memory having only a limited capacity can be provided for the use of a JIT compiler. The use of such computers that have only real storage has spread because of the introduction of an NC (Network Computing) machine. The NC machine was proposed as a countermeasure to provide a reduction in the operating costs associated with the client management side in a server-client environment. Since application programs can be downloaded from the server, one of these computers has no need for a hard disk, and is available at a low price.
When only a limited amount of memory is available for a JIT compiler, a shortage of memory may occur while the JIT compiler is in use. In such a case, one of three countermeasures can be selected: (1) the JIT compiler does not compile the pertinent method; (2) the optimization level of the JIT compiler is reduced and the JIT compiler switches its mode into one that uses less memory; or (3) the JIT compiler gets new free memory space. In this invention, (3) is employed. To get free memory, part or all of the existing JITed code that can be discarded is discarded to release the areas that the discarded JITed code previously occupied. This concept was already known at the time the Smalltalk system was developed, but no method has been provided that can be applied to a frame optimized by a JIT compiler to be as good as one available for the C language.
Only JITed code is discarded, and the discarding process is called JIT code garbage collection (GC). An active method for a thread being executed is recorded in the current context of the thread (a copy of a CPU resource, such as a program counter and a stack pointer), or in a JITed code frame that is held in the thread stack. If the JITed code for an active method is discarded, the relevant thread can not be executed. Therefore, only non-active JITed code is to be discarded. Since all the methods for which JITed code is provided are managed, only the JITed code for non-active methods is discarded, while active methods are maintained.
A method for finding active methods, i.e., a method for finding JITed frames in a thread stack, will now be focused on. As shown in FIG. 2, JITed frames and other frames coexist in a thread stack.
Although its efficiency in searching for JITed frames is low, a conservative garbage collection method (hereinafter referred to as a conservative GC method) can be employed without imposing any overhead on the performance of JITed code (see FIG. 3). Specifically, all the effective areas in the thread stack are scanned, and all the values held in the thread stack are examined to determine whether they are addresses of JITed code belonging to a specific method. If a value is an address of JITed code for a specific method, the method is regarded as active. If a JITed frame, including a pointer to JITed code, is present in the thread stack, the conservative GC method can find it. However, the conservative GC method has two shortcomings: false JITed code addresses are extracted, and the discovery efficiency is low. Since all values in a stack are examined when the conservative GC method is used, if a value that happens to coincide with a JITed code address is included in the stack, it will mistakenly be identified as a real JITed code address. This gives rise to the shortcoming concerning false JITed code addresses. The reason the discovery efficiency is low is as follows. During the actual execution of a program, the stack area tends to be large, and very many values held in the thread stack must each be examined to determine whether it is the JITed code address of a specific method. In addition, the cost involved in obtaining a corresponding method from the arbitrary values that are examined by scanning is high. Even if JITed codes are sorted according to the addresses in memory to which they are assigned, the cost of identifying a method from a JITed code address, among several thousand or several tens of thousands of JITed codes that are dynamically generated (or erased), will be very high.
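The conservative GC method described above can be sketched as follows. This is an illustrative model only: the address ranges, the helper names (build_index, method_at, conservative_scan) and the use of a binary search over sorted start addresses are assumptions, not details given in this description.

```python
import bisect


def build_index(code_ranges):
    """code_ranges maps a method name to (start, end) of its JITed code, end exclusive."""
    entries = sorted((start, end, name) for name, (start, end) in code_ranges.items())
    starts = [entry[0] for entry in entries]
    return starts, entries


def method_at(index, value):
    """Return the method whose JITed code contains value, or None."""
    starts, entries = index
    i = bisect.bisect_right(starts, value) - 1
    if i >= 0:
        start, end, name = entries[i]
        if start <= value < end:
            return name
    return None


def conservative_scan(stack_words, index):
    """Treat every stack value that falls inside JITed code as marking an active method."""
    active = set()
    for word in stack_words:
        name = method_at(index, word)
        if name is not None:
            active.add(name)
    return active


code_ranges = {"A": (0x1000, 0x1100), "B": (0x2000, 0x2080), "C": (0x3000, 0x3040)}
index = build_index(code_ranges)
# 0x2010 stands for a genuine return address into B; 0x3004 stands for an
# ordinary integer that merely happens to fall inside the code of C.
stack_words = [7, 0x2010, 0x3004, 0xDEAD, 0x1100]
active = conservative_scan(stack_words, index)
```

In this sketch, method C is reported as active even though only a coincidental integer refers to it, which is the false JITed code address shortcoming. Every word costs one lookup, so the cost grows with the size of the scanned stack area.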
In addition to this, the period during which a program is halted due to the compiling process performed by a JIT compiler must be shortened as much as possible.
It is one object of the present invention to detect only JITed frames in an environment in which the memory available to a JIT compiler is limited, a JITed frame is optimized so that it is as good as a frame of optimized code produced by a C compiler, and JITed frames and other frames coexist in a thread stack.
It is an additional object of the present invention to detect JITed frames and to find an active method.
It is another object of the present invention to efficiently and rapidly detect only JITed frames.
It is a further object of the present invention to detect a code for a method that can be discarded.
The most efficient method from the viewpoint of discovery efficiency is a method by which only JITed frames are managed (hereinafter referred to as a JITed last frame method) (see FIG. 4). A square in FIG. 4 (hereinafter referred to as a JITed last frame record) manages sequential JITed frames, and itself constitutes a list. The JITed last frame record is generated each time the JITed code calls non-JITed code, such as a service routine, and the list is updated. The JITed last frame method has two shortcomings: the possibility that a JITed frame will not be found, and the deterioration of the performance of the JITed code. The failure to discover a JITed frame occurs in the following case. The JIT compiler maps an exception in Java to a CPU exception, if possible, in order to improve the performance. In the mapped system, assume that (1) a CPU exception (Java exception) occurs, (2) an exception handler is being executed, and (3) due to the shortage of memory the JIT compiler initiates the JIT code GC method for another thread. In this case, the latest JITed last frame record (black square in FIG. 4) is not formed in the thread that causes the exception, even though code other than JITed code is being executed. Therefore, a JITed frame discovery failure occurs (JITed frames for methods C and D in FIG. 4). Further, since a JITed last frame record is generated each time JITed code calls non-JITed code, deterioration of the performance of the JITed code also occurs. For these reasons, a simple JITed last frame method cannot be employed for a system in which a Java exception is mapped to a CPU exception and another thread operates during exception processing, or for a system in which emphasis is placed on performance.
Therefore, a hybrid method is employed in which the JITed last frame method is used as a basis and the conservative GC method is combined with it as needed. With this method, the two shortcomings of the JITed last frame method, the failure to discover a JITed frame and the deterioration of the performance, can be resolved, and the reduction in the discovery efficiency is minimized because the conservative GC method is applied only to a limited part of the stack.
First, the JIT compiler detects a JITed code exit point to constitute a JITed last frame record. Basically, this point does not include non-JITed code calling that is to be frequently executed, but includes non-JITed code calling that is to be executed only once.
The JITed last frame record points at a frame of a JITed code that calls non-JITed code, and also points at a JITed last frame record that is formed last. These records constitute an LIFO (Last in First Out) list (see FIG. 5). In FIG. 5, a square represents a JITed last frame record, and a black square represents a JITed last frame record that is generated before the method D calls non-JITed code. The JITed last frame records and the LIFO list are managed for each thread.
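The per-thread LIFO list of JITed last frame records can be pictured with the following minimal sketch. The names LastFrameRecord and ThreadState, and the use of an integer frame index in place of a real frame address, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LastFrameRecord:
    frame: int                         # frame of the JITed code that calls non-JITed code
    prev: Optional["LastFrameRecord"]  # record that was formed immediately before this one


class ThreadState:
    """Holds the LIFO list of JITed last frame records for one thread."""

    def __init__(self):
        self.latest = None

    def push_record(self, frame):
        # The new record points at its JITed frame and at the previously formed record.
        self.latest = LastFrameRecord(frame, self.latest)

    def pop_record(self):
        # Remove the latest record, if any, when the non-JITed call has returned.
        if self.latest is not None:
            self.latest = self.latest.prev

    def recorded_frames(self):
        # Walk the list from the latest record to the oldest.
        frames, rec = [], self.latest
        while rec is not None:
            frames.append(rec.frame)
            rec = rec.prev
        return frames


thread = ThreadState()
thread.push_record(2)   # an earlier JITed frame called non-JITed code
thread.push_record(5)   # a later JITed frame called non-JITed code
frames = thread.recorded_frames()
thread.pop_record()     # the later non-JITed call has returned
```

Because each thread owns its own ThreadState in this model, a collector can walk every thread's list independently while the threads are suspended.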
The JIT code GC method is performed in the following manner. When the JIT compiler runs short of memory in a specific thread, all the threads are temporarily suspended. Then, for each thread, active methods are searched for, i.e., JITed code addresses in each thread stack are searched for. When there is no LIFO list of JITed last frame records, the conservative GC method is performed from the position pointed at by the current stack pointer SP (obtained from the current context) up to the bottom of the stack. When there is a LIFO list, the conservative GC method is performed from the position pointed at by the current stack pointer SP up to the address of the JITed frame pointed at by the latest JITed last frame record in the list. Following this, the JITed last frame method is performed. These two methods may be performed in the reverse order. FIG. 6 is a specific diagram showing this process. In FIG. 6, the stack area above the JITed frame of JITed code E (the area from SP to the frame labeled E) is a frame of non-JITed code called from a point at which it was determined that a JITed last frame record would not be formed, or a frame for an exception handler that handles a Java exception mapped to a CPU exception. When all active methods are found, methods whose JITed code can be discarded are selected from the remaining methods by referring to execution profile information. A histogram of total call counts is, for example, employed to select methods that have smaller call counts. When, as a result of the discarding, the JIT compiler obtains a free memory capacity larger than that currently requested, or when all the removable methods are discarded, the JIT code GC process is completed. Before completing the process, a fragmented area may be compacted (memory blocks are moved and linked together so as to be as continuous as possible).
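The search for active methods in one thread stack can be sketched as follows, under simplifying assumptions: the stack is modeled as a list of frames ordered from the stack pointer downward, each frame carries the values it holds, JITed frames are tagged with their method name, and records are given as frame indices with the latest first. The names find_active and lookup, and all addresses, are hypothetical.

```python
CODE_RANGES = {  # hypothetical address ranges of JITed code, end exclusive
    "A": (0x1000, 0x1100), "B": (0x2000, 0x2100), "C": (0x3000, 0x3100),
    "D": (0x4000, 0x4100), "E": (0x5000, 0x5100),
}


def lookup(value):
    """Return the method whose JITed code contains value, or None."""
    for name, (start, end) in CODE_RANGES.items():
        if start <= value < end:
            return name
    return None


def find_active(frames, record_indices):
    """frames: list ordered from SP downward; record_indices: latest record first."""
    active = set()
    # Conservative part: scan from SP up to the frame of the latest record,
    # or to the bottom of the stack when no record exists.
    limit = record_indices[0] if record_indices else len(frames)
    for frame in frames[:limit]:
        for word in frame["words"]:
            name = lookup(word)
            if name is not None:
                active.add(name)
    # JITed last frame part: from each recorded frame, follow the run of
    # sequential JITed frames until a non-JITed frame is reached.
    for start in record_indices:
        i = start
        while i < len(frames) and frames[i]["method"] is not None:
            active.add(frames[i]["method"])
            i += 1
    return active


# A calls B; B calls service routine S (a record is formed for B); S calls C;
# C calls D; an exception in D starts handler H, for which no record exists.
frames = [
    {"method": None, "words": [0x4010]},  # H: holds an address inside D
    {"method": "D", "words": [0x3010]},   # D: return address into C
    {"method": "C", "words": [0x9000]},   # C: return address into non-JITed S
    {"method": None, "words": [0x2010]},  # S: return address into B
    {"method": "B", "words": [0x1010]},   # B: return address into A
    {"method": "A", "words": [0]},        # A
]
active_with_record = find_active(frames, [4])
active_without_record = find_active(frames, [])
```

In this example only the four frames above the recorded frame of B are scanned word by word; B and A are found through the record without any scanning, and method E is never reported as active.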
According to the present invention, the conservative GC method is partially performed to permit non-JITed code calling for which a JITed last frame record is not generated, and to prevent the failure to find an active method that this permission could otherwise cause. In a system in which the memory capacity available for the JIT compiler is strictly limited and the memory tends to be insufficient, as many JITed last frame records as possible are formed to improve the efficiency of the JIT code GC process. On the other hand, in a system in which virtual memory is supported and the memory capacity is not so strictly limited, emphasis is placed on the performance of JITed code, and few JITed last frame records are formed. As described above, according to the present invention, the frequency of the generation of JITed last frame records is changed in accordance with the trade-off between the memory requested and the execution performance. Therefore, compared with the discovery of an active method using only the conservative GC method, a system that is effective (as regards false JITed code) and efficient (as regards small scanning areas) can be provided.
A summary of the present invention will now be given. A JIT compiler according to the present invention performs processing that includes the following steps: if it is detected that a first routine has called a second routine, determining whether the second routine satisfies a predetermined condition; and if the second routine satisfies the predetermined condition, generating, in the code for the first routine, a code for generating a record (a JITed last frame record in the preferred embodiment) that points at a stack frame of the first routine. As a result, it is easy to detect the stack frame of a routine that calls a routine satisfying the predetermined condition. Such a predetermined condition may be set by a trade-off between the efficiency in the detection of a stack frame and the load imposed by the generation of a record. The record may include a pointer that points at the record that was formed immediately before it.
The predetermined condition may be that a routine is not compiled by the compiler. Further, the predetermined condition may be either that a routine is not compiled by the compiler and has a property such that a frequency of the second routine being called is less than or equal to a predetermined value, and such that the generation of the record does not lower the entire performance more than a predetermined level, or that a routine is not compiled by the compiler and may call a routine that is to be compiled by the compiler. Here, what is meant by "a routine that may call a routine that is to be compiled by the compiler" is that the routine to be compiled may generate the above described record.
The compile method further comprises steps of: generating, in the codes for the first routine, a code for calling the second routine; and generating a code for deleting a record after the code for calling the second routine.
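The compile steps above can be pictured with the following minimal sketch, which emits symbolic instructions rather than machine code. The function name emit_call, the instruction tuples, and the particular condition tested (callee not compiled by the JIT compiler and called no more often than a threshold) are assumptions chosen for illustration from the conditions listed above.

```python
def emit_call(callee, is_jit_compiled, call_frequency, threshold=1):
    """Return the symbolic code generated in the first routine for a call to callee."""
    # Hypothetical predetermined condition: the callee is non-JITed code and is
    # called infrequently enough that forming a record costs little.
    needs_record = (not is_jit_compiled) and call_frequency <= threshold

    code = []
    if needs_record:
        # Code that generates a record pointing at the caller's stack frame.
        code.append(("push_last_frame_record",))
    code.append(("call", callee))
    if needs_record:
        # Code that deletes the record after the call has returned.
        code.append(("pop_last_frame_record",))
    return code


rare_service = emit_call("service_routine", is_jit_compiled=False, call_frequency=1)
hot_service = emit_call("hot_helper", is_jit_compiled=False, call_frequency=1000)
jit_to_jit = emit_call("method_b", is_jit_compiled=True, call_frequency=1)
```

Raising or lowering the threshold in this sketch corresponds to the trade-off between the load imposed by generating records and the efficiency of detecting stack frames.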
Garbage collection according to the present invention includes the following processing. That is, in a system which stores, in a stack, stack frames of routines that have been executed or are being executed, if there exists a record (e.g., a JITed last frame record in the preferred embodiment) that points at a stack frame corresponding to a routine that satisfies a predetermined condition and that has been compiled by a predetermined compiler (a JIT compiler in the preferred embodiment), the following steps are executed to detect a stack frame of a routine compiled by the predetermined compiler: scanning the stack from a stack pointer to a stack frame pointed at by the record (the latest record in the preferred embodiment; however, the scanning is not limited to this record), and detecting a stack frame of a routine that has been compiled by the predetermined compiler; and detecting the stack frame pointed at by the record and a stack frame of a routine that has been compiled by the predetermined compiler and that can be sequentially traced from the stack frame pointed at by the record. As a result, the stack frames of all the routines that have been compiled by the predetermined compiler can be detected. The above two steps may be performed in the reverse order.
The condition described in the previous paragraph can be employed as the predetermined condition.
Garbage collection according to the present invention may perform the following processing, that is, following steps are executed: determining whether a stack pointer points at a stack frame of a routine that has been compiled by a predetermined compiler; if the stack pointer points at the stack frame of the routine compiled by the predetermined compiler, detecting the stack frame pointed by the stack pointer and a stack frame of a routine that has been compiled by the predetermined compiler and that can be sequentially traced from the stack frame pointed by the stack pointer; and detecting a stack frame pointed by a record and a stack frame of a routine that has been compiled by the predetermined compiler and that can be sequentially traced from the stack frame pointed at by the record. This processing utilizes a property whereby when a stack pointer points at the stack frame of a routine that has been compiled by the predetermined compiler, a stack frame to be detected is not present between the oldest stack frame of the routine that has been compiled by the predetermined compiler and that can be sequentially traced from the stack frame pointed by the stack pointer and the frame pointed by the record.
The above processing further comprises a step of, if the stack pointer does not point at the stack frame of the routine compiled by the predetermined compiler, tracing the stack from the stack pointer to the stack frame pointed by the record, and detecting the stack frame of a routine compiled by the predetermined compiler.
The code for a routine that can be discarded is a code for a routine other than the routines of the stack frames detected as described above. If the code for such a routine can be discarded, it is discarded. The processing for detecting a code for a routine compiled by the predetermined compiler is executed in response to insufficient memory being available during compilation by the predetermined compiler.
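The selection of code to discard can be sketched as follows, assuming that the set of active methods has already been detected and that a per-method call count is available as execution profile information. The function name select_discards, the code sizes, and the stopping rule (stop once the freed amount reaches the requested amount) are illustrative assumptions.

```python
def select_discards(code_sizes, call_counts, active, bytes_needed):
    """Choose non-active methods to discard, least frequently called first."""
    candidates = sorted(
        (name for name in code_sizes if name not in active),
        key=lambda name: call_counts.get(name, 0),
    )
    discarded, freed = [], 0
    for name in candidates:
        if freed >= bytes_needed:
            break  # enough memory has been released
        discarded.append(name)
        freed += code_sizes[name]
    return discarded, freed


code_sizes = {"A": 400, "B": 300, "C": 200, "D": 100, "E": 250}   # bytes of JITed code
call_counts = {"A": 50, "B": 3, "C": 10, "D": 1, "E": 7}          # profile information
active = {"A", "C"}                                               # never discarded
discarded, freed = select_discards(code_sizes, call_counts, active, bytes_needed=300)
```

Here D and B are discarded because they are not active and have the smallest call counts, while E is kept because the request has already been satisfied. If every candidate is discarded and the request is still not met, the loop simply ends with all removable methods discarded.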
The processing performed according to the present invention has been explained. The present invention can be implemented by an apparatus that performs the above processing or by a program that causes a computer to perform the processing. For one having ordinary skill in the art it would be easy to store this program on a memory medium, such as a floppy disk or a CD-ROM, or in another type of storage device.