1. Field of the Invention
The present invention relates to a profiling technique using sampling and, more specifically, to an information processing device, profile target determining program and method for reducing speed overhead during execution of a program using profiling.
2. Description of Related Art
Many programs dynamically allocate a large number of objects to heap areas in memory. In current Java virtual machine implementations, for example, the object format, that is, the layout and size of the header and data fields, is fixed during program execution. If the memory footprint (memory usage), required memory bandwidth, cache misses, garbage collection (GC) frequency and GC overhead can be reduced, the computer cost for executing the program is also reduced.
To reduce this cost and optimize the program, the fixed object format is usually changed after a profiler obtains information on accesses to objects during program execution and the properties of the objects are examined.
For example, when there is a plurality of objects that are never written, or only very infrequently written, after initialization (henceforth referred to as “immutable objects”) and the content of these objects is identical, the objects can be merged into a single object. Also, if an object is found that is never read, or only very infrequently read (henceforth referred to as “write-only objects”), the object can be compressed. Also, if an array object is found in which some array elements are not accessed (henceforth referred to as “unaccessed objects”), the unaccessed elements can be deleted. These optimizations save memory, which reduces GC frequency and improves cache utilization.
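The three object properties described above can be illustrated with a minimal sketch. It is not taken from any cited implementation: the names `AccessProfile`, `properties`, `unaccessed_elements` and `merge_identical`, and the `rare` threshold standing in for "very infrequently", are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class AccessProfile:
    """Hypothetical per-object counters collected by a profiler."""
    reads: int = 0
    writes_after_init: int = 0


def properties(profile, rare=0):
    """Return the properties an object has, given its access counters.

    `rare` is an assumed threshold for "very infrequently".
    """
    props = set()
    if profile.writes_after_init <= rare:
        props.add("immutable")      # candidate for merging
    if profile.reads <= rare:
        props.add("write-only")     # candidate for compression
    return props


def unaccessed_elements(element_access_counts):
    """Return the indexes of array elements that were never accessed."""
    return [i for i, n in enumerate(element_access_counts) if n == 0]


def merge_identical(immutable_objects):
    """Map each object id to one representative with identical content.

    `immutable_objects` maps an object id to hashable content.
    """
    canonical = {}
    mapping = {}
    for obj_id, content in immutable_objects.items():
        mapping[obj_id] = canonical.setdefault(content, obj_id)
    return mapping
```

In this sketch, an object that is neither read nor written after initialization has both properties, which matches the definitions above; how a real optimizer would choose between merging and compression in that case is not stated in the text.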
In profiling, objects are sampled because the speed overhead is very high when access to all objects is profiled. However, the speed overhead is still large even when sampling is performed. For some types of program, execution with profiling is at least 40% slower. When the sampling frequency is simply lowered in order to reduce the speed overhead, the profile accuracy declines.
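The trade-off between overhead and accuracy can be sketched as follows. This is an illustrative model under stated assumptions, not the profiler discussed in this document: `sample_objects` and `profile` are hypothetical names, every `interval`-th allocated object is assumed to be sampled, and the number of recorded accesses is used as a rough proxy for speed overhead.

```python
def sample_objects(object_ids, interval):
    """Select every `interval`-th allocated object as a profile target."""
    return {oid for i, oid in enumerate(object_ids) if i % interval == 0}


def profile(accesses, sampled):
    """Count reads and writes for sampled objects only.

    `accesses` is an iterable of (object_id, kind) pairs, where kind is
    "r" or "w". Returns the per-object [reads, writes] counters and the
    number of accesses that had to be recorded (the overhead proxy).
    """
    counts = {oid: [0, 0] for oid in sampled}
    recorded = 0
    for oid, kind in accesses:
        if oid in counts:
            counts[oid][0 if kind == "r" else 1] += 1
            recorded += 1
    return counts, recorded


# Eight objects, each read ten times.
object_ids = list(range(8))
accesses = [(i % 8, "r") for i in range(80)]

dense_counts, dense_cost = profile(accesses, sample_objects(object_ids, 2))
sparse_counts, sparse_cost = profile(accesses, sample_objects(object_ids, 8))
```

With an interval of 2, four objects are observed and 40 accesses are recorded; with an interval of 8, only one object is observed and 10 accesses are recorded. The lower sampling frequency costs less, but leaves the properties of seven of the eight objects unknown.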
The following is a description of the literature discovered in a prior art search for the present invention.
In Japanese Patent Publication No. 2004-0102597 (Patent Literature 1), a technique is disclosed in which source information selected based on profile information or user-specified information is inputted and parsed; the attributes of the procedures (sub-procedures) called by a procedure (main procedure) appearing in the source information are analyzed from the parsing results and stored in a procedure analysis table; the main procedure is marked optimization-not-required if the sub-procedure attributes are not registered in the procedure attribute table during inline expansion; main procedures marked optimization-not-required are removed; and the other procedures are optimized. This technique can automatically prevent optimization when there is a possibility of a problem occurring. However, the technique described in Patent Literature 1 does not improve a profiling technique using sampling. Therefore, it cannot reduce speed overhead due to profiling.
In Japanese Patent Publication No. 2005-0071135 (Patent Literature 2), a technique is disclosed in which a process is added to intermediate-format data to record values and their frequency of occurrence, to a maximum of two pairs, when the frequency of a value being given to a variable inside a procedure during execution is at least 50%, in order to obtain primary profile information; a process is added to the intermediate-format data to record the frequency of occurrence of the two values in the primary profile information and the number of times the procedure is executed, in order to obtain final profile information; a value with a frequency of occurrence of at least 50% with respect to a variable is determined based on the final profile information; the procedure is optimized with respect to this value; and the target program is generated. In this technique, the number of occurrences of a value given to a variable in a procedure can be estimated without error. In particular, profile information reliably recording values which exceed a 50% frequency of occurrence can be outputted. However, the technique described in Patent Literature 2 does not improve a profiling technique using sampling. Therefore, it cannot reduce speed overhead due to profiling.
In Japanese Patent Publication No. 2002-0304302 (Patent Literature 3), an optimization device for microprocessor object code is disclosed which includes a compiling unit for compiling inputted code, which is a compiling program recorded on a recording medium, using profile data to generate primary object code, and a simulator for simulating the primary object code and generating profile data. The simulator in this optimization device analyzes instruction code in the primary object code generated by the compiling unit, executes the process corresponding to the instruction code, and detects data code with a high access frequency based on data access information, in which the number of times the data code is accessed during execution of the instruction code is recorded by address and size of the accessed data code. The data code is then rearranged in a cache area, which is a data code area accessible by a single instruction, secondary object code is generated, and the instruction code in the secondary object code is analyzed and executed. This technique can improve the execution speed of a program because the data code in the object code is rearranged and the method for accessing the data is optimized even when the data code is out of displacement range. However, the technique described in Patent Literature 3 does not improve a profiling technique using sampling. Therefore, it cannot reduce speed overhead due to profiling.
In Japanese Patent Publication No. 1999-0039167 (Patent Literature 4), a technique is disclosed in which a resource assigning unit assigns an internal variable generated by a compiler to a machine resource such as a register or memory; an alias accessibility unit records in assigned resource information whether or not there is a possibility of alias access for a memory access instruction included in a sequence of instructions when an assembler code generating unit outputs the sequence of instructions; and an assembler optimization unit references allocation information and performs optimization at the assembler level. This technique can ease constraints due to the presence of indirect addressing-type memory access instructions, and can improve execution times and program sizes. However, the technique described in Patent Literature 4 does not improve a profiling technique using sampling. Therefore, it cannot reduce speed overhead due to profiling.