The present invention relates to a technique for calling and executing a program on a computer, and more particularly, to profiling.
Programming language processors and execution systems used in server environments have conventionally relied on dynamic script languages such as PHP and on more static programming languages such as Java®. Recently, in order to call Java® class assets easily from PHP or the like, it has become popular to implement dynamic script language processors on top of a static programming language such as Java®.
In particular, P8 and Quercus for PHP, JRuby for Ruby, Jython for Python, Groovy, and the like, all of which run on a Java® virtual machine, are known.
Such a dynamic script language processor on Java®, however, is often slower than a native implementation. One reason for the reduced speed is that the method inlining of the Java® JIT compiler cannot be used effectively because of the dynamic typing of the language.
FIG. 1 shows an example of a typical handler invocation path in a dynamic script language processor on Java®. In FIG. 1, a virtual call x.add lies on the call path that starts from FuncA. Which real method (processing handler) this call jumps to depends heavily on the calling context. That is, x.add is a truly polymorphic call.
In order to inline the appropriate processing handler (in this example, DInt.add) when FuncA is JIT-compiled, profiling information that includes calling-context information is required, such as “When a call is made in the order of FuncA->Handler.add->Operator.add, the virtual call x.add jumps to DInt.add.”
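The situation described for FIG. 1 can be illustrated with a minimal Java sketch. Only FuncA, Handler.add, Operator.add, x.add, and DInt.add are named in the text; the DValue interface, the DString class, and the second caller funcB are assumptions introduced here to make the call polymorphic.

```java
// Hypothetical reconstruction of the call path described for FIG. 1.
// DValue, DString and funcB are illustrative assumptions.
final class CallPathSketch {
    interface DValue {
        DValue add(DValue other);
        String show();
    }

    static final class DInt implements DValue {
        final long v;
        DInt(long v) { this.v = v; }
        public DValue add(DValue o) { return new DInt(v + ((DInt) o).v); }
        public String show() { return Long.toString(v); }
    }

    static final class DString implements DValue {
        final String s;
        DString(String s) { this.s = s; }
        public DValue add(DValue o) { return new DString(s + o.show()); }
        public String show() { return s; }
    }

    static final class Operator {
        // The virtual call x.add: its target depends on the calling context.
        static DValue add(DValue x, DValue y) { return x.add(y); }
    }

    static final class Handler {
        static DValue add(DValue x, DValue y) { return Operator.add(x, y); }
    }

    // On this path the virtual call always resolves to DInt.add.
    static DValue funcA() { return Handler.add(new DInt(1), new DInt(2)); }

    // A second caller shares the same handlers but reaches DString.add.
    static DValue funcB() { return Handler.add(new DString("a"), new DString("b")); }

    public static void main(String[] args) {
        System.out.println(funcA().show()); // prints 3
        System.out.println(funcB().show()); // prints ab
    }
}
```

Seen from Operator.add alone, x.add has two receiver types. Only the knowledge that the call chain began in funcA makes DInt.add the single safe inlining candidate.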
Accordingly, it has been conventionally common to perform profiling and to inline the code determined to be appropriate on the basis of the collected information.
FIG. 2 shows an example in which DInt.add is appropriately inlined into FuncA with the use of obtained profiling information.
The following are conventional techniques for acquiring profiling information.
Japanese Patent Laid-Open No. H09-62544 discloses a profiling method in which, in order to grasp the behavior of a program for a parallel computer described in an original source code in a sequential form, an instrumentation code is inserted into a converted code of the program optimization-converted by a compiling process, the instrumentation code instructing measurement of profile data at the time of executing the program, the method being configured so that a profile initialization process for collecting the original source code information of the program is performed before the optimization conversion by the compiling process.
Japanese Patent Laid-Open No. H11-212837 discloses a configuration including: a static analysis/profile processing inserting section detecting function calls in a source program, storing them into a database by attaching an identification number for each type of call pair, inserting profile processing into the source program for each function call, and setting an area for a table storing the number of function calls for each identification number, in order to reduce a memory area and overhead at the time of performing the profile processing; a compiling/program executing section compiling and executing the profile processing included source program and, when the profile processing is performed, incrementing the number of function calls in the table with an identification number corresponding to the type of call pair as an index; and a profile information integrating section generating information about the number of calls for each type of call pair by reading the number of function calls in the table for each identification number.
Japanese Patent Laid-Open No. 2002-132542 discloses that: in order to reduce the size of a table required for collecting call pair information related to dynamic function calls in a program, dynamic caller functions and dynamic callee functions are picked up, each of which is given a function ID; a profile processing inserting section generates a profile processing included source program which includes processing for securing an area for a dynamic call pair information storage table storing, for each dynamic call pair which is a pair of the function ID of a dynamic caller function and the function ID of a dynamic callee function, the number of calls of the dynamic call pair, and dynamic call profile processing; and a compiling/program executing section compiles and executes the source program and collects the number of calls for each of call pairs related to dynamic calls using the table.
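The call-pair tables described in the two preceding publications can be pictured with a short Java sketch. It is an illustration under assumptions, not the implementation of either publication: the class name, the method names, and the use of a string key to assign identification numbers are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Sketch of a call-pair table: one identification number per
// (caller, callee) pair, used as an index into an array of counters.
// All names here are hypothetical.
final class CallPairTableSketch {
    private final Map<String, Integer> idOf = new HashMap<>();
    private long[] counts = new long[8];

    // Analysis step: assign an identification number to a call pair.
    int register(String caller, String callee) {
        String key = caller + "->" + callee;
        Integer id = idOf.get(key);
        if (id == null) {
            id = idOf.size();
            idOf.put(key, id);
            if (id >= counts.length) {
                counts = Arrays.copyOf(counts, counts.length * 2);
            }
        }
        return id;
    }

    // Inserted profile processing: a single array increment per call.
    void hit(int id) { counts[id]++; }

    // Integration step: read back the number of calls for a call pair.
    long count(String caller, String callee) {
        Integer id = idOf.get(caller + "->" + callee);
        return id == null ? 0 : counts[id];
    }
}
```

Each counter describes one caller and one callee only, so the table by itself carries no information about longer call chains.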
However, in terms of profiling cost and comprehensiveness, the profiling processes of these conventional techniques are not well suited to the processing handlers of a dynamic script language processor.
Accordingly, a simple method is conceivable in which stack traversing is performed as necessary at the time of profiling. However, the cost of this method is high.
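A minimal Java sketch of such stack traversal is shown below, assuming the standard `Throwable.getStackTrace()` API is used to obtain the frames. The class name, the `profile` helper, and the lower-case method names standing in for FuncA, Handler.add, Operator.add, and DInt.add are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: record the calling context by walking the stack on every call.
// All names are hypothetical stand-ins for the handlers in FIG. 1.
final class StackWalkProfilerSketch {
    static final Map<String, Integer> contextCounts = new HashMap<>();

    // Records up to `depth` frames above profile(), outermost frame first.
    static void profile(int depth) {
        // Materializing the stack trace is the expensive part.
        StackTraceElement[] frames = new Throwable().getStackTrace();
        StringBuilder ctx = new StringBuilder();
        // frames[0] is profile() itself, so the context starts at frames[1].
        for (int i = Math.min(depth, frames.length - 1); i >= 1; i--) {
            ctx.append(frames[i].getMethodName());
            if (i > 1) ctx.append("->");
        }
        contextCounts.merge(ctx.toString(), 1, Integer::sum);
    }

    static int dIntAdd(int a, int b) { profile(4); return a + b; }
    static int operatorAdd(int a, int b) { return dIntAdd(a, b); }
    static int handlerAdd(int a, int b) { return operatorAdd(a, b); }
    static int funcA() { return handlerAdd(1, 2); }
}
```

Every profiled call constructs a full stack trace and a string key, which is far more work than the few instructions a small handler such as an integer addition executes.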
A method of temporarily embedding profiling code at the time of upgrading/compiling can be applied only to the subset of methods that are executed very frequently.
In “Adaptive Online Context-Sensitive Inlining” by Kim Hazelwood and David Grove, International Symposium on Code Generation and Optimization, 23-26 Mar. 2003, a profiling process by sampling is disclosed. In this method, however, the collected information is not comprehensive. In particular, the processing handlers of a dynamic script language processor are small leaf methods, and therefore sampling does not capture them well. Furthermore, since stack traversing, which is heavy processing, is required, the sampling rate cannot be high.
In “A comparative study of static and profile-based heuristics for inlining” by Matthew Arnold, Stephen Fink, Vivek Sarkar and Peter F. Sweeney, ACM SIGPLAN Notices, Volume 35, Issue 7 (July 2000), Pages 52-64, a profiling process using a call graph is disclosed. However, in the case where calls join and separate as shown in FIG. 1, it is not possible to correctly determine an inlining destination.
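The limitation of an edge-based call graph profile can be seen in a small Java sketch. It records only caller-to-callee edge counts; the class itself and the names FuncB and DString.add are assumptions added for illustration and do not come from the cited paper.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a call graph profile that keeps one counter per edge.
// FuncB and DString.add are hypothetical additions to the FIG. 1 names.
final class CallGraphAmbiguitySketch {
    private final Map<String, Integer> edges = new HashMap<>();

    // Records every caller->callee edge along one executed call path.
    void record(String... path) {
        for (int i = 0; i + 1 < path.length; i++) {
            edges.merge(path[i] + "->" + path[i + 1], 1, Integer::sum);
        }
    }

    int count(String caller, String callee) {
        return edges.getOrDefault(caller + "->" + callee, 0);
    }

    Map<String, Integer> edges() { return edges; }
}
```

Suppose one run records the paths FuncA, Handler.add, Operator.add, DInt.add and FuncB, Handler.add, Operator.add, DString.add, while another run swaps the two final targets. Both runs produce the same edge counts, so the profile cannot say which handler should be inlined into FuncA.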
In “Accurate, efficient, and adaptive calling context profiling” by Xiaotong Zhuang, Mauricio J. Serrano, Harold W. Cain and Jong-Deok Choi, Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation, a technique is disclosed which enables correct determination by creating a calling context tree. This technique, however, requires additional processing for creating the tree and memory for holding it.
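A generic calling context tree can be sketched in Java as follows. This is a textbook-style illustration under assumptions, not the adaptive technique of the cited paper; the class, its enter/exit interface, and the FuncB and DString.add names used in the note below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a calling context tree: one node per distinct call path.
// This is a generic illustration; all names are hypothetical.
final class CallingContextTreeSketch {
    static final class Node {
        final String method;
        final Node parent;
        final Map<String, Node> children = new HashMap<>();
        long calls;
        Node(String method, Node parent) {
            this.method = method;
            this.parent = parent;
        }
    }

    private final Node root = new Node("<root>", null);
    private Node current = root;
    private int nodeCount = 1;

    // Called on method entry: descend to (or create) the child node.
    void enter(String method) {
        Node child = current.children.get(method);
        if (child == null) {
            child = new Node(method, current);
            current.children.put(method, child);
            nodeCount++;
        }
        child.calls++;
        current = child;
    }

    // Called on method exit: climb back to the caller's node.
    void exit() {
        if (current.parent != null) current = current.parent;
    }

    // Number of times the given full path from the root was executed.
    long calls(String... path) {
        Node n = root;
        for (String m : path) {
            n = n.children.get(m);
            if (n == null) return 0;
        }
        return n.calls;
    }

    int size() { return nodeCount; }
}
```

Because Handler.add and Operator.add get separate nodes under FuncA and under FuncB, the tree distinguishes the two contexts, at the price of bookkeeping on every call and return and of one node per distinct path.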
In “HPROF: A Heap/CPU Profiling Tool in J2SE 5.0”, http://java.sun.com/developer/technicalArticles/Programming/HPROF.html, HPROF, which is a profiler using JVMTI, is described. Though this profiler is capable of acquiring comprehensive information, it is slow. Furthermore, memory for holding a log and post-processing for aggregating the context information are required.
According to the technique disclosed in “Probabilistic calling context” by Michael D. Bond and Kathryn S. McKinley, Proceedings of the 22nd annual ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications, Pages 97-112, 2007, the memory for holding data can be minimized, but it is difficult to know online which path is used. Therefore, the technique cannot be used for inlining.
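The trade-off can be illustrated with a short Java sketch of the kind of running value that technique maintains. It assumes an update of the form V' = 3 × V + cs, which is how the cited paper's function is commonly stated; the class name and the small integer call-site identifiers are hypothetical.

```java
// Sketch of a probabilistic calling context value.
// Assumption: the update function is V' = 3 * V + cs, where cs is a
// call-site identifier; the identifiers used with it are made up.
final class ProbabilisticContextSketch {
    // Java int arithmetic wraps on overflow, i.e. works modulo 2^32.
    static int next(int v, int callSiteId) {
        return 3 * v + callSiteId;
    }

    // Folds a sequence of call sites (outermost first) into one value.
    static int contextValue(int... callSiteIds) {
        int v = 0;
        for (int cs : callSiteIds) {
            v = next(v, cs);
        }
        return v;
    }
}
```

The whole context is held in a single integer, and the value depends on the order of the call sites, so different paths usually receive different values. The integer cannot be decoded back into the sequence of methods, however, which is why it does not directly tell a JIT compiler what to inline.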