1. Technical Field
The present invention relates generally to program performance management, and particularly to estimating the effect that a modification of a Java program has on the performance of that program.
2. Description of Related Art
In recent years, the use of the Java platform, a product available from Sun Microsystems, Inc., has greatly increased. In particular, with the rise of the Internet, the Java programming language has become popular among programmers for various types of applications, such as Web applications and enterprise applications. This popularity stems from characteristics that the Java programming language provides, for example, platform independence and multi-threading support.
Platform independence is enabled in Java through the use of a Java Virtual Machine (JVM). A program written in the Java programming language is first compiled into standard bytecodes, as defined in the JVM specification. When the program runs, the JVM interprets and executes each bytecode. Another technology that was introduced in Java is “Just-in-Time” (JIT) compilation. In this case, the bytecodes are compiled into native code before they are executed. The native code consists of machine instructions that are specific to the platform on which the JVM is running. Thus, any computer platform may run Java applications as long as it contains a Java runtime environment, which interprets the bytecodes and optionally compiles them into native code at runtime (hence the name) for the specific platform.
The JVM uses garbage collection for memory management. The Java programming language enables programmers to create objects without having to destroy them explicitly when they are no longer needed. The language ensures that memory allotted to unused objects will eventually be reclaimed and returned to the heap. An object is unused if there are no other objects that reference it. The heap is the portion of the JVM runtime environment from which the memory needed to create an object is taken. Thus, the JVM keeps track of the free memory in the heap so that it knows where to obtain memory when needed. The JVM has a default heap size, although users can specify the minimum and maximum size when invoking the JVM runtime. The heap starts at the minimum size and continues to grow as the program needs more memory. However, it cannot exceed the maximum. The heap can also shrink when it is determined to be too big and few objects are being created, provided that heap shrinking has been allowed.
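The heap quantities described above can be observed from inside a running program. The following is a minimal sketch that uses the standard java.lang.Runtime methods totalMemory(), freeMemory() and maxMemory(); the class name HeapSnapshot and its field names are hypothetical and chosen only for illustration.

```java
/** Illustrative snapshot of the JVM heap as reported by java.lang.Runtime. */
final class HeapSnapshot {
    final long committedBytes; // current heap size (Runtime.totalMemory)
    final long freeBytes;      // free memory inside the current heap (Runtime.freeMemory)
    final long maxBytes;       // limit the heap may grow to (Runtime.maxMemory)

    HeapSnapshot(long committedBytes, long freeBytes, long maxBytes) {
        this.committedBytes = committedBytes;
        this.freeBytes = freeBytes;
        this.maxBytes = maxBytes;
    }

    /** Reads the three values from the running JVM. */
    static HeapSnapshot current() {
        Runtime rt = Runtime.getRuntime();
        return new HeapSnapshot(rt.totalMemory(), rt.freeMemory(), rt.maxMemory());
    }

    /** Memory currently occupied by objects, including garbage not yet collected. */
    long usedBytes() {
        return committedBytes - freeBytes;
    }

    public static void main(String[] args) {
        HeapSnapshot s = current();
        System.out.printf("heap size: %d bytes, used: %d bytes, maximum: %d bytes%n",
                s.committedBytes, s.usedBytes(), s.maxBytes);
    }
}
```

On commonly used JVMs the minimum and maximum heap sizes are set with the -Xms and -Xmx options, for example java -Xms64m -Xmx512m HeapSnapshot. Note that the used value includes unused objects that have not yet been reclaimed, so it is only an upper bound on the footprint.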
Garbage collection (GC) is an event that takes place when an object needs to be created but there is not enough free memory to create that object. A garbage collector thread executes to detect all unused objects and return their memory to the heap as free memory. The goal is to collect enough free memory for the object that is being created. Garbage collection events suspend all running threads (except the garbage collection thread). Garbage collection events can be a performance bottleneck when they happen very frequently, because during this time no other threads can run and therefore no actual work can be done, which reduces throughput and increases response time.
There are two major concerns with garbage collection. (1) The duration of a GC event: the longer a GC event lasts, the longer the other threads are suspended. Thus, a very long GC event can be noticeable through poor response time. The duration of a GC event depends on the footprint, which is the amount of used memory (or active objects), and on the size of the heap. (2) The frequency of GC events: the more often GC events take place, the more time threads spend suspended, and throughput is therefore reduced. The frequency of GC events also depends on the footprint and the size of the heap, as well as on the allocation rate, that is, how fast the program creates objects.
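The dependence of GC frequency on footprint, heap size and allocation rate can be illustrated with a rough back-of-the-envelope model. The sketch below is an assumption made for illustration only and is not the method of the invention: it supposes a fixed heap size, a constant allocation rate, and a constant pause per GC event (even though, as noted above, the duration in practice also depends on the footprint). The class GcModel and its method names are hypothetical.

```java
/** Simplified, illustrative model of GC frequency (assumptions stated in the text above). */
final class GcModel {
    private GcModel() {}

    /**
     * Time between GC events: the free part of the heap (heap size minus footprint)
     * is filled at the allocation rate, and a GC event occurs when it is exhausted.
     */
    static double secondsBetweenCollections(double heapMb, double footprintMb,
                                            double allocMbPerSec) {
        if (footprintMb >= heapMb || allocMbPerSec <= 0) {
            throw new IllegalArgumentException("need footprint < heap and a positive rate");
        }
        return (heapMb - footprintMb) / allocMbPerSec;
    }

    /** Fraction of elapsed time during which application threads are suspended. */
    static double pausedFraction(double intervalSec, double pauseSec) {
        return pauseSec / (intervalSec + pauseSec);
    }

    public static void main(String[] args) {
        double heapMb = 512, allocMbPerSec = 50, pauseSec = 0.25; // assumed example values
        for (double footprintMb : new double[] {312, 112}) {
            double interval = secondsBetweenCollections(heapMb, footprintMb, allocMbPerSec);
            System.out.printf("footprint %.0f MB: GC every %.1f s, %.1f%% of time paused%n",
                    footprintMb, interval, 100 * pausedFraction(interval, pauseSec));
        }
    }
}
```

With these assumed numbers, reducing the footprint from 312 MB to 112 MB doubles the interval between GC events from 4 to 8 seconds, which roughly halves the share of time spent suspended.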
Programmers often attempt to reduce this burden on the system by minimizing GC events, which can be achieved by creating objects conservatively. The use of objects is therefore a key factor in optimizing the performance of Java programs, because it affects the footprint required by an application. The smaller the footprint, the better the performance. Thus, the questions often asked are: Can we reduce the footprint of a Java application? How can we reduce it? Do we know how much improvement we will get if we reduce the footprint? The first question is easy to answer, as programmers can examine their programs to see whether the use of objects can be optimized further. For the second question, there are known ways to reduce footprint, such as object pools and object reuse. For the last question, the actual benefit manifests itself as an improvement in throughput, more so than in response time. However, quantifying the actual benefit can be done only by actually rewriting the program to reduce footprint, running the program, and comparing the throughput with previous runs. Thus, programmers use a ‘trial and error’ approach in optimizing garbage collection. A utility called verbosegc in Java allows programmers to gather garbage collection statistics such as the number of garbage collection events and the duration of each garbage collection. However, there is no systematic way to determine the change in performance without repeatedly modifying the program and obtaining measurements until a desired result is reached. This trial and error approach can be time consuming, as a program may need to be modified several times and tested each time to see if the performance target has been reached.
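The kind of statistics gathered during each trial run (number of GC events and time spent in them) can also be read programmatically. The sketch below is an illustrative alternative to verbosegc output, not a description of that utility: it uses the standard java.lang.management API (ManagementFactory.getGarbageCollectorMXBeans(), getCollectionCount(), getCollectionTime()). The class GcStats and the allocation loop in main are hypothetical and serve only to produce some GC activity.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Illustrative totals of GC activity, summed over all collectors in the JVM. */
final class GcStats {
    final long collections;      // number of GC events so far
    final long collectionMillis; // approximate accumulated time spent in GC

    GcStats(long collections, long collectionMillis) {
        this.collections = collections;
        this.collectionMillis = collectionMillis;
    }

    /** Reads the current totals from the running JVM. */
    static GcStats sample() {
        long count = 0, millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when a collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            millis += Math.max(0, gc.getCollectionTime());
        }
        return new GcStats(count, millis);
    }

    /** GC activity that occurred between an earlier sample and this one. */
    GcStats since(GcStats earlier) {
        return new GcStats(collections - earlier.collections,
                           collectionMillis - earlier.collectionMillis);
    }

    public static void main(String[] args) {
        GcStats before = sample();
        byte[][] keep = new byte[16][];
        for (int i = 0; i < 200_000; i++) {
            keep[i % keep.length] = new byte[16 * 1024]; // short-lived garbage
        }
        GcStats delta = sample().since(before);
        System.out.printf("GC events: %d, time in GC: %d ms%n",
                delta.collections, delta.collectionMillis);
    }
}
```

Comparing such samples before and after a program modification is exactly the measure-and-compare step of the trial and error approach; it reports what happened in a run but does not predict the effect of a change that has not yet been made.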
Therefore, it would be advantageous to have an improved method and apparatus for determining a close approximation of the benefit of reducing the footprint of a Java application in a systematic manner, without using an iterative ‘trial and error’ approach. This is particularly useful where there is a target throughput: given the gap between the current throughput and the target throughput, an estimate of how much reduction in memory footprint is needed to close that gap facilitates the whole process.