With the rapid development of the Internet and the dramatic growth of WWW applications, Java has become a popular development and programming language on the Internet. The Java family currently includes three major members: J2ME (Java 2 Micro Edition), J2SE (Java 2 Standard Edition), and J2EE (Java 2 Enterprise Edition). J2EE, owing to advantages that include cross-platform portability, the availability of open-source libraries, a huge server-side deployment base, and coverage of most W3C standards, has become very popular in enterprise application development. Millions of J2EE applications are running on J2EE application servers, and more and more are under development.
With the popularity of J2EE applications, debugging and problem determination have become an important issue. Several standards related to this issue have been published, such as the “Java Management Extensions (JMX) Specification”, the “Logging API Specification (JSR 47)”, and the “Monitoring and Management Specification for the Java Virtual Machine (JSR 174)”.
The problems discovered during system diagnosis can be categorized, for example, as follows:
(1) Functional or integration misbehavior;
(2) Poor performance;
(3) Crashes;
(4) Hangs;
(5) Memory leaks.
Among the five categories of problems listed above, the latter three are hard to detect, because they typically appear only under high-volume conditions or after the system has been running for a long time. It is therefore typically difficult to capture enough information for problem determination. Areas where problems may arise include the JVM (Java Virtual Machine) itself, native code, the Java application, the system or a system resource, a sub-system (such as database nodes), hardware, and so on.
Some information is available for problem determination in various application server environments conforming to J2EE standards, including:
JavaDump: A JavaDump is produced by default when the JVM terminates unexpectedly. It summarizes the state of the JVM at that instant.
HeapDump: A HeapDump is generated at the request of the user. Finer control over the timing of a HeapDump can be specified with the -Xdump:heap option.
SystemDump: A SystemDump is also produced by the JVM. It contains information about the active processes, threads, and system memory, and is configured through the -Xdump:system option.
Trace data: Trace data includes detailed data collected by a running JVM.
Snap trace: Snap trace contains a small amount of the trace data that a running JVM collects, and is similar to the normal trace data.
Profiling: Profiling is a high-level log file that can provide very detailed records of the activities of the application server.
Garbage collection data: Garbage collection data is produced by the JVM when the -verbose:gc option is specified. It is used to analyze garbage collection problems in user applications.
Other data available for problem determination include, for example, JIT (Just-In-Time) data, class loading data, shared classes data, and so on.
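As a rough illustration of the kind of point-in-time data listed above, the following sketch reads heap, thread, and garbage collection figures through the standard java.lang.management API defined by JSR 174. The class name JvmSnapshot and the output keys are hypothetical and chosen only for this example; the sketch is not the dump mechanism of any particular JVM or application server.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: captures a one-shot summary of JVM state.
class JvmSnapshot {

    /** Returns heap, thread, and GC figures as "key=value" lines. */
    static String capture() {
        StringBuilder sb = new StringBuilder();

        // Heap usage at this instant (bytes).
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        sb.append("heap.used=").append(heap.getUsed()).append('\n');
        sb.append("heap.committed=").append(heap.getCommitted()).append('\n');

        // Number of live threads at this instant.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        sb.append("threads.live=").append(threads.getThreadCount()).append('\n');

        // Cumulative collection counts per collector since JVM start.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append("gc.").append(gc.getName())
              .append(".count=").append(gc.getCollectionCount()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(capture());
    }
}
```

Like a dump, such a snapshot describes only the state at the moment it is taken; it carries no record of the activity that led up to that state.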
Although the above information can be used for problem determination, users still face a dilemma between cost and usability. On the one hand, an application server may crash or dump unexpectedly while running, and the time of the failure cannot be predicted. Finding the cause of a crash is difficult work. The system does provide some basic information for problem analysis, including basic log files (e.g., SystemOut.log) and the dump files mentioned above. However, the basic log files cannot provide sufficiently detailed information about the activities of the application server, so relying on them alone is not enough for problem determination when the application server crashes. The normal dump files contain only the status of the application server at the moment of the dump, with no history of its activities, and thus are not sufficient for problem determination either. On the other hand, if the user enables diagnostic log/trace functions to record the activities of the application server in detail, system performance degrades significantly, since these functions occupy a large amount of memory, slow the system down considerably, and may also cause unexpected problems. In addition, too much log/trace data amounts to an information explosion, which also makes it difficult for the user to find the problem. Therefore, enabling such diagnostic log/trace functions in a production environment is impractical.
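The volume side of this trade-off can be sketched with the standard java.util.logging API (JSR 47). The example below is a minimal sketch under stated assumptions: the logger name, the four messages per simulated request, and the class name LogVolumeDemo are all hypothetical, and a counting handler stands in for a real log file. It only counts records; it does not measure memory or CPU cost.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical demo: compares record volume at a basic and a detailed log level.
class LogVolumeDemo {

    /** Handler that only counts the records it receives. */
    static final class CountingHandler extends Handler {
        int count;
        @Override public void publish(LogRecord record) { count++; }
        @Override public void flush() { }
        @Override public void close() { }
    }

    /** Simulates the given number of requests and returns how many records were emitted. */
    static int recordsEmitted(Level level, int requests) {
        // Hypothetical logger name; one logger per level keeps runs independent.
        Logger logger = Logger.getLogger("demo.appserver." + level.getName());
        logger.setUseParentHandlers(false); // keep output away from the console
        logger.setLevel(level);
        CountingHandler handler = new CountingHandler();
        logger.addHandler(handler);
        try {
            for (int i = 0; i < requests; i++) {
                logger.info("request " + i + " served");                // basic log
                logger.fine("request " + i + " entered servlet");       // diagnostic
                logger.finer("request " + i + " acquired connection");  // diagnostic
                logger.finest("request " + i + " parameters dumped");   // diagnostic
            }
        } finally {
            logger.removeHandler(handler);
        }
        return handler.count;
    }

    public static void main(String[] args) {
        System.out.println("INFO:   " + recordsEmitted(Level.INFO, 1000));
        System.out.println("FINEST: " + recordsEmitted(Level.FINEST, 1000));
    }
}
```

With these assumed messages, enabling the most detailed level emits four records per request instead of one. A real application server typically has far more trace points, so the gap between basic and diagnostic logging would be correspondingly larger.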
FIG. 1 illustrates the relationship between the usability of various means for problem determination and their costs. As can be seen from FIG. 1, those means with higher usability have higher costs, and thus are difficult to implement in a production environment.