The present invention relates in general to Java Object corruption issues, and more specifically, to determining the occurrence of a garbage collection (GC) ‘scan-miss’ for a Java object.
Java Object corruption issues are difficult to diagnose and even more difficult to fix. Information that is initially made available to the analyzer (as First Failure Data Capture) is minimal, and, usually, multiple iterations of tracing and contextual diagnosis are necessary to identify the root cause and eventually fix it. A Java Object corruption issue typically manifests itself when a certain piece of memory on the Java Heap is expected to hold an object of a certain type, but at the point of access, or profile, ends up pointing to:
1. Garbage (freed memory); or
2. A different object than the one expected; or
3. The middle of an object, as opposed to the beginning of the object.
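The three manifestations listed above can be illustrated with a minimal sketch, assuming a toy heap model rather than any real JVM internals. The names `ToyHeap`, `RefState`, `allocate` and `classify` are hypothetical, and the object type is used here only as a stand-in for "the object that was expected".

```java
import java.util.Map;
import java.util.TreeMap;

/** Possible outcomes when a reference is checked against the toy heap. */
enum RefState { VALID, GARBAGE, DIFFERENT_OBJECT, MID_OBJECT }

/** Hypothetical model of a heap: start address -> live object (size and type). */
final class ToyHeap {
    private static final class Cell {
        final long size;
        final String type;
        Cell(long size, String type) { this.size = size; this.type = type; }
    }

    private final TreeMap<Long, Cell> live = new TreeMap<>();

    /** Records a live object occupying [start, start + size). */
    void allocate(long start, long size, String type) {
        live.put(start, new Cell(size, type));
    }

    /** Classifies what a reference points to at the moment it is accessed. */
    RefState classify(long address, String expectedType) {
        // Nearest live object starting at or below the address.
        Map.Entry<Long, Cell> e = live.floorEntry(address);
        if (e == null || address >= e.getKey() + e.getValue().size) {
            return RefState.GARBAGE;            // case 1: freed or never-allocated memory
        }
        if (address != e.getKey()) {
            return RefState.MID_OBJECT;         // case 3: inside an object, not at its start
        }
        return e.getValue().type.equals(expectedType)
                ? RefState.VALID
                : RefState.DIFFERENT_OBJECT;    // case 2: another object occupies the slot
    }
}
```

In this sketch, a reference that lands inside a live object is reported as the middle-of-object case before any type comparison is attempted, because the type of a partial object is not meaningful.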
The reasons for these types of corruption issues are multifold, with the causal factor being anywhere in the Just-in-Time (JIT) compiler component of the JAVA Virtual Machine (JVM) (JAVA and JVM are trademarks of Oracle Corp.), the thread suspension mechanism in the JVM, operating system efficacy (for example, weak consistency on AIX) (AIX is a trademark of IBM Corp.), or the garbage collector itself. With the exception of a bug in the garbage collector software, object corruption issues fundamentally boil down to a certain object not being scanned under a certain context by the garbage collector. For example, the JIT may (as a part of an optimization) place an object in an area that is traditionally not scanned (e.g., in floating point registers, which are not scanned in certain implementations of Virtual Machines), or the object may be a part of a thread stack range that is not scanned by the garbage collector.
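How an unscanned area leads to a missed object can be sketched with a toy mark phase. This is a simplified illustration under stated assumptions, not the marking logic of any actual JVM: `RootArea`, `ToyMarker` and `mark` are hypothetical names, objects are plain integer identifiers, and the set of scanned areas is passed in explicitly.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical places in which an object reference may reside. */
enum RootArea { THREAD_STACK, GENERAL_REGISTER, FLOAT_REGISTER }

final class ToyMarker {
    /**
     * Marks every object reachable from the roots found in the scanned areas.
     * Roots held in an area that is absent from scannedAreas are never visited.
     *
     * @param edges        object id -> ids of the objects it references
     * @param roots        area -> object ids referenced directly from that area
     * @param scannedAreas areas this toy collector actually scans
     */
    static Set<Integer> mark(Map<Integer, List<Integer>> edges,
                             Map<RootArea, List<Integer>> roots,
                             Set<RootArea> scannedAreas) {
        Set<Integer> marked = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        for (RootArea area : scannedAreas) {
            for (Integer root : roots.getOrDefault(area, Collections.<Integer>emptyList())) {
                if (marked.add(root)) {
                    work.push(root);
                }
            }
        }
        // Trace references transitively from the marked roots.
        while (!work.isEmpty()) {
            Integer current = work.pop();
            for (Integer child : edges.getOrDefault(current, Collections.<Integer>emptyList())) {
                if (marked.add(child)) {
                    work.push(child);
                }
            }
        }
        return marked;
    }
}
```

If an object is referenced only from `FLOAT_REGISTER` and that area is left out of `scannedAreas`, the object is absent from the returned set; a subsequent sweep would treat it as garbage even though the program still holds a reference to it.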
Therefore, the first step toward diagnosing an object corruption issue is to ascertain whether the invalid reference (or the corrupt object reference) was, indeed, not scanned, and to retrieve the context under which the object was not scanned, i.e., the scan-miss context. Typically, the scan-miss context would encapsulate the following information:
1. After which garbage collection cycle was the invalid object accessed (or the last complete GC cycle before the failure);
2. During which garbage collection cycle was the object not scanned (should normally be the same as above, but not necessarily);
3. At the point of not being scanned/marked where was the object residing:
   a. Directly on a thread stack? Corresponding Stack Slot?
   b. On a register? Which one?
   c. Somewhere else? (Ideally, there is no other place, this would indicate that the object reference was not in any place available for scanning)
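The information enumerated above could be carried in a small value type. The following is a minimal sketch, assuming GC cycles are identified by a numeric counter and that a stack slot or register is identified by a string; `ScanMissContext`, `Residence` and all field and method names are hypothetical and chosen only for illustration.

```java
/** Where the reference resided when it was not scanned (item 3 above). */
enum Residence { THREAD_STACK, REGISTER, ELSEWHERE }

/** Hypothetical container for the scan-miss context of one invalid reference. */
final class ScanMissContext {
    final long lastCompleteGcCycle;  // item 1: last complete GC cycle before the failure
    final long missedGcCycle;        // item 2: GC cycle in which the object was not scanned
    final Residence residence;       // item 3: kind of location holding the reference
    final String location;           // stack slot or register name; unused for ELSEWHERE

    ScanMissContext(long lastCompleteGcCycle, long missedGcCycle,
                    Residence residence, String location) {
        if (residence == null) {
            throw new IllegalArgumentException("residence is required");
        }
        if (residence != Residence.ELSEWHERE && (location == null || location.isEmpty())) {
            throw new IllegalArgumentException("a stack slot or register name is required");
        }
        this.lastCompleteGcCycle = lastCompleteGcCycle;
        this.missedGcCycle = missedGcCycle;
        this.residence = residence;
        this.location = location;
    }

    /** True in the normal case, where items 1 and 2 name the same cycle. */
    boolean missedInLastCompleteCycle() {
        return lastCompleteGcCycle == missedGcCycle;
    }

    /** One-line summary suitable for a diagnostic log. */
    String describe() {
        String where = residence == Residence.ELSEWHERE
                ? "no scannable location"
                : residence + " " + location;
        return "accessed after GC cycle " + lastCompleteGcCycle
                + ", not scanned in GC cycle " + missedGcCycle
                + ", residing in " + where;
    }
}
```

Keeping the two cycle numbers as separate fields reflects the note in item 2 that they normally coincide but need not.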