1. Field of the Invention
The present invention generally relates to the computer field, particularly to memory dumping, and more particularly to a method and apparatus for partitioned memory dumping in a software system.
2. Description of Related Art
Memory dumping refers to copying the content of a main memory at a given moment to a more durable medium such as a hard disk. It often occurs when a computer program aborts, and records the state of the program's working memory at that time; the dump can therefore be used to diagnose and debug errors in the computer program.
For a software developer, software defects caused by inadvertent and/or malicious use of memory are very common, and such errors are difficult to diagnose and correct. A severe memory error can halt the whole software system. Therefore, global knowledge of how the memory is used is very useful for software analysis and optimization. For a system administrator, when a plurality of software packages are deployed in the same software runtime system, memory contention between these software packages may leave the whole runtime system in an unstable state. Using memory dumping to learn the memory state of the runtime system at a certain moment therefore helps to find and remove memory contention among the software packages. In conclusion, memory dumping is a very important method for determining and debugging memory problems, especially when the target system crashes and the erroneous code cannot be located by other means.
Memory dumping is usually global, i.e. all the content in a memory space is copied out. However, global dumping has the following defects. The first is low efficiency: global dumping consumes a lot of resources such as disk space, CPU time and I/O bandwidth, and analyzing a very large memory dump file is itself resource-consuming. The second is that it is not feasible in some cases: for example, some critical tasks cannot be stopped, and global dumping cannot proceed when resources such as disk space, CPU time and I/O bandwidth are limited. The third is that it is often unnecessary: in many cases the developer has some judgment as to the range of memory space where the erroneous code is, and therefore only wants to dump a narrower range of the memory space. In addition, dumping only a small portion of the memory space each time helps to locate the erroneous code more quickly and accurately.
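By way of illustration only, the contrast between dumping a whole memory space and dumping a narrower range can be sketched as follows. This is a minimal sketch under stated assumptions and not the method of any system discussed herein: the memory space is modeled as an in-memory byte string, and the function names global_dump and partial_dump, as well as their parameters, are hypothetical.

```python
import os
import tempfile


def global_dump(memory: bytes, path: str) -> int:
    """Copy the entire (simulated) memory space to a file; return bytes written."""
    with open(path, "wb") as f:
        return f.write(memory)


def partial_dump(memory: bytes, start: int, length: int, path: str) -> int:
    """Copy only the range [start, start + length) to a file; return bytes written."""
    if start < 0 or length < 0 or start + length > len(memory):
        raise ValueError("requested range lies outside the memory space")
    with open(path, "wb") as f:
        return f.write(memory[start:start + length])


if __name__ == "__main__":
    # Hypothetical 1 MiB memory space; real dumps read process or system memory.
    memory = bytes(1024 * 1024)
    with tempfile.TemporaryDirectory() as tmp:
        full = global_dump(memory, os.path.join(tmp, "global.dmp"))
        part = partial_dump(memory, 4096, 8192, os.path.join(tmp, "partial.dmp"))
        print(f"global dump: {full} bytes, partial dump: {part} bytes")
```

In this sketch the cost of the dump in disk space and I/O scales with the requested range rather than with the size of the whole memory space, which is the efficiency argument made above.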
Unlike the global memory dumping described above, partial memory dumping dumps only a specified portion of a memory space, which may be very large. Several existing technical solutions support partial memory dumping, as follows.
The Digital UNIX operating system (see Digital UNIX Kernel Debugging, Part Number: AA-PS2TE-TE, March 1996 from the Digital Equipment Corporation) supports both partial and global memory dumps, both of which include the physical memory content when the whole computer system crashes. But such a dump scheme is targeted at the whole computer system, and not at an application software system.
U.S. Pat. No. 5,293,612, entitled “Selective dump method and apparatus”, discloses a method and apparatus for providing a partial dump of memory content. The method of the patent uses a page table, a TLB and another unfrozen processor to selectively dump a paged addressable memory. The dumped memory content is likewise that of the whole computer system.
US patent application No. 20060041739A1, entitled “Memory dump generation with quick reboot”, discloses a method for dumping a physical memory to a secondary storage device while decreasing the delay for rebooting the system. The memory dumping is performed when the system reboots. After the rebooting, the program images in the memory are scanned in an attempt to determine the cause of the system rebooting. If the cause of the reboot can be determined successfully, it will not be necessary to generate a dumped copy of the whole memory for fault analysis. Although the application aims to lower the overhead of global memory dumping as much as possible, its method essentially still needs to generate a global memory dump; it merely analyzes the generated global memory dump further when the system reboots, so as to determine whether it is necessary to retain the copy on the secondary storage device for further fault analysis. In addition, the memory content dumped by the method is the content of the physical memory and not of the virtual memory. On a computer running a modern operating system, the layout of the physical memory is meaningless for an application system. Analyzing the dumped content of the physical memory requires highly specialized system software knowledge, so the dump cannot be used by an average software developer. Furthermore, the method needs to reboot the whole operating system; it is not an online memory dump method, and cannot perform partial memory dumping at an arbitrary moment according to an instruction issued by a user.
“Analyzing system dumps using kcrash” in “UnixWare 7 Documentation” describes kcrash, a tool for debugging an operating system kernel. The kcrash tool can implement partial memory dumping, but the dumped content is that of the physical memory and its granularity is the physical page. Such dumping is meaningful for debugging an operating system kernel, but it is not suitable for an application programmer, because physical memory addresses are meaningless for any software system at the user level. In addition, the kcrash tool needs to run when the system is about to crash. Therefore, the technique on which kcrash is based cannot be converted and applied directly to the fault diagnosis of an application software system.
“HP-UX 11i Version 1.5 System Crash Dump White Paper” describes a tool which is approximately identical to the above-mentioned kcrash tool; it is used for fault diagnosis when an operating system crashes, and dumps pages of the physical memory. Such a method and tool cannot be used in the fault diagnosis of an application software system.
“SUPER-UX System Administrator's Guide” proposes a Dump Collection method which is also used for fault diagnosis at the time of a system level failure (i.e. an operating system crash), and can perform dumping at byte granularity according to a physical memory address range. As mentioned above, such a method and tool cannot be used for the fault diagnosis of an application software system either.
“System Dump Analyzer (SDA) Utility” in “QuickSpecs of HP OpenVMS Version 8.3 for Alpha and Integrity Servers” describes a system dump analysis tool provided by the HP OpenVMS multi-user operating system. The main function of the tool is to dump a portion of the main memory at the time of system failure. The dumped memory content can also be saved on a secondary storage medium. The tool provided by HP OpenVMS also performs dumping based on the physical memory content. As mentioned above, such dumping provides the operating system's view of the memory, and not an application software system's view of the memory. The method is still a kind of content copying based on physical addresses, and the dump information it generates is meaningless for the fault diagnosis of an application software system; the method cannot be used for dumping the memory content of an arbitrary software system, either directly or after a simple conversion.
Some JVM products support different types of memory dumping. For example, IBM JDK 1.4.2 for z/OS provides the following types of dump: JAVADUMP, SYSDUMP, CEEDUMP, and HEAPDUMP (see IBM, IBM Developer Kit and Runtime Environment, Java 2 Technology Edition, Version 1.4.2, Diagnostic Guide, 8th edition, April 2006). But all these types of dumping are performed over the whole virtual machine process, and dumping only a portion of the memory content therein is not possible, so they are of little help in debugging application software running on the virtual machine.
In addition, most software systems can perform memory dumping while they are still alive (i.e., online dumping) by having a specialized and/or built-in dumping agent or by using special hardware. However, without such support, dumping the memory content is very difficult, if not impossible. That is, such a dump method is specific to a software system and is not general.
It can be seen that in the art, there is a need for a general and application-oriented partial memory dump method and apparatus.