Product testing ideally captures all hardware and/or software errors so that they can be corrected prior to product release. However, certain errors, such as memory corruption, memory leaks, and resource leaks, are difficult to catch during product testing. Other types of errors may manifest only after prolonged use. When such errors occur, it is imperative to capture the memory image so that debugging can be performed. The act of capturing the memory image is called a core dump (i.e., a dump of the memory core). A captured memory image may be examined using standard debugging tools.
In communication equipment with distributed processing ability, there may be multiple service modules, each with its own processor and memory. One of these modules normally controls the equipment and is called a control module. For example, the MGX 8850 switch manufactured by Cisco Systems of San Jose, Calif., can hold up to 12 ATM Switching Modules (AXSM) and two Processor Switching Modules (PXM) on multiple shelves. The Processor Switching Module (PXM) is referred to as a core card, core node or control module. Within a shelf, there is one control module and several service modules. Each module generally comes in a card set which consists of a front card (with its attached daughter card) and one or two back cards (or line modules). The front card contains the processing intelligence and, on the daughter card, the firmware that distinguishes the interface (e.g., OC-48, OC-3, T3, E3, and so on). The service modules interact with each other using a shared bus (e.g., a cell bus). Typically, only the control module has persistent storage (e.g., a hard disk).
In normal operation, the service module is controlled by execution of a run-time program. The run-time program may be a communication program that is loaded or mapped into an area of the memory of the service module. The run-time program may use another area of the memory as a data area. Typically, when an error occurs in the service module, an error code is written into an error log in a reserved memory area and the service module is reset. The error log is later examined to determine the cause of the error. Resetting the service module may mean reloading the same run-time program into the memory and overwriting the data area with new data, thus making the previous data unavailable for debugging or error analysis. Current error analysis therefore depends on the value of the error code written into the reserved memory.
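By way of illustration only, the conventional reset behavior described above may be sketched as a small Python model. The class name ServiceModule, the region sizes, the 4-byte big-endian error-code encoding, and all method names are hypothetical and are chosen solely for this sketch; they do not describe the firmware of any actual module. The sketch assumes three contiguous memory regions (program area, data area, reserved error log) and shows that, after a reset, the error code survives while the prior contents of the data area do not.

```python
from dataclasses import dataclass, field

# Hypothetical region sizes in bytes; a real module's memory is far larger.
PROGRAM_SIZE = 8
DATA_SIZE = 8
LOG_SIZE = 4

DATA_START = PROGRAM_SIZE
LOG_START = PROGRAM_SIZE + DATA_SIZE


@dataclass
class ServiceModule:
    """Toy model of one service module's memory (illustrative only)."""

    program_image: bytes
    memory: bytearray = field(init=False)

    def __post_init__(self) -> None:
        if len(self.program_image) != PROGRAM_SIZE:
            raise ValueError("program image must fill the program area")
        self.memory = bytearray(PROGRAM_SIZE + DATA_SIZE + LOG_SIZE)
        self.reset()

    def reset(self) -> None:
        # Reload the same run-time program and overwrite the data area.
        self.memory[0:DATA_START] = self.program_image
        self.memory[DATA_START:LOG_START] = bytes(DATA_SIZE)
        # The reserved error-log area is deliberately left untouched.

    def write_data(self, payload: bytes) -> None:
        if len(payload) != DATA_SIZE:
            raise ValueError("payload must fill the data area")
        self.memory[DATA_START:LOG_START] = payload

    def data_area(self) -> bytes:
        return bytes(self.memory[DATA_START:LOG_START])

    def error_code(self) -> int:
        return int.from_bytes(self.memory[LOG_START:], "big")

    def handle_error(self, code: int) -> None:
        # Conventional handling: record an error code, then reset.
        self.memory[LOG_START:] = code.to_bytes(LOG_SIZE, "big")
        self.reset()


if __name__ == "__main__":
    module = ServiceModule(b"RUNTIME!")
    module.write_data(b"\x01\x02\x03\x04\x05\x06\x07\x08")
    module.handle_error(0x2A)
    # Only the error code remains; the data that led to the error is gone.
    print(module.error_code(), module.data_area())
```

In this model, the only information available after handle_error returns is the integer stored in the reserved log, which mirrors the limitation noted above: the state of the data area at the time of the error cannot be recovered once the reset has run.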
From the above, it can be seen that there is a need for a technique that, when an error occurs, captures the memory image of the service module before the data area of the memory is overwritten with new data.