This invention relates to a method and apparatus for debugging faults occurring in a router or other network device, and more particularly to compressing a core file and storing the compressed core file in an internal flash memory.
Network servers and other types of network devices often experience unrecoverable faults. One example of an unrecoverable fault occurs when a routine writes an invalid address value into core memory. When a process later tries to access that invalid address, a fault occurs. For example, a process may request a memory address for a status register used for conducting a direct memory access (DMA) operation. If the memory address is invalid, a fatal error occurs when the process attempts to access it, which causes the router to reset.
Viewing core files is vital to resolving fatal fault errors. A core file is essentially a copy of DRAM, which contains the program, program pointers, program variables, etc. The core file provides a snapshot of the router at the time the fault occurred. DRAM is used to meet the performance requirements of the system. Because the contents of the DRAM are destroyed by a reset operation, the core file must be downloaded to another storage device. Routers can be equipped with some flash memory. However, due to the cost of flash memory, the flash memory is not large enough to hold the entire contents of the DRAM. Thus, the core file must be downloaded to an external server connected to the router through a local area network (LAN). The core file can then be analyzed by an engineer from a computer or workstation to identify the source of the fault.
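The size constraint described above can be illustrated with a minimal sketch. It assumes a simulated memory image and hypothetical DRAM and flash sizes (none of these values, nor the helper names `build_simulated_dram` and `fits_in_flash`, come from the source), and it uses Python's standard `zlib` module purely as a stand-in compressor. It shows only that an uncompressed image can exceed a smaller flash capacity while a compressed copy of the same image may fit.

```python
import zlib

# Hypothetical sizes, chosen only for illustration (not taken from the source).
DRAM_BYTES = 4 * 1024 * 1024    # simulated main memory image
FLASH_BYTES = 1 * 1024 * 1024   # simulated internal flash capacity


def build_simulated_dram(size: int) -> bytes:
    """Return a fake memory image: a small patterned region, then zeroed memory."""
    program = bytes(range(256)) * 64           # 16 KiB repeating byte pattern
    return program + bytes(size - len(program))


def fits_in_flash(image: bytes, flash_capacity: int) -> bool:
    """Report whether an image is no larger than the flash capacity."""
    return len(image) <= flash_capacity


dram = build_simulated_dram(DRAM_BYTES)
compressed = zlib.compress(dram, 9)            # 9 = maximum compression level

print("raw image fits:       ", fits_in_flash(dram, FLASH_BYTES))
print("compressed image fits:", fits_in_flash(compressed, FLASH_BYTES))
print("round trip intact:    ", zlib.decompress(compressed) == dram)
```

The simulated image is mostly zeros and therefore compresses far better than real memory contents would; the achievable ratio on an actual device depends on what the DRAM holds at the time of the fault.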
The problem with copying a core file to an external device is that the fault condition causing the router to shut down may be caused by a process that must be operational in order to download the core file. For example, the fault may be caused by a software error in a network protocol or in LAN media drivers. If these network interface processes are not operational, the core file cannot be successfully downloaded to an external network device. Thus, in the past, a special image had to be created in order to investigate the fault. The special image is produced by modifying the operating code to print out specific identified information before the fault occurs. Generating special images to locate faults requires a large amount of trial and error, which is extremely time-consuming. Alternatively, the router is taken out of production so that the current contents of the main memory can be analyzed with a ROM monitor.
Accordingly, a need remains for a faster, more reliable way to save a core file after a fault condition occurs in a network device.