1. Technical Field
The present invention relates generally to an improved data processing system and in particular to a method and apparatus for managing data in a configurable data processing system. Still more particularly, the present invention provides a method and apparatus for reducing the amount of data collected for analyzing errors in a configurable data processing system.
2. Description of Related Art
A logical partitioning (LPAR) option within a data processing system (platform) allows multiple copies of a single operating system (OS) or multiple heterogeneous operating systems to run simultaneously on a single data processing system platform. A partition, within which an operating system image runs, is assigned a non-overlapping subset of the platform's resources. These platform-allocable resources include one or more architecturally distinct processors with their interrupt management areas, regions of system memory, and I/O adapter bus slots. The partition's resources are represented to the OS image by the partition's own open firmware device tree.
Each distinct OS or image of an OS running within the platform is protected from the others such that software errors on one logical partition cannot affect the correct operation of any of the other partitions. This protection is provided by allocating a disjoint set of platform resources to be directly managed by each OS image and by providing mechanisms for ensuring that the various images cannot control any resources that have not been allocated to them. Furthermore, software errors in the control of an OS's allocated resources are prevented from affecting the resources of any other image. Thus, each image of the OS (or each different OS) directly controls a distinct set of allocable resources within the platform.
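By way of illustration only, the disjoint allocation described above may be sketched as follows. The function name `allocate`, the partition names, and the resource labels are hypothetical and are not taken from any particular platform; the sketch merely assumes that each allocable resource can be named by a string and that a request for a resource already owned by another partition is rejected.

```python
from typing import Dict, Set


def allocate(allocations: Dict[str, Set[str]], partition: str,
             resources: Set[str]) -> None:
    """Assign resources to a partition, keeping all partitions disjoint.

    Hypothetical helper: raises ValueError if any requested resource is
    already allocated to a different partition.
    """
    for owner, owned in allocations.items():
        overlap = owned & resources
        if owner != partition and overlap:
            raise ValueError(
                f"{sorted(overlap)} already allocated to {owner}")
    # No conflict was found, so record the new resources for this partition.
    allocations.setdefault(partition, set()).update(resources)
```

In this sketch, a request that would give two partitions control of the same processor, memory region, or I/O slot fails before any state is changed, which mirrors the requirement that no image can control a resource that has not been allocated to it.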
These partitions each have one or more processors associated with them. When an error, such as a system checkstop, occurs, a common service processor (CSP) function is employed to perform what is called a scan dump routine. When invoked, this routine collects data such as all possible scan rings, array data, and trace arrays. This data is stored in a nonvolatile random access memory (NVRAM) for later analysis. As systems become more complex, more data is needed for error analysis. As a result, room in the NVRAM is used up quickly. To gain more space, persistent storage, such as a hard drive, may be employed. Sometimes, even that space is insufficient. In addition, the amount of time needed to collect the information increases.
Therefore, it would be advantageous to have an improved method, apparatus, and computer implemented instructions for collecting data used in error analysis.
The present invention solves these problems by providing a method, apparatus, and computer implemented instructions for processing an error in a multiprocessor data processing system. An error is detected within the data processing system. A chip, causing the error, is identified within a plurality of chips to form an identified chip. Data is collected from the identified chip and hardware associated with the identified chip.
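By way of illustration only, the steps summarized above (detecting an error, identifying the chip causing it, and collecting data only from that chip and its associated hardware) may be sketched as follows. All names in the sketch are hypothetical: the `Chip` class, the `error_latched` flag standing in for whatever fault indication the hardware provides, and the `associated` list standing in for the hardware associated with a chip are assumptions made for the example and do not describe an actual service processor interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Chip:
    name: str
    scan_data: bytes = b""            # stand-in for scan rings, arrays, traces
    error_latched: bool = False       # hypothetical fault indicator
    associated: List["Chip"] = field(default_factory=list)


def identify_failing_chip(chips: List[Chip]) -> Optional[Chip]:
    """Return the first chip reporting an error, or None if none does."""
    for chip in chips:
        if chip.error_latched:
            return chip
    return None


def collect_targeted_dump(chips: List[Chip]) -> Dict[str, bytes]:
    """Collect data only from the identified chip and its associated hardware."""
    failing = identify_failing_chip(chips)
    if failing is None:
        return {}
    targets = [failing] + failing.associated
    return {chip.name: chip.scan_data for chip in targets}
```

In this sketch, chips that neither report the error nor are associated with the identified chip contribute nothing to the dump, so the amount of data stored scales with the affected hardware rather than with the size of the whole system.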