Mass storage subsystems provide increasing storage capacity to meet user demands from host computer system applications. Various storage device configurations are known and used to meet the demand for higher storage capacity whilst maintaining or enhancing the reliability of the mass storage subsystem.
One of the storage configurations that meets demands for increased capacity and reliability is the use of multiple smaller storage modules which are configured to permit redundancy of stored data to ensure data integrity in case of failures. In such redundant subsystems, recovery from many types of failure can be automated within the storage subsystem itself due to the use of data redundancy. An example of such a redundant subsystem is a redundant array of inexpensive disks (RAID).
Storage subsystems have one or more storage controllers that manage the storage and provide upstream interfaces for I/O to the storage. The storage controller manages an array of storage devices for one or more host systems. The storage controller makes the array of storage devices appear to a host system to be a single, high capacity storage device.
A hardware card that plugs into a data communications bus and provides the function of a storage controller is referred to as an adapter card.
Trace tables are data buffers that are filled, as a program executes, with time-stamped status information about what the program is doing. In the embedded firmware development environment found on storage adapter cards and storage controller systems (both referred to hereinafter generally as storage controllers), trace tables are a key instrument for problem determination during development and for in-field support. In these situations other methods of recording ongoing status information, for example printed output to a terminal display, are not available for performance reasons.
Presently, trace tables in storage controller systems are stored in non-persistent memory. In storage controller embedded firmware environments, memory is very limited and is primarily reserved for I/O data path usage.
The use of this memory for the trace tables is limited by the cost of memory and its physical size. This limits the amount of memory usable for the trace table data. This trace table size limitation leads to the trace tables wrapping, sometimes within seconds depending on the system and the load conditions. This can cause valuable trace information to be lost as a problem develops over a given period. The loss of such trace information slows or even prevents problem determination.
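The wrapping behaviour described above can be illustrated with a minimal sketch of a fixed-size trace table. This is not taken from any particular storage controller firmware: the names TraceEntry, TraceTable, record, snapshot, lost_count and seconds_until_wrap are hypothetical, Python is used purely for readability, and real firmware would typically use a statically allocated buffer in a lower-level language.

```python
import time
from collections import namedtuple

# Hypothetical entry layout: a timestamp plus status information.
TraceEntry = namedtuple("TraceEntry", ["timestamp", "event_id", "data"])


class TraceTable:
    """Fixed-capacity trace table that overwrites its oldest entries when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.entries = [None] * capacity
        self.next_index = 0      # slot the next entry will be written to
        self.total_written = 0   # entries recorded since the table was created

    def record(self, event_id, data, timestamp=None):
        """Store one time-stamped entry, wrapping to slot 0 when the end is reached."""
        if timestamp is None:
            timestamp = time.monotonic()
        self.entries[self.next_index] = TraceEntry(timestamp, event_id, data)
        self.next_index = (self.next_index + 1) % self.capacity
        self.total_written += 1

    def lost_count(self):
        """Number of entries that have been overwritten by wrapping."""
        return max(0, self.total_written - self.capacity)

    def snapshot(self):
        """Return the surviving entries, oldest first."""
        if self.total_written < self.capacity:
            return self.entries[:self.next_index]
        return self.entries[self.next_index:] + self.entries[:self.next_index]


def seconds_until_wrap(table_bytes, entry_bytes, entries_per_second):
    """Estimate how long a table lasts before the first entry is overwritten."""
    return (table_bytes // entry_bytes) / entries_per_second
```

With a capacity of 4 and six recorded events numbered 0 to 5, snapshot() returns only events 2 to 5 and lost_count() returns 2: the two earliest events are gone. As a purely hypothetical sizing example, a 1 MiB table of 32-byte entries filled at 100,000 entries per second would wrap in roughly a third of a second, which is consistent with tables wrapping within seconds under load.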
A usual solution to the problem of analysis being frustrated by key information in the trace table being overwritten by more recent entries is to devise alternative ways to stop the code earlier. This has the disadvantage that it is time consuming and requires the problem to be recreated, which is not always possible.
The trace table data can be transferred or ‘dumped’ to disk during a controlled system shutdown for problem determination, but this is not always possible due to memory restrictions. In such conditions, since the trace table data is stored in non-persistent memory, the trace table information is lost. The trace table information is also not continuous through a reset, which is potentially the point at which problem determination is required. This can again cause valuable trace information to be lost.
It is an aim of the present invention to provide a method and apparatus for recording trace table information in storage systems which allows large amounts of trace data to be recorded, providing an extended trace history. An extended trace history is required to determine problems that develop gradually over a period of time.