1. Field of the Invention
Embodiments of the invention relate generally to methods and systems for detecting and repairing corrupted critical data structures without an operational interruption, and in particular to methods and systems for detecting and repairing corrupted critical data structures of Input/Output queues without operational interruption of an Input/Output module.
2. Description of Related Art
Data corruption in critical data structures, such as input/output queues, results in system hangs and outages. In many instances, corrupted internal data structures result in total program failure. Prior systems used redundant hardware components to provide additional resiliency against such system failure. However, in some low-end servers, redundant hardware components are cost-prohibitive. Furthermore, in some server configurations, even a main hardware component has been eliminated entirely and replaced with a software component that emulates it. An example of a software component emulating a hardware component is the Input Output Unit (IOU), a component of the IO Module or the Resource Management Module architecture used in certain mainframe servers.
In systems without redundant hardware components, system hangs and outages resulting from data corruption are unavoidable unless a technique exists that automatically detects corruption of critical data structures and repairs the corrupted structures without interrupting the operation of the module in which they are used. Thus, it is desirable to have such a technique.