The following description relates to the management of tasks in a data processing environment.
A fault in a data processing environment can result in a range of problems of differing severity. For example, a fault in a processor can cause the processor either to cease execution or to continue executing while yielding incorrect results. In some circumstances, a fault can cause one or more nodes of a data processing system to fail, thereby crippling or incapacitating the system.
The fault tolerance of a data processing environment is the ability of the environment to withstand a fault. For example, a fault tolerant data processing system can, under certain circumstances, prevent itself from crashing, or a data processing landscape can continue to process data even when a node malfunctions or is incapacitated. Fault tolerance may also include the ability of a data processing system to behave in a well-defined manner when a fault occurs. Fault tolerance can be achieved, e.g., by masking a faulty component or by performing responsive or corrective measures upon detection of a fault.
A data processing environment can provide fault tolerance using different systems and techniques. For example, fault tolerance can be implemented using redundant elements, recovery based on failure semantics, and group failure masking. Redundancy is the duplication of system elements (e.g., system data, system hardware components, and/or system data processing activities) to prevent failure of the overall data processing environment upon failure of any single element. Recovery based on failure semantics includes recognizing a failure of a system element based on a description of the failure behavior of that system element. Group failure masking, another fault tolerance technique, includes masking a failure using a group of nodes. For example, multiple instances of a particular data processing server can run on different nodes of a data processing environment. If one node becomes unreachable due to a failure or even a delay in the transfer of data within the data processing environment, a second node that runs a second instance of that data processing server can provide the service.
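The group failure masking described above can be illustrated with a minimal sketch. It assumes that each server instance is modeled as a callable and that an unreachable node (whether failed or merely too slow) is signaled by an exception. The names `NodeUnreachable`, `make_instance`, and `call_with_masking` are hypothetical and chosen only for illustration; they are not part of any particular system.

```python
from typing import Callable, List, Sequence


class NodeUnreachable(Exception):
    """Hypothetical signal that a node failed or did not answer in time."""


def make_instance(name: str, healthy: bool = True) -> Callable[[str], str]:
    """Build a stand-in for one server instance running on one node."""
    def handle(request: str) -> str:
        if not healthy:
            raise NodeUnreachable(f"{name} is unreachable")
        return f"{name} handled {request}"
    return handle


def call_with_masking(instances: Sequence[Callable[[str], str]],
                      request: str) -> str:
    """Try each instance in order; the caller only sees a failure
    if every instance in the group is unreachable."""
    errors: List[str] = []
    for instance in instances:
        try:
            return instance(request)
        except NodeUnreachable as exc:
            # Mask this node's failure and fall through to the next instance.
            errors.append(str(exc))
    raise NodeUnreachable("all instances failed: " + "; ".join(errors))


if __name__ == "__main__":
    group = [make_instance("node-a", healthy=False), make_instance("node-b")]
    print(call_with_masking(group, "job-1"))
```

In this sketch the failure of `node-a` is hidden from the caller because `node-b` runs a second instance of the same service. A real environment would also need a way to detect unreachable nodes (for example, a timeout), which is outside the scope of this illustration.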