The background description provided here is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.
Data processing systems may sometimes encounter varied, changing, and/or dirty data, and data errors may occur when a data processing system encounters such data. The data processing system may respond to the data errors in different ways. For example, the data processing system may continue processing when the data errors occur so that a single dirty data record does not stop the processing of the rest of the data. Alternatively, the data processing system may implement a fail-fast policy, under which the system reports any condition that is likely to indicate a failure and stops normal operation instead of attempting to continue a possibly flawed process.
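By way of illustration only, the two responses described above may be sketched as follows. This is a minimal sketch under stated assumptions and not a description of any particular data processing system: the names `DataError`, `RunResult`, `parse_amount`, and `run`, and the `fail_fast` parameter, are hypothetical and are introduced here solely for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple


class DataError(Exception):
    """Hypothetical error type raised when a record cannot be processed."""


@dataclass
class RunResult:
    """Outputs of a run plus the errors collected along the way."""
    outputs: List[int] = field(default_factory=list)
    # Each entry is (record index, raw record, error message).
    errors: List[Tuple[int, str, str]] = field(default_factory=list)


def parse_amount(record: str) -> int:
    """Hypothetical processing step: interpret a record as an integer."""
    try:
        return int(record)
    except ValueError as exc:
        raise DataError(f"not an integer: {record!r}") from exc


def run(records: Iterable[str],
        step: Callable[[str], int],
        fail_fast: bool = False) -> RunResult:
    """Apply `step` to each record under one of two assumed policies.

    fail_fast=False: record the error and continue with the remaining data.
    fail_fast=True:  re-raise on the first error and stop processing.
    """
    result = RunResult()
    for index, record in enumerate(records):
        try:
            result.outputs.append(step(record))
        except DataError as exc:
            if fail_fast:
                raise
            result.errors.append((index, record, str(exc)))
    return result
```

With `fail_fast=False`, calling `run(["1", "x", "3"], parse_amount)` processes the first and third records and collects one error for the dirty record `"x"`. With `fail_fast=True`, the same call raises `DataError` at the dirty record and the third record is never processed.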
Such data errors may be difficult to understand and debug during trial runs of data processing systems in the design phase. Such data errors may also be difficult to understand and debug when data processing systems running in a distributed environment process large datasets. Often these data errors are reported via text in log files. The persons responsible for developing or supporting the data processing systems have to analyze the text in the log files and manually correlate the errors, and the input data that caused the errors, to the processing steps in the data processing program. Correlating errors in this manner can be laborious, cumbersome, and time-consuming, which can make debugging and understanding the errors difficult and inefficient.