In computer systems, a pipeline is a set of one or more coupled pipeline subsystems that process and/or analyze data. Each pipeline subsystem consists of computer programs or dedicated computers that receive data from a source, process or transform the data, and forward the data to another program, computer, or system. Such pipelines can be particularly fragile, as any issue encountered at one pipeline subsystem, such as misformatted data, incomplete data, or limited computing resources, can cause the other pipeline subsystems to fail or suffer from degraded performance. Therefore, some form of monitoring would be useful, but existing systems have not provided effective solutions.
Traditional monitoring solutions focus on one of two aspects of monitoring. Event monitoring focuses on historical event data for the pipeline, such as the data found in system log files or job success results. Current status monitoring focuses on a current snapshot of the pipeline, such as the amount of existing disk space, the amount of data flowing through the pipeline, or the amount of available computer processing power. Separating event monitoring from current status monitoring can make it difficult to see and understand the entire context of the health of a pipeline.
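For purposes of illustrating the distinction drawn above, the following minimal Python sketch models the two kinds of monitoring data and shows what a single view joining them might look like. It is not taken from any described embodiment. The names `Event`, `StatusSnapshot`, and `health_summary`, the field names, and the one-hour window are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    """Historical record, e.g. parsed from a log file or a job result (hypothetical shape)."""
    timestamp: datetime
    subsystem: str
    succeeded: bool
    message: str


@dataclass
class StatusSnapshot:
    """Point-in-time measurements for one subsystem (hypothetical shape)."""
    subsystem: str
    disk_bytes: int
    records_per_second: float
    cpu_idle_fraction: float


def health_summary(events, snapshot, now, window=timedelta(hours=1)):
    """Join recent events with the current snapshot for one subsystem."""
    # Keep only events for this subsystem that fall inside the look-back window.
    recent = [
        e for e in events
        if e.subsystem == snapshot.subsystem
        and timedelta(0) <= now - e.timestamp <= window
    ]
    failures = [e for e in recent if not e.succeeded]
    # Report the most recent failure by timestamp, if there is one.
    latest = max(failures, key=lambda e: e.timestamp) if failures else None
    return {
        "subsystem": snapshot.subsystem,
        "recent_events": len(recent),
        "recent_failures": len(failures),
        "last_failure": latest.message if latest else None,
        "disk_bytes": snapshot.disk_bytes,
        "records_per_second": snapshot.records_per_second,
        "cpu_idle_fraction": snapshot.cpu_idle_fraction,
    }
```

In this sketch, a caller would collect `Event` records from whatever historical source is available, take one `StatusSnapshot` per subsystem, and read both kinds of information from the single dictionary returned by `health_summary`, rather than consulting two separate monitoring tools.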
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
While each of the figures illustrates a particular embodiment for purposes of illustrating a clear example, other embodiments may omit, add to, reorder, and/or modify any of the elements shown in the figures.