Many data sources around us produce high-volume streams that carry information critical to specific applications. One example is video surveillance, where a system ingests many video feeds to detect potential security breaches. Another is continuous health monitoring, where sensors surrounding a patient emit data into a stream processing infrastructure that analyzes it to identify medically significant events and report them to medical professionals.
In most of these applications, it is important to track the provenance of every event generated by the system. Provenance here means the origins of, and the justification for, the events the system generates. For instance, if a medical system suggests, based on its analysis, that a patient requires a change in drug dosage, the provenance of that alert would tell medical professionals which procedure produced it and which data points were used to generate it.
Typically, these provenance reports are obtained manually, from data that developers specify when designing their analysis.
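To make the manual approach concrete, the following is a minimal sketch of developer-specified provenance, not a description of any particular system. All names here (`Reading`, `Alert`, `check_heart_rate`, `provenance_report`) and the threshold of 120 are hypothetical, chosen only for illustration. The point it shows is that the developer must decide at design time which data points and which procedure name to store with each generated event.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Reading:
    """One sensor data point (hypothetical schema)."""
    sensor_id: str
    timestamp: float
    value: float


@dataclass
class Alert:
    """An output event together with hand-recorded provenance."""
    message: str
    procedure: str  # name of the analysis that produced the alert
    inputs: List[Reading] = field(default_factory=list)  # data points used


def check_heart_rate(window: List[Reading],
                     threshold: float = 120.0) -> Optional[Alert]:
    """Return an alert if the mean of the window exceeds the threshold.

    The provenance fields are filled in explicitly by the developer;
    nothing is captured automatically.
    """
    if not window:
        return None
    mean = sum(r.value for r in window) / len(window)
    if mean > threshold:
        return Alert(
            message=f"mean heart rate {mean:.1f} exceeds {threshold:.1f}",
            procedure="check_heart_rate",
            inputs=list(window),
        )
    return None


def provenance_report(alert: Alert) -> str:
    """Format the procedure and input data points behind an alert."""
    lines = [f"alert: {alert.message}", f"procedure: {alert.procedure}"]
    for r in alert.inputs:
        lines.append(f"  {r.sensor_id} t={r.timestamp} value={r.value}")
    return "\n".join(lines)
```

In this sketch, any reading the developer forgets to pass into `inputs` is simply absent from the report, which illustrates why manually specified provenance depends on design-time diligence.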