This specification relates to data lineage analysis.
Data lineage analysis generally refers to the practice of determining, for a given piece of data, how the piece of data was created, how it was subsequently updated, and which other pieces of data may have been derived from it. For example, a data lineage analysis system may attempt to identify other pieces of data that a given software process wrote after reading a given piece of data. As another example, the data lineage analysis system may attempt to identify the software process that created a given piece of data and other software processes that subsequently wrote or read the piece of data.
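
The two example analyses above can be illustrated with a minimal sketch. It assumes that process activity is available as a time-ordered log of access events; the `Event` record, its field names, the operation labels, and both function names are hypothetical and chosen only for illustration. The sketch treats creating a piece of data as a kind of write, and it reports only direct candidates: a write that follows a read does not prove derivation, and transitive derivation is not followed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One hypothetical access-log record."""
    time: int      # logical timestamp; assumed distinct, larger means later
    process: str   # name of the software process
    op: str        # "create", "read", or "write"
    data_id: str   # identifier of the piece of data


def written_after_read(events, data_id):
    """Ids of other data that a process wrote after that process read data_id."""
    readers = set()   # processes that have read data_id so far
    candidates = set()
    for e in sorted(events, key=lambda ev: ev.time):
        if e.op == "read" and e.data_id == data_id:
            readers.add(e.process)
        elif (e.op in ("create", "write")
              and e.process in readers
              and e.data_id != data_id):
            candidates.add(e.data_id)
    return candidates


def creator_and_accessors(events, data_id):
    """The process that created data_id, and later reads/writes of it in order."""
    creator = None
    accessors = []
    for e in sorted(events, key=lambda ev: ev.time):
        if e.data_id != data_id:
            continue
        if e.op == "create" and creator is None:
            creator = e.process
        elif creator is not None and e.op in ("read", "write"):
            accessors.append((e.process, e.op))
    return creator, accessors


# Illustrative log; every process and data name here is made up.
events = [
    Event(1, "ingest", "create", "raw.csv"),
    Event(2, "clean", "read", "raw.csv"),
    Event(3, "clean", "create", "clean.csv"),
    Event(4, "report", "read", "clean.csv"),
    Event(5, "report", "create", "report.pdf"),
    Event(6, "audit", "write", "raw.csv"),
]

print(written_after_read(events, "raw.csv"))
print(creator_and_accessors(events, "raw.csv"))
```

In this log, only `clean.csv` is reported for `raw.csv`, because `report.pdf` was written by a process that read `clean.csv` rather than `raw.csv` itself. A fuller analysis would repeat the first query on each result to follow derivation chains.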