Large organizations often use multiple big data systems that synthesize and process data and then distribute that data to one another. When a problem occurs with a particular set of data, the affected data may be propagated throughout the organization before the error is identified and/or before its existence can be communicated to system users. This can affect not only the use of the data itself but also the use of data derived from it. Even if the users of the originating data production system become aware of the problem, they may not know whom to notify because they do not know which downstream systems have accessed the data.