Shared large data sources, e.g., those hosted on a cloud, can be subject to various operations applied repeatedly for different purposes. Sometimes, when both time and storage costs are considered, regenerating the results of certain operations on large data is more costly than storing those results for later retrieval. Existing systems can support one-off runs of analytics on data, but in such systems opportunities for data reuse must be identified manually by a user, and data reuse between multiple users is difficult to accomplish.