The free availability of storage space, the existence of large data networks, and the multitude of sensing devices such as cameras have helped to spawn the phenomenon of big data analysis. Under this paradigm, people, devices, companies, governments, and the like tend to collect data for purposes such as surveillance, usage-pattern analysis, and mapping, and in the process collect as much extraneous data as possible, regardless of whether the extraneous data is needed for the particular purpose. For example, in a typical credit card transaction, the cardholder's name, address, credit card number, and security PIN are all used to verify the identity of the cardholder for purchase authorization. However, the back-end processing system may collect other extraneous data in bulk, such as the location where the transaction is made, the IP address of the purchase, the network provider, etc. After the data is collected in bulk and the relationships among those data are recorded as metadata, data mining applications are often used to process these data and/or metadata to answer specific questions for technical or business reasons.
A concern for many consumers is that copies of these data are re-combined, re-packaged, and/or re-sold to other dealers of data whose interests are not aligned with the consumers' interests. Privacy concerns for the original consumer arise when these data are replicated across the vast Internet and its datacenters and become immortalized in the computing cloud. Because of the redundancy of the copies, these data and metadata are very difficult to protect, delete, and secure via enforcement of data access constraints.