With the increasing adoption of technology, large amounts of data are generated, processed, stored, distributed, and analyzed across various platforms. The data may be generated from several sources and may comprise private information. Such sources may include data processing devices comprising servers, applications, sensors, and networks. The generated data may be used to identify and analyze patterns in the data over a period of time. A user may not be aware of the amount of private data accessed by third-party sources or applications.
Sharing the data collected from multiple sources with untrusted third-party sources might lead to misuse of the private data. Further, sharing the private data with untrusted third-party sources may lead to activation of unwanted services or applications and to a breach of privacy. Accordingly, the user may have to be notified of the risk before the private data is shared with third-party sources and applications.
Traditionally, several statistical techniques have been utilized to detect the occurrence of privacy breaches in the data. These techniques include supervised-learning-based sensitivity detection, which requires customized hardware for generating the data and is expensive. Other statistical techniques rely on rudimentary statistics, produce a high rate of false negatives, and are prone to errors.