In the digital era, data has become one of the most critical components of an enterprise. As the volume of data is growing exponentially and data breaches are happening more frequently than ever before, detecting and preventing data loss and leakage has become one of the most pressing security concerns for enterprises.
It is challenging for enterprises to protect data against information leakage in the era of big data. Managing and analyzing large amounts of data provides an enormous competitive advantage, but it also puts sensitive and valuable enterprise data at risk of loss or theft and poses significant security challenges. The need to store, process, and analyze ever more data, together with the heavy use of modern communication channels, increases the number of possible data leakage vectors, including cloud file sharing, email, web pages, instant messaging, FTP (File Transfer Protocol), removable media/storage, database/file system vulnerabilities, cameras, laptop theft, lost or stolen backups, and social networks.
Data leakage detection faces the following technical challenges. (1) Scalability: the ability to process large content, e.g., megabytes to terabytes, and to be deployed in distributed environments. Scalability is the key to efficiently processing massive, enterprise-scale amounts of data. A scalable solution can also reduce the data processing delay and thereby detect data leakage earlier. (2) Privacy preservation: the ability to preserve the confidentiality of sensitive data. (3) Accuracy: achieving low false negative and false positive rates in detection. The distributed nature of big data environments makes accurate leakage detection difficult. (4) Timeliness: detecting and responding to data leakage immediately, before it causes damage. The volume, variety, and velocity of big data bring both opportunities and challenges for identifying data leakage threats in near real time.
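To make the accuracy criterion concrete, the following is a minimal sketch of how false positive and false negative rates could be computed for a detector, assuming each inspected item carries a ground-truth label (leak or not) and a detector verdict. The function name `detection_error_rates` and the sample data are hypothetical and serve only as an illustration.

```python
def detection_error_rates(labels, predictions):
    """Return (false_positive_rate, false_negative_rate).

    labels[i] is True if item i really contains leaked sensitive data;
    predictions[i] is True if the (hypothetical) detector flagged it.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    pairs = list(zip(labels, predictions))
    false_positives = sum(1 for y, p in pairs if not y and p)
    false_negatives = sum(1 for y, p in pairs if y and not p)
    negatives = sum(1 for y in labels if not y)
    positives = sum(1 for y in labels if y)

    # Guard against division by zero when one class is absent.
    fpr = false_positives / negatives if negatives else 0.0
    fnr = false_negatives / positives if positives else 0.0
    return fpr, fnr


# Illustrative data: 4 real leaks and 6 benign items.
labels = [True, True, True, True, False, False, False, False, False, False]
predictions = [True, True, True, False, True, False, False, False, False, False]

fpr, fnr = detection_error_rates(labels, predictions)
print(f"false positive rate: {fpr:.3f}, false negative rate: {fnr:.3f}")
```

A missed leak raises the false negative rate, while a benign item flagged as a leak raises the false positive rate. A practical detector has to keep both low at the same time.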