1. Field
Embodiments of the present invention generally relate to the field of Internet communication. In particular, various embodiments relate to a method and system for data leak protection in upper layer protocols.
2. Description of the Related Art
The digitization of information stored in an organization, such as an enterprise, has increased over the years. In addition, the distribution of content via networks has also begun to grow through information infrastructures such as the Internet. The Internet speeds the communication process; however, it also makes it much easier to intentionally or accidentally send corporate/personal confidential documents and/or sensitive information to an unauthorized receiver. To prevent data leaks, a firewall may be deployed at a border of a private network. Multiple sensors may be configured by a network administrator of the network for defining formats of sensitive information, including, but not limited to, credit card numbers, social security numbers (SSNs), IP addresses and user names/passwords. The firewall may intercept a file or a message that is sent out of the private network and then detect whether the file or message contains any text that matches the formats defined by the sensors. If the file or message contains any sensitive information, the firewall may take an action defined in the sensors, such as blocking the message/file from transmission to its destination.
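The sensor-based matching described above can be illustrated with a minimal sketch. The `Sensor` class, the `SENSORS` list, the `inspect` function and the simplified patterns are all hypothetical names and values chosen for illustration; they are not taken from any particular firewall product, and production patterns would typically add further validation (for example, a checksum on card numbers).

```python
import re
from dataclasses import dataclass


@dataclass
class Sensor:
    """Hypothetical sensor: a named format plus the action to take on a match."""
    name: str
    pattern: "re.Pattern"
    action: str  # e.g. "block" or "log"


# Simplified, assumed patterns for illustration only.
SENSORS = [
    # 16 digits, optionally separated by single spaces or hyphens.
    Sensor("credit_card", re.compile(r"\b(?:\d[ -]?){15}\d\b"), "block"),
    # SSN written as 3-2-4 digit groups separated by hyphens.
    Sensor("ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "block"),
]


def inspect(text):
    """Return (sensor name, action) for every sensor whose pattern matches."""
    return [(s.name, s.action) for s in SENSORS if s.pattern.search(text)]
```

In this sketch, a firewall would call `inspect` on the text extracted from an outbound file or message and apply the returned action, if any.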
A firewall may check files or messages that are transferred via some email transfer protocols or file transfer protocols, such as post office protocol (POP), simple mail transfer protocol (SMTP), Internet message access protocol (IMAP), file transfer protocol (FTP) and the like. However, when malicious software (malware), e.g., viruses, spyware, worms, trojans, rootkits, keyloggers and the like, has stolen sensitive information, e.g., credit card numbers, from a Point of Sale (POS) terminal/server, for example, such malware could send the data to a hacker by using a request, command or method of an upper layer protocol that is not intended to be used to transfer messages and/or files. For example, suppose malware has stolen a credit card number 8888 8888 8888 0001. The malware may send a crafted domain name system (DNS) query with “8888888888880001.com” encoded within the QNAME field of the DNS question portion of the DNS packet to a compromised DNS server. The compromised DNS server may then parse the credit card number from the DNS query and send it to the hacker.
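To make the DNS example concrete, the following sketch builds a minimal DNS query, decodes its QNAME field and tests the decoded name for a 16-digit run. It assumes the standard DNS wire format (a 12-byte header followed by length-prefixed labels) and handles only uncompressed names; `build_query`, `extract_qname` and `qname_leaks_card` are hypothetical helper names, and the 16-digit pattern is a simplification.

```python
import re
import struct

# Simplified, assumed pattern: exactly 16 consecutive digits.
CARD_RE = re.compile(r"(?<!\d)\d{16}(?!\d)")


def build_query(name):
    """Build a minimal DNS query (header plus one question) for an A record."""
    # ID, flags (recursion desired), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def extract_qname(packet):
    """Decode the QNAME of the first question (uncompressed labels only)."""
    pos = 12  # skip the fixed-size DNS header
    labels = []
    while True:
        length = packet[pos]
        if length == 0:
            break
        if length & 0xC0:
            raise ValueError("compressed or invalid label")
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return ".".join(labels)


def qname_leaks_card(packet):
    """Return True if the query name contains a 16-digit run."""
    return CARD_RE.search(extract_qname(packet)) is not None
```

Applied to the query from the example, `qname_leaks_card(build_query("8888888888880001.com"))` returns `True`, whereas an ordinary name such as `www.example.com` does not match.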
DNS uses User Datagram Protocol (UDP) or Transmission Control Protocol (TCP) as the transport protocol to serve requests from clients and issue replies. Because existing Data Leak Prevention (DLP) engines usually check only messages or files that are transferred out of a private network via specific protocols (e.g., message or file transfer protocols), the exemplary DNS query presented above, containing encoded information regarding a credit card number, will not trigger traditional DLP checking. Another way that malware may transfer the same credit card number without triggering a DLP check is by sending a hypertext transfer protocol (HTTP) GET request (HTTP://www.hacker.com/8888888888880001) to a compromised web server. The logs of the web server would then contain a record evidencing a request to access that Uniform Resource Locator (URL) even though the requested resource does not exist. This allows hackers to bypass corporate DLP systems and gather sensitive information. A further way to bypass prior art DLP systems is by using an authentication process of an upper layer protocol. Malware may send an authentication request of an upper layer protocol to a compromised server and use the credit card number as a user name or password. The server on the other end may then log the credit card numbers that were used as user names or passwords.
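The HTTP and authentication examples can be sketched in the same way. The code below pulls the request target and any HTTP Basic credentials out of a raw request and checks them for a 16-digit run. It assumes a well-formed HTTP/1.1 request with CRLF line endings and uses HTTP Basic authentication only as one illustrative authentication scheme; `fields_to_inspect` and `request_leaks_card` are hypothetical names.

```python
import base64
import re

# Simplified, assumed pattern: exactly 16 consecutive digits.
CARD_RE = re.compile(r"(?<!\d)\d{16}(?!\d)")


def fields_to_inspect(raw_request):
    """Collect the request target and decoded Basic credentials, if present."""
    lines = raw_request.split("\r\n")
    fields = []
    parts = lines[0].split(" ")
    if len(parts) >= 2:
        fields.append(parts[1])  # request target, e.g. "/8888888888880001"
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        if name.strip().lower() == "authorization":
            scheme, _, token = value.strip().partition(" ")
            if scheme.lower() == "basic":
                try:
                    decoded = base64.b64decode(token.strip())
                    fields.append(decoded.decode("utf-8", "replace"))
                except ValueError:
                    pass  # malformed base64; nothing to inspect
    return fields


def request_leaks_card(raw_request):
    """Return True if any inspected field contains a 16-digit run."""
    return any(CARD_RE.search(field) for field in fields_to_inspect(raw_request))
```

For instance, a request beginning with `GET /8888888888880001 HTTP/1.1` is flagged by `request_leaks_card`, as is a request whose `Authorization: Basic` header decodes to a user name consisting of the card number.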
In general, malware may send sensitive information to compromised servers through requests, commands and/or methods of upper layer protocols that are not intended to be used to transfer messages and/or files. Because such requests, commands and/or methods are usually used for setting up a connection with a server before a session is actually created, or for carrying out operations on the server side, for example, they will not trigger traditional DLP checking and therefore represent a risk for leaking sensitive information.