1. Field of the Invention
This invention generally relates to methods, systems and computer program products for detecting at least one of security threats and undesirable computer files.
2. Background Art
The following references may be cited herein:
[Ahsan00] K. Ahsan. Covert Channel Analysis and Data Hiding in TCP/IP. Master's Thesis, University of Toronto, 2000.
[Ahsan02] K. Ahsan and D. Kundur. Practical Data Hiding in TCP/IP. Proceedings of the ACM Workshop on Multimedia Security, December 2002.
[Axelsson00] S. Axelsson. The Base-rate Fallacy and the Difficulty of Intrusion Detection. ACM Transactions on Information and System Security, 3(3):186-205, August 2000.
[Barford98] P. Barford, A. Bestavros, A. Bradley, and M. Crovella. Changes in Web Client Access Patterns: Characteristics and Caching Implications. BU Computer Science Technical Report, BUCS-TR-1998-023, 1998.
[Bemers96] T. Berners-Lee, R. Fielding, and H. Frystyk. Hypertext Transfer Protocol -- HTTP/1.0. Internet Engineering Task Force, May 1996. RFC 1945.
[Borders04] K. Borders and A. Prakash. Web Tap: Detecting Covert Web Traffic. Proceedings of the 11th ACM Conference on Computer and Communications Security (CCS), Washington, D.C., October 2004.
[Borders07] K. Borders, A. Prakash, and M. Zielinski. Spector: Automatically Analyzing Shell Code. Proceedings of the 23rd Annual Computer Security Applications Conference (ACSAC), Miami, Fla., December 2007.
[Brand85] Sheila L. Brand. DoD 5200.28-STD Department of Defense Trusted Computer System Evaluation Criteria (Orange Book). National Computer Security Center, December 1985.
[Brumley06] D. Brumley, J. Newsome, D. Song, H. Wang, and S. Jha. Towards Automatic Generation of Vulnerability-based Signatures. Proceedings of the 2006 IEEE Symposium on Security and Privacy, pp. 2-16, 2006.
[Brumley07] D. Brumley, J. Caballero, Z. Liang, J. Newsome, and D. Song. Towards Automatic Discovery of Deviations in Binary Implementations with Applications to Error Detection and Fingerprint Generation. Proceedings of the 16th USENIX Security Symposium, Boston, Mass., August 2007.
[Caballero07] J. Caballero, H. Yin, Z. Liang, and D. Song. Polyglot: Automatic Extraction of Protocol Message Format Using Dynamic Binary Analysis. Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS), Washington, D.C., October 2007.
[Cabuk04] S. Cabuk, C. Brodley, and C. Shields. IP Covert Timing Channels: Design and Detection. Proceedings of the 11th ACM Conference on Computer and Communications Security (CCS), Washington, D.C., October 2004.
[Cadar06] C. Cadar, V. Ganesh, P. Pawlowski, D. Dill, and D. Engler. EXE: Automatically Generating Inputs of Death. Proceedings of the 13th ACM Conference on Computer and Communications Security (CCS), 2006.
[Castro06] S. Castro. How to Cook a Covert Channel. hakin9, January 2006.
[Christodorescu05] M. Christodorescu, S. Jha, S. Seshia, D. Song, and R. Bryant. Semantics-aware Malware Detection. Proceedings of the 2005 IEEE Symposium on Security and Privacy, May 2005.
[Cid08] D. Cid. OSSEC Open Source Host-based Intrusion Detection System. April 2008.
[Dingledine04] R. Dingledine, N. Mathewson, and P. Syverson. Tor: The Second-generation Onion Router. Proceedings of the 13th USENIX Security Symposium, August 2004.
[Duska97] B. Duska, D. Marwood, and M. J. Feeley. The Measured Access Characteristics of World Wide Web Client Proxy Caches. Proceedings of USENIX Symposium on Internet Technology and Systems, December 1997.
[Dyatlov03] A. Dyatlov and S. Castro. Exploitation of Data Streams Authorized by a Network Access Control System for Arbitrary Data Transfers: Tunneling and Covert Channels Over the HTTP Protocol. June 2003.
[Dyatlov08] A. Dyatlov and S. Castro. Wsh 'Web Shell'. March 2008.
[Fisk02] G. Fisk, M. Fisk, C. Papadopoulos, and J. Neil. Eliminating Steganography in Internet Traffic with Active Wardens. Proceedings of the 5th International Workshop on Information Hiding, October 2002.
[Foster02] J. Foster, T. Terauchi, and A. Aiken. Flow-sensitive Type Qualifiers. Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Berlin, Germany, June 2002.
[Garfinkel03] T. Garfinkel, B. Pfaff, J. Chow, M. Rosenblum, and D. Boneh. Terra: a Virtual Machine-based Platform for Trusted Computing. Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles (SOSP), Bolton Landing, N.Y., October 2003.
[Giles03] J. Giles and B. Hajek. An Information-theoretic and Game-theoretic Study of Timing Channels. IEEE Transactions on Information Theory, 48:2455-2477, September 2003.
[Gligor93] V. Gligor. A Guide to Understanding Covert Channel Analysis of Trusted Systems. National Computer Security Center Technical Report, NCSC-TG-030, Ft. George G. Meade, MD, November 1993.
[Gray94] J. Gray III. Countermeasures and Tradeoffs for a Class of Covert Timing Channels. Hong Kong University of Science and Technology Technical Report, 1994.
[Heinz04] F. Heinz and J. Oster. Nstxd - IP Over DNS Tunneling Daemon. March 2005.
[Kelly02] T. Kelly. Thin-Client Web Access Patterns: Measurements From a Cache-busting Proxy. Computer Communications, 25(4):357-366, March 2002.
[Kruegel03] C. Kruegel and G. Vigna. Anomaly Detection of Web-based Attacks. Proceedings of the 10th ACM Conference on Computer and Communications Security (CCS), Washington, D.C., October 2003.
[Microsoft08] Microsoft Corporation. BitLocker Drive Encryption: Technical Overview. April 2008.
[Netwitness08] NetWitness Corporation. NetWitness - Total Network Knowledge. April 2008.
[Newsome05] J. Newsome and D. Song. Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software. Proceedings of the 12th Annual Network and Distributed System Security Symposium (NDSS), San Diego, Calif., February 2005.
[Nguyen-Tuong05] A. Nguyen-Tuong, S. Guarnieri, D. Greene, J. Shirley, and D. Evans. Automatically Hardening Web Applications Using Precise Tainting. Proceedings of the 20th IFIP International Information Security Conference, Makuhari Messe, Chiba, Japan, June 2005.
[Niksic98] H. Niksic. GNU Wget - The Noninteractive Downloading Utility. September 1998.
[NSA08] National Security Agency. Security-enhanced Linux. April 2008.
[Oberheide07] J. Oberheide, E. Cooke, and F. Jahanian. Rethinking Antivirus: Executable Analysis in the Network Cloud. Proceedings of the 2nd USENIX Workshop on Hot Topics in Security (HOTSEC '07), Boston, Mass., August 2007.
[Oscar08] OSCAR Protocol for AOL Instant Messaging. April 2008.
[Paxson98] V. Paxson. Bro: A System for Detecting Network Intruders in Real-time. Proceedings of the 7th USENIX Security Symposium, January 1998.
[Paxson00] Y. Zhang and V. Paxson. Detecting Backdoors. Proceedings of the 9th USENIX Security Symposium, August 2000.
[Proctor07] P. Proctor, R. Mogull, and E. Ouellet. Magic Quadrant for Content Monitoring and Filtering and Data Loss Prevention. Gartner RAS Core Research Note, G00147610, April 2007.
[Richardson07] R. Richardson. CSI Computer Crime and Security Survey. 2007.
[Roesch99] M. Roesch. Snort - Lightweight Intrusion Detection for Networks. Proceedings of the 13th USENIX Systems Administration Conference (LISA), Seattle, Wash., 1999.
[Roshal08] A. Roshal. WinRAR Archiver, a Powerful Tool to Process RAR and ZIP Files. April 2008.
[RSA07] RSA Security Inc. RSA Data Loss Prevention Suite - Solutions Brief. 2007.
[Sailer04] R. Sailer, X. Zhang, T. Jaeger, and L. Doorn. Design and Implementation of a TCG-based Integrity Measurement Architecture. Proceedings of the 13th USENIX Security Symposium, pp. 223-238, 2004.
[Sandvine08] Sandvine, Inc. Sandvine - Intelligent Broadband Network Management. April 2008.
[Servetto01] S. Servetto and M. Vetterli. Communication Using Phantoms: Covert Channels in the Internet. Proceedings of the IEEE International Symposium on Information Theory, June 2001.
[Vigna04] G. Vigna, W. Robertson, and D. Balzarotti. Testing Network-Based Intrusion Detection Signatures Using Mutant Exploits. Proceedings of the 11th ACM Conference on Computer and Communications Security (CCS), Washington, D.C., October 2004.
[Vontu08] Vontu, Inc. Vontu - Data Loss Prevention, Confidential Data Protection. April 2008.
[Winzip08] WinZip International LLC. WinZip - The Zip File Utility for Windows. April 2008.
[Wray91] J. Wray. An Analysis of Covert Timing Channels. Proceedings of the 1991 IEEE Symposium on Security and Privacy, Oakland, Calif., May 1991.
[Zimmerman95] P. R. Zimmermann. The Official PGP User's Guide. MIT Press, 1995.
As the size and diversity of the Internet grow, so do the applications that use the network. Originally, network applications such as web browsers, terminal clients, and e-mail readers were the only programs accessing the Internet. Now, almost every application has a networking component, whether it is to obtain updates, manage licensing, or report usage statistics.
Although pervasive network connectivity provides a number of benefits, it also introduces security risks. Many programs that access the network allow users to leak confidential information or expose them to new attack vectors. An example is instant messaging (IM) software. Most IM programs permit direct file transfers. Also, so-called IM viruses can circumvent security systems by going through the IM network itself. Peer-to-peer file sharing software presents a risk as well because files often come packaged with Trojan horse malware. These unwanted applications are not outright malicious and are therefore not detected by conventional security software, but they can still pose a serious threat to system security.
In addition to unwanted applications, many programs that directly harm their host computers communicate over the network. The resulting malware traffic may contain sensitive information, such as log-in names, passwords, and credit card numbers, which were collected from the host. This traffic may also have command and control information, such as instructions to download other malicious programs or attack other computers.
Identifying web applications that are running on a computer and differentiating them from one another is essential to improving overall network security and visibility. Furthermore, doing so with a network monitoring system introduces minimal overhead and ensures that the security system itself is isolated from attack.
As the Internet grows and network bandwidth continues to increase, administrators are faced with the task of keeping confidential information from leaving their networks. Today's link speeds and traffic volume are such that manual inspection of all network traffic would be unreasonably expensive. Some security solutions, such as intrusion prevention systems and anti-virus software, focus on protecting the integrity of computers that house sensitive information. Unfortunately, these approaches do not stop insider leaks, which are a serious security threat. In the 2007 CSI/FBI survey of computer crimes, insider abuse ranked above virus outbreaks as the most prevalent security threat, with 59% of respondents having experienced it [Richardson07].
In response to the threat of insider leaks, some vendors have provided data loss prevention (DLP) systems that inspect outgoing traffic for known confidential information [Vontu08, RSA07]. Although these systems may stop naïve adversaries from leaking data, they are fundamentally unable to detect the flow of encrypted or obfuscated information. What remains is an almost completely open pipe for leaking encrypted confidential information to the Internet.
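By way of illustration only, the limitation described above can be sketched as follows. The sketch assumes a DLP check that simply searches outgoing bytes for known confidential strings; the term list, the function name dlp_flags, and the sample payload are hypothetical and do not describe any cited product.

```python
import base64

# Hypothetical confidential terms that a DLP system might be configured to look for.
CONFIDENTIAL_TERMS = [b"Project Falcon", b"4111-1111-1111-1111"]


def dlp_flags(payload: bytes) -> bool:
    """Return True if the outgoing payload contains any known confidential term."""
    return any(term in payload for term in CONFIDENTIAL_TERMS)


# The same data sent in the clear, Base64-encoded, and XOR-obfuscated.
plain = b"POST /upload HTTP/1.0\r\n\r\nnotes on Project Falcon"
encoded = base64.b64encode(plain)
xored = bytes(b ^ 0x5A for b in plain)  # trivial single-byte XOR obfuscation

print(dlp_flags(plain))    # True: the known term appears verbatim
print(dlp_flags(encoded))  # False: the same data, merely re-encoded
print(dlp_flags(xored))    # False: the same data, trivially obfuscated
```

Even these trivial transformations defeat the content match, which suggests why encrypted traffic is out of reach for inspection of this kind.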
Traditional threat detection approaches involve directly categorizing and identifying malicious activity. Examples of this methodology include anti-virus (AV) software, intrusion detection systems (IDSs), and data loss prevention (DLP) systems. These systems rely on blacklists that specify undesirable programs and network traffic. Blacklists have a number of benefits. First, when some malicious activity matches a signature on a blacklist, an administrator immediately knows the nature of the threat and can take action. Second, many blacklists (those for IDSs and AV software) are globally applicable and require little tuning for their target environment (e.g., a known computer virus is unwanted in any network). Widespread applicability also goes hand in hand with low false-positive rates; activity that matches a blacklist is usually not of a legitimate nature. These advantages, along with the simplicity and speed of signature matching, have made blacklisting the most prevalent method for threat detection.
Despite its benefits, blacklisting suffers from fundamental limitations that prevent it from operating effectively in today's threat environment. One limitation is that a blacklist must include profiles for all unwanted activity. Malicious software (malware) is now so diverse that maintaining profiles of all malware is an insurmountable task. Research shows that even the best AV software can detect only 87% of the latest threats [Oberheide07]. Furthermore, a hacker who targets a particular network can modify his or her attack pattern, test it against the latest IDS and AV signatures, and completely avoid detection, as demonstrated in [Vigna04].
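As a minimal sketch of this limitation, consider a blacklist that stores a cryptographic hash of each known-bad sample. This is an assumed, simplified form of signature matching chosen for illustration; real AV and IDS signatures are typically richer. The sample bytes and the names sha256_hex, BLACKLIST, and is_blacklisted are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical known-bad sample and the blacklist built from its hash.
known_malware = b"\x90\x90\xeb\xfe hypothetical malicious payload"
BLACKLIST = {sha256_hex(known_malware)}


def is_blacklisted(sample: bytes) -> bool:
    """Return True if the sample's hash matches a blacklist entry."""
    return sha256_hex(sample) in BLACKLIST


# An attacker appends a single padding byte to the same payload.
variant = known_malware + b"\x00"

print(is_blacklisted(known_malware))  # True: exact match with a known profile
print(is_blacklisted(variant))        # False: trivially modified variant is missed
```

The blacklist recognizes only what it has already profiled, so each new variant requires a new entry.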
The following U.S. patent documents are related to the present invention: U.S. Pat. Nos. 6,519,703; 6,671,811; 6,681,331; 6,772,345; 6,708,212; 6,801,940; and U.S. Publication Nos. 2002/0133586; 2002/0035628; 2003/0212903; 2003/0004688; 2003/0051026; 2003/0159070; 2004/0034794; 2003/0236652; 2004/0221191; 2004/0114519; 2004/0250124; 2004/0250134; 2004/0054925; 2005/0033989; 2005/0044406; 2005/0021740; 2005/0108393; 2005/0076236; and 2007/0261112.