1. Field of Art
The present invention generally relates to the field of content fingerprinting with application, for example, to the areas of data leakage prevention, anti-spam, URL filtering, and malicious code detection (anti-malware).
2. Description of the Related Art
The present invention may be applied to various areas of computer and information security, including the areas of data leakage prevention, malicious code detection (anti-malware), anti-spam, and URL filtering, for example.
Data leakage prevention systems are becoming more important for enterprise computing systems. Serious information leakage incidents have caused substantial losses and have damaged corporate images, and such incidents continue to occur one after another. In addition, regulations promulgated by governments require enterprises to properly protect their digital information from leaking.
Computer viruses, worms, Trojans, rootkits, and spyware are examples of malicious codes that have plagued computer systems throughout the world. Although the types of malicious code differ technically from one another, malicious codes are also collectively referred to as malware or “viruses.” Malware scanning or “antivirus” products for protecting computers against malicious codes are commercially available. Experienced computer users have installed some form of antivirus in their computers. A typical malware scanning (anti-malware) product includes a scan engine and a pattern file. The pattern file comprises patterns for identifying known malicious codes. To check a file for malicious code, the scan engine opens the file and compares its content to patterns in the pattern file. The pattern file needs to be updated to address newly discovered malicious codes. As the number of known malicious codes increases, so does the size of the pattern file. The larger the pattern file, the more memory and processing resources are consumed to perform malicious code scanning.
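By way of illustration only, the scan engine and pattern file arrangement described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any commercial product: the names Pattern, PATTERN_FILE, scan_content, and scan_file, as well as the two example signatures, are hypothetical, and matching is assumed to be a plain byte-substring search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str         # hypothetical label for a known malicious code
    signature: bytes  # byte sequence assumed to identify that code


# Hypothetical pattern file contents; a real pattern file is far larger
# and is updated as new malicious codes are discovered.
PATTERN_FILE = [
    Pattern("Example.TestA", b"\xde\xad\xbe\xef"),
    Pattern("Example.TestB", b"EVIL_MARKER"),
]


def scan_content(content: bytes, patterns=PATTERN_FILE) -> list[str]:
    """Return the names of all patterns whose signature occurs in content."""
    # Every pattern is compared against the content in turn.
    return [p.name for p in patterns if p.signature in content]


def scan_file(path: str, patterns=PATTERN_FILE) -> list[str]:
    """Open a file and compare its content to each pattern."""
    with open(path, "rb") as fh:
        return scan_content(fh.read(), patterns)
```

In this sketch every pattern is held in memory and compared against the content in turn, so both memory use and scanning work grow with the number of patterns, consistent with the resource concern noted above.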
Spam refers to unsolicited bulk messages which are undesirable and a nuisance. Forms of spam include e-mail spam, instant messaging spam, mobile phone messaging spam, and others. Techniques to prevent or combat spam may be referred to as anti-spam techniques.
Filtering of uniform resource locators (URLs) may be used to provide a protective mechanism against web-based security threats. In addition, URL filtering may be used for workplace compliance or parental supervision of web browsing.