As digitally stored data becomes more ubiquitous and more important, keeping that data secure becomes correspondingly more important. While companies and individuals seek to protect their data, other individuals, organizations, and corporations seek to exploit security holes in order to access that data and/or wreak havoc on the computer systems themselves. Generally, the different types of software that seek to exploit security holes can be termed “malware” and may be categorized into groups including viruses, worms, adware, spyware, and others.
Many different products have attempted to protect computer systems and their associated data from attack by malware. One approach is the use of anti-malware programs such as McAfee AntiVirus, McAfee Internet Security, and McAfee Total Protection. Some anti-malware programs rely on the use of malware signatures for detection. These signatures may be based on the identity of previously identified malware, on some hash of the malware file, or on another structural identifier.
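By way of illustration only, hash-based signature matching of the kind described above may be sketched as follows. The sketch assumes SHA-256 as the hash and an in-memory set as the signature store; neither choice is stated in this document, and the function names and sample bytes are hypothetical and do not reflect the implementation of any named product.

```python
import hashlib
from typing import AbstractSet


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the file contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_signature(data: bytes, known_signatures: AbstractSet[str]) -> bool:
    """Report whether the file's digest appears in the set of known signatures."""
    return sha256_hex(data) in known_signatures


# Hypothetical stand-in for the contents of a previously identified malware file.
known_sample = b"placeholder bytes for a previously identified malware file"

# Assumed signature store: hex digests of previously identified files.
known_signatures = frozenset({sha256_hex(known_sample)})

# An exact copy of the known file produces the same digest and is matched.
exact_copy_flagged = matches_signature(known_sample, known_signatures)

# A variant differing by a single appended byte produces a different digest.
variant_flagged = matches_signature(known_sample + b"\x00", known_signatures)
```

In this sketch an exact copy of the known file is matched, while a variant that differs by a single byte is not, because a cryptographic hash identifies only byte-identical content.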
This approach, however, relies on a constant effort to identify malicious computer files only after they have caused damage. Many existing approaches are neither predictive nor proactive in attempting to identify whether a computer file of unknown content may be related to a computer file of known content or to a category of computer files.
Additionally, the difficulty of identifying whether a computer file of unknown content is related to a computer file of known content, or belongs in a category of computer files, is not limited to malware. Other types of information security may depend on identifying whether an allegedly stolen computer file is actually related to an original computer file, a daunting proposition for assets such as source code that may run to hundreds of thousands of lines.