Existing techniques for detecting the presence of unauthorized programs are typically resource-intensive. For example, they generally require constant updates (e.g., of blacklists) and periodic or continuous scans for problems. The situation is exacerbated if the device being protected by such techniques has limited resources, such as limited memory or a battery as its power source. As one example, a device with limited resources may not be able to store definitions for detecting all known unauthorized programs. As another example, scanning for unauthorized programs is typically a power-intensive act, and may quickly deplete the battery of a battery-powered device. In some environments, a central authority is used to facilitate the discovery of unauthorized programs. One drawback of this approach is that it typically requires that the device being protected compile detailed logs of device activities. Generating such logs is resource-intensive (e.g., requiring large amounts of disk storage, processing power to assemble the log data, and bandwidth to deliver the log data to the central authority) and can also present privacy problems.
Existing techniques for detecting the presence of unauthorized programs are also generally vulnerable to attempts by such programs to cause incorrect reporting. For example, a rootkit can “listen in” to requests by applications to the operating system, and may modify these requests and their responses. If an application requests information about what processes are running, a malicious rootkit application can avoid detection by removing information about itself from the report that is returned by the operating system.
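By way of illustration only, the following minimal Python sketch simulates the reporting problem described above. It is a toy model and not actual rootkit or operating system code. All names (os_list_processes, hooked_list_processes, scanner_sees_threat, and the process names) are hypothetical and chosen solely for this example.

```python
from functools import partial

def os_list_processes():
    """Stand-in for the operating system's real process report (hypothetical data)."""
    return ["init", "browser", "rootkit_svc", "scanner"]

def hooked_list_processes(real_call, hidden):
    """Rootkit-style wrapper: forward the request, then strip the hidden entries."""
    return [name for name in real_call() if name not in hidden]

def scanner_sees_threat(list_processes, blacklist):
    """A naive detector that trusts whatever process report it is given."""
    return any(name in blacklist for name in list_processes())

blacklist = {"rootkit_svc"}

# Unmodified report: the blacklisted process is visible to the detector.
honest_result = scanner_sees_threat(os_list_processes, blacklist)

# Intercepted report: the rootkit removes itself before the detector sees the list.
intercepted = partial(hooked_list_processes, os_list_processes, {"rootkit_svc"})
hooked_result = scanner_sees_threat(intercepted, blacklist)
```

In this sketch the detector's logic is unchanged in both cases; only the report it relies on differs, which is why a detector running on the compromised device cannot, by itself, trust that report.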
Existing techniques for screening against the installation or execution of unauthorized programs are also known to be vulnerable to new instances of malware that may not immediately be detectable due to a lack of information about their structure and functionality. Therefore, and irrespective of the resources available to the device, if the unauthorized program is sufficiently sophisticated and/or has not previously been encountered, it can evade detection and cause undetected harm. And, if the unauthorized program has intentionally been installed by the user to bypass detection (e.g., to facilitate software piracy), traditional techniques may fail to locate the unauthorized program, or any other unauthorized activities.
The generation of knowledge about the structure and functionality of malware, and the associated generation of countermeasures, is traditionally a labor-intensive effort. There is a long-felt need to increase the degree of automation of this process, e.g., by automatically determining which machines exhibit anomalous behavior and the apparent requirements associated with the malware causing such anomalous behavior.
The ability of traditional tools to project the spread patterns of malware is limited and not automated. This is a significant drawback, since the ability to automatically generate trend prognoses for the spread of malware permits more effective distribution of countermeasures when resources are limited, such as the capacity to generate malware antidotes and to communicate them to a large number of networked devices whose need for the antidotes may differ greatly.
It is therefore the object of the invention to provide a server-side system that detects and classifies malware and other types of undesirable processes and events operating on network-connected devices through the analysis of information collected from said network-connected devices. The system receives information over a network connection and collects information that is identified as being anomalous. The collected information is analyzed by a system process that can group data based on optimally suited cluster analysis methods. Upon clustering the information, the system can correlate an anomalous event with device status, interaction, and various elements that constitute environmental data in order to identify a pattern of behavior associated with a known or unknown strain of malware. The system further interprets the clustered information to extrapolate propagation characteristics of the strain of malware and determine a potential response action.
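By way of illustration only, the server-side flow described above (collect anomalous reports, cluster them, correlate each cluster with environmental data, and extrapolate propagation) might be sketched as follows. This is a minimal sketch under stated assumptions and not the claimed implementation. The Report fields, the indicator labels, the greedy Jaccard-based clustering, the 0.5 threshold, and the response strings are all hypothetical choices made for the example; the description above does not prescribe a particular cluster analysis method.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Report:
    device_id: str
    day: int               # day on which the anomaly was reported
    indicators: frozenset  # anomalous observations (hypothetical labels)
    environment: tuple     # (key, value) pairs describing device status

def jaccard(a, b):
    """Similarity of two indicator sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def cluster_reports(reports, threshold=0.5):
    """Greedy clustering: join the first cluster whose seed report is similar enough."""
    clusters = []
    for report in reports:
        for cluster in clusters:
            if jaccard(report.indicators, cluster[0].indicators) >= threshold:
                cluster.append(report)
                break
        else:
            clusters.append([report])  # no match: start a new cluster
    return clusters

def common_environment(cluster):
    """Environment attributes shared by every report in the cluster."""
    shared = set(cluster[0].environment)
    for report in cluster[1:]:
        shared &= set(report.environment)
    return dict(shared)

def new_devices_per_day(cluster):
    """Count devices by the first day on which each one reported the anomaly."""
    first_seen = {}
    for report in cluster:
        first_seen[report.device_id] = min(report.day, first_seen.get(report.device_id, report.day))
    return dict(sorted(Counter(first_seen.values()).items()))

def suggest_action(per_day):
    """Hypothetical rule: escalate if the latest day shows more new devices than the first."""
    days = sorted(per_day)
    if len(days) >= 2 and per_day[days[-1]] > per_day[days[0]]:
        return "prioritize countermeasure distribution"
    return "monitor"

# Hypothetical anomalous reports received from four devices.
reports = [
    Report("a", 1, frozenset({"slow_memory_scan", "unknown_process"}),
           (("os", "v2"), ("store", "third_party"))),
    Report("b", 2, frozenset({"slow_memory_scan", "unknown_process", "sms_burst"}),
           (("os", "v2"), ("store", "third_party"))),
    Report("c", 2, frozenset({"slow_memory_scan", "unknown_process"}),
           (("os", "v2"), ("store", "official"))),
    Report("d", 1, frozenset({"battery_drain"}),
           (("os", "v1"), ("store", "official"))),
]

clusters = cluster_reports(reports)
for cluster in clusters:
    per_day = new_devices_per_day(cluster)
    print(common_environment(cluster), per_day, suggest_action(per_day))
```

In this sketch, devices a, b, and c fall into one cluster whose only shared environmental attribute is the operating system version, which is the kind of correlation that could point to a common strain. A practical system would presumably select the clustering method to suit the data and would use a richer propagation model than a comparison of daily counts.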