To analyze a program and determine whether it is malicious or benign, it is necessary to obtain a copy (i.e., a sample) of the program. This is often a problem, even on established platforms (like Windows), due to the large (and growing) size of software (both clean and dirty). It can be an even bigger problem for mobile or embedded devices, which have limited resources (e.g., CPU, battery, bandwidth) and intermittent connectivity, and for which lengthy transmissions can hurt the user experience and/or increase carrier costs for the user.
There are currently documented solutions for submitting “slices” of programs from multiple devices (e.g., in a peer-to-peer (P2P) fashion) to a server dictionary for analysis, but they only work if a program is sufficiently common that the “submission load” can be distributed among multiple clients.
With the proliferation of targeted malware (and, more generally, with the scale of attacks shrinking, where fewer targets are affected by the same piece of malware), there is an increasing need to submit unique samples in full and quickly while minimizing the bandwidth consumption.
Submitting an entire sample from a mobile device with intermittent connectivity (or from an embedded device over a temporary link, e.g., in a “data muling” scenario) may often be problematic. For example, disconnection in the middle of a transmission (e.g., when the data mule leaves the range of the access point) may end up invalidating the entire sample submission. Thus, it would be valuable to have a more efficient scheme for submitting sample information for analysis at a server site.
The subject matter of the present disclosure is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above. To address these and other issues, techniques that, in part, use adaptive and/or recursive filters to intelligently determine portions of samples to send to a server for analysis are described herein.