CPC G06F 21/54 (2013.01) [G06F 21/566 (2013.01); G06F 2221/033 (2013.01)] | 20 Claims |
1. A computer-implemented method for programmatically identifying executable code within a file, the method comprising:
accessing, by a computer system, a sequence of bytes from a portion of the file;
extracting, by the computer system, from the sequence of bytes, a number of n-grams, wherein each n-gram comprises a contiguous series of bytes in the sequence of bytes, and wherein the contiguous series of bytes of each respective n-gram comprises n number of bytes;
generating, by the computer system, an array of counters, each counter of the array associated with one of the n-grams, wherein each counter comprises an integer value based on a frequency of occurrence of the associated n-gram within the sequence of bytes; and
applying, by the computer system, a predictive model to the array of counters to determine a probability that the sequence of bytes comprises executable code, wherein the computer system comprises a computer processor and an electronic storage medium.
|
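The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim specifies only "a predictive model", so the logistic model used here is an assumption, as are the fixed n-gram vocabulary, the weights, the bias, and every function name. The weights are hypothetical placeholders, not trained values.

```python
import math
from collections import Counter


def extract_ngrams(data: bytes, n: int) -> list[bytes]:
    """Return every contiguous n-byte window of the byte sequence."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def counter_array(data: bytes, n: int, vocabulary: list[bytes]) -> list[int]:
    """Build one integer counter per vocabulary n-gram.

    The fixed vocabulary is an assumption; the claim only requires that
    each counter be associated with one of the n-grams.
    """
    counts = Counter(extract_ngrams(data, n))
    return [counts[gram] for gram in vocabulary]


def probability_executable(counters: list[int], weights: list[float], bias: float) -> float:
    """Apply an assumed logistic model to the counter array."""
    z = bias + sum(w * c for w, c in zip(weights, counters))
    # Numerically stable sigmoid for both signs of z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical input: eight bytes resembling a repeated x86-64 function prologue.
sample = bytes([0x55, 0x48, 0x89, 0xE5, 0x55, 0x48, 0x89, 0xE5])
vocab = [b"\x55\x48", b"\x48\x89", b"\x00\x00"]   # illustrative 2-gram vocabulary
weights = [0.5, 0.5, -1.0]                          # placeholder weights, not trained
bias = -1.0

counters = counter_array(sample, 2, vocab)
prob = probability_executable(counters, weights, bias)
print(counters, round(prob, 4))
```

In this sketch the counters hold raw occurrence counts. The claim requires only an integer value "based on" the frequency, so a capped or bucketed count would fit the same structure.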