(1) Field of Invention
The present invention relates to a cyber security system and, more particularly, to a cyber security system capable of automatic, fast, and accurate identification of cyber attackers and their weapons from evidence left behind through forensic analysis of digital artifacts.
(2) Description of Related Art
Cyber security has become an increasingly important issue in the modern digital world. One important aspect of addressing the problem of cyber security is to identify and infer connections between objects involved in attacks (e.g., suspicious events, resources, victims, suspects), and to manage dynamic information for use in forensically predicting intents and actions. The modeling and analysis of such relations have recently emerged in the fields of network science, data mining, and malware analysis. The sheer volume of raw audit data generated by conventional system and network monitors almost precludes the deployment of intelligent sensors that not only have programmed knowledge models for the systems they are monitoring, but also have some method of learning new models to adapt to changes in context or observed attack vectors. Moreover, attacker attribution, the process of establishing concrete relationships between suspects and the evidence they leave behind, is extremely difficult. While attacks may share significant similarities, it may be difficult to correlate a known (and attributed) attack to a newly observed attack based, for example, solely on a comparison of temporally ordered events.
Review of evidence left behind is a form of forensic analysis that is typically associated with criminal actions. Such analysis can be applied to cyber security using phylogenomic theories. In other words, digital artifacts, like the humans who create them, contain uniquely identifiable internal codes (genes) and externally visible traits (phenes) that provide clues as to who or what created them (provenance), how they evolve over time (heredity), and how they are related (lineage). Although much of the work on phylogenomics has occurred in the biological domain, a handful of approaches have begun applying it to computer malware.
By way of example, Valverde and Zimmerman have adopted motifs for software analysis and fault prediction; however, their motifs were based on relations extracted from source code and have not been applied to malware or other binary artifacts (see the List of Cited Literature References, Literature Reference Nos. 15 and 18).
Other examples were provided by Carrera, Ghorghescu, and Karim, who describe that multiple feature relations can be collapsed into a single distance measure to generate binary trees, but they did not explicitly model multiple inheritance in malware (see Literature Reference Nos. 4, 7, and 10). Further, although Goldberg demonstrated a primitive method of modeling multiple inheritance, there was no demonstration of practical results (see Literature Reference No. 8).
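By way of a non-limiting illustration only, the general idea of collapsing multiple feature relations into a single distance measure and generating a binary tree from it can be sketched as follows. The sketch is not taken from any of the cited references. The feature types ("imports", "strings"), the weights, the sample names, and the choice of Jaccard distance with average-linkage clustering are all assumptions made for the example.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def combined_distance(x, y, weights):
    """Collapse the per-feature-type distances into one weighted average."""
    total = sum(weights.values())
    return sum(w * jaccard_distance(x[k], y[k]) for k, w in weights.items()) / total


def build_binary_tree(samples, weights):
    """Average-linkage agglomerative clustering.

    Returns the tree as nested 2-tuples of sample names.
    """
    # Each cluster is a pair: (subtree, list of member sample names).
    clusters = [(name, [name]) for name in samples]

    def linkage(c1, c2):
        # Mean combined distance over all cross-cluster sample pairs.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(
            combined_distance(samples[a], samples[b], weights) for a, b in pairs
        ) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters into one new internal node.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical artifacts, each described by two assumed feature types.
samples = {
    "sample_a": {"imports": {"CreateFile", "WriteFile"}, "strings": {"cmd.exe"}},
    "sample_b": {"imports": {"CreateFile", "WriteFile", "Connect"}, "strings": {"cmd.exe"}},
    "sample_c": {"imports": {"RegOpenKey"}, "strings": {"explorer.exe"}},
}
weights = {"imports": 0.5, "strings": 0.5}  # assumed equal weighting

tree = build_binary_tree(samples, weights)
```

In a tree of this kind, every artifact and every internal node has exactly one parent, so an artifact that borrows features from two unrelated ancestors cannot be represented as such. That is the multiple-inheritance limitation noted above.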
While the aforementioned examples touched on phylogenomic theories, they each exhibit limitations and leave significant gaps that prevent them from being complete solutions. Specifically, the prior art fails to: apply motif identification and analysis techniques with a star-net schema to process raw features extracted from digital artifacts; identify hierarchical feature distributions from an artifact relation network to establish lineage relations; determine heredity by finding shared feature sets using an artifact relation network and evolution models; and infer provenance relations by probabilistic graphical analysis techniques that treat author and development environments as missing/hidden values.
Thus, a continuing need exists for a complete cyber security system that bridges the aforementioned gaps and is capable of automatic, fast, and accurate identification of cyber attackers and their weapons from evidence left behind through forensic analysis of digital artifacts.