Embodiments described herein relate generally to efficiently processing string data structures, and more particularly to methods and apparatus for detecting whether a string of characters represents malicious activity (e.g., using machine learning).
In some known systems, string data structures can provide insight as to whether or not an artifact is malicious. For example, some known systems can process a string to predict whether or not an artifact is malicious. Such known systems, however, typically have difficulty determining whether relatively short strings relate to a malicious artifact. Additionally, such known systems can require the use of multiple models, each corresponding to a different analysis of the string, to determine whether or not the string indicates that the artifact is malicious. Further, in such known systems, an analyst typically specifies what in the string would indicate that the artifact is malicious. As a result, such known systems may not have the ability to learn malicious characteristics of artifacts.
Accordingly, a need exists for methods and apparatus that can process strings related to artifacts, without the use of multiple resource-intensive models, and without manual coding of malicious indicators.