The present invention generally relates to detecting threats and malicious activity in computer networks.
Computer networks such as the Internet are increasingly ubiquitous. Given the importance and widespread use of computer networks, it is unsurprising that malicious actors try to carry out malicious acts over such computer networks. Such malicious acts can take a variety of forms, e.g. a cyber-attack such as a distributed denial of service attack, or a disinformation campaign over social media using bots. These malicious acts can be carried out directly by a human user, or indirectly, e.g. via a virus, bot, or other software.
Software is not inherently benign or malicious. There are many legitimate reasons for actions on a computer, and only intent differentiates between acceptable activity and a malicious act.
Conventional approaches to detecting malicious activity involve inferring intent via surrogates: distinguishing benign from malicious, normal from anomalous, good from bad. (See, e.g., P. Manadhata, S. Yadav, P. Rao and W. Horne, Detecting Malicious Domains via Graph Inference, in ESORICS, 2014; G. Jackson, Predicting Malicious Behavior: Tools and Techniques for Ensuring Global Security, 2012; Y. Chen and W. Chu, Database Security Protection via Inference Detection, 2006.)
However, some research has come to the conclusion that, for example, there is no functional difference between modern software such as the Chrome browser and advanced persistent threats. (T. Heberlein, The Advanced Persistent Threat You Have: Google Chrome, 2015.)
A number of potential issues have been identified with current approaches for detecting malicious activity.
One potential issue relates to detecting malware in the face of execution divergence, e.g. when malware acts differently depending upon whether it is in an analysis environment or on the target computer. This defeats defensive inspection in sandboxed or virtualized environments. (See, e.g., A. Dinaburg, P. Royal, M. Sharif and W. Lee, Ether: malware analysis via hardware virtualization extensions, in ACM Conference on Computer and Communications, 2008.)
Another potential issue relates to breaking software analysis systems. Frequently, offensive actors will expend significant resources to analyze defensive systems, identify their weaknesses, and ruthlessly exploit them to evade detection. (See, e.g., A. Greenberg, How an entire nation became Russia's test lab for Cyberwar, 20 Jun. 2017. [Online]. Available: https://www.wired.com/story/russian-hackers-attack-ukraine. [Accessed 17 Aug. 2017].)
Another potential issue relates to malware acting where there is no analysis. One of the classic offensive tricks has been to bury malware deeper than where the analysis systems live, such as in the kernel, the hypervisor, or even hardware peripherals, such that its actions are unobservable. (See, e.g., Z. Cheng, Kernel Malware, February 2013. [Online]. Available: https://www.cs.bu.edu/~goldbe/teaching/HW55813/zhou.pdf [Accessed 17 Aug. 2017]; K. Zetter, How the NSA's Firmware Hacking Works and Why it's so Unsettling, 22 Feb. 2015. [Online]. Available: https://www.wired.com/2015/02/nsa-firmware-hacking/. [Accessed 17 Aug. 2017].)
Another potential issue relates to malware's use of legitimate host communications application programming interfaces (APIs). One of the classic defensive tricks was to detect when software appeared to be something other than what it is. For example, a purported web browser using the OpenSSL crypto library for secure hypertext transfer protocol (HTTPS) traffic would immediately be identifiable as merely pretending to be a browser, since Firefox uses libnss (see, e.g., Mozilla Foundation, Network Security Services, 1 May 2016. [Online]. Available: https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS. [Accessed 17 Aug. 2017]), Chrome uses boringssl (see, e.g., Google, Boringssl, 17 Aug. 2017. [Online]. Available: https://boringssl.googlesource.com/boringssl/. [Accessed 17 Aug. 2017]), and the Edge/Internet Explorer family uses the WinHTTP library (see, e.g., Microsoft, SSL in WinHTTP, [Online]. Available: https://msdn.microsoft.com/en-us/library/windows/desktop/aa384076%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396. [Accessed 17 Aug. 2017]) and the SChannel library (see, e.g., Microsoft, Secure Channel, [Online]. Available: https://msdn.microsoft.com/en-us/library/windows/desktop/aa380123(v=vs.85).aspx. [Accessed 17 Aug. 2017]). The offense knows this and reuses the operating system's own communications APIs so that its malware is indistinguishable from native software. (See, e.g., D. Choi, The Stealthy Downloader, 28 Oct. 2013. [Online]. Available: https://blog.fortinet.com/2013/10/28/the-stealthy-downloader. [Accessed 17 Aug. 2017].)
Another potential issue relates to trusted third-party intermediaries for network communications. In 2017, the use of English-looking Britney Spears Instagram discussion threads for malware command and control (C2) was publicly attributed to Russian actors. (See, e.g., M. Locklear, Russian malware link hid in a comment on Britney Spears' Instagram, 7 Jun. 2017. [Online]. Available: https://www.engadget.com/2017/06/07/russian-malware-hidden-britney-spears-instagram/. [Accessed 17 Aug. 2017].) Likewise, malware C2 has used Facebook's APIs (see, e.g., M. Van Pelt, Reconnaissance and impersonation pay off for cyber criminals, 26 Jul. 2017. [Online]. Available: https://barracudalabs.com/2011/03/how-to-use-facebooks-opengraph-api-to-spread-malware/. [Accessed 17 Aug. 2017]), Gmail (see, e.g., A. Greenberg, Hackers are using gmail drafts to update their malware and steal data, 29 Oct. 2014. [Online]. Available: https://www.wired.com/2014/10/hackers-using-gmail-drafts-update-malware-steal-data/. [Accessed 17 Aug. 2017]), Twitter (see, e.g., B. Barth, Twitoor first Android malware known to leverage Twitter for command and control, 24 Aug. 2016. [Online]. Available: https://www.scmagazine.com/twitoor-first-android-malware-known-to-leverage-twitter-for-command-and-control/article/530184/. [Accessed 17 Aug. 2017]), Dropbox (see, e.g., FireEye Threat Intelligence, China-based Cyber Threat Group Uses Dropbox for Malware Communications and Targets Hong Kong Media Outlets, 1 Dec. 2015. [Online]. Available: https://www.fireeye.com/blog/threat-research/2015/11/china-based-threat.html. [Accessed 17 Aug. 2017]), and Skype (see, e.g., O. Sultan, Cyber Criminals Running Sophisticated Malware Campaign Via Skype, 11 Jun. 2016. [Online]. Available: https://www.hackread.com/skype-distributing-malware/. [Accessed 17 Aug. 2017]).
The salient point is not just that the offense will hide in elusive spots, but that the offense is willing to adapt constantly to both the changing Internet landscape and the defense's ability to observe.
Another potential issue relates to the use of a unique hash per binary. Antivirus software, generically called Personal Security Products (PSPs), commonly calculates the cryptographic hash of files for comparison against known malicious or known benign lists. Generally, files on the known benign list do not undergo additional analysis. (See, e.g., S. McDonald, “Why AV is Dead, and what to do about it,” 4 Aug. 2015. [Online]. Available: https://www.herjavecgroup.com/why-av-is-dead-and-what-to-do-about-it/. [Accessed 17 Aug. 2017].) When building targeted implants, malware authors stamp the C2 information, as well as a random number, directly into the binary such that each deployment has a unique hash and therefore matches no entry on a known malicious list.
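The effect of per-deployment stamping on a hash-list lookup can be illustrated with a minimal sketch. The function names, the placeholder template bytes, and the C2 host name below are hypothetical and chosen only for illustration; SHA-256 is assumed as the hash, although the hash actually used varies by product.

```python
import hashlib
import os


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a file's contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def classify(data: bytes, known_bad: set, known_good: set) -> str:
    """Hypothetical list lookup: the verdict depends only on the file hash."""
    digest = sha256_hex(data)
    if digest in known_bad:
        return "malicious"
    if digest in known_good:
        return "benign"
    return "unknown"


def stamp(template: bytes, c2: bytes, nonce: bytes) -> bytes:
    """Append C2 information and a random number to an otherwise fixed body."""
    return template + c2 + nonce


# Placeholder bytes standing in for an implant body (an assumption).
TEMPLATE = b"placeholder-implant-body"
C2 = b"c2.example.invalid"

first = stamp(TEMPLATE, C2, os.urandom(16))
second = stamp(TEMPLATE, C2, os.urandom(16))

# The defender has catalogued only the first deployment.
known_bad = {sha256_hex(first)}

verdict_first = classify(first, known_bad, set())
verdict_second = classify(second, known_bad, set())
print(verdict_first, verdict_second)
```

Under these assumptions the second deployment, which differs from the first only in its random stamp, matches neither list.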
Another potential issue relates to hash collisions. As discussed above, PSPs will whitelist programs with a known hash. Recent advances in breaking cryptographic hashes (see, e.g., T. Xie and D. Feng, Construct MD5 Collisions Using Just a Single Block of Message, 2010; A. Kuznetsov, An algorithm for MD5 single-block collision attacks using high-performance computing cluster; Y. Sasaki and A. Kazumaro, Finding Preimage in Full MD5 Faster Than Exhaustive Search, 2009; M. Mao, S. Chen and J. Xu, Construction of the Initial Structure for Preimage Attack of MD5, in International Conference on Computational Intelligence and Security, 2009) allow an adversary to create two files with the same hash: one benign file to become whitelisted and one file for more nefarious purposes. Even an empty file still has a hash value (MD5: d41d8cd98f00b204e9800998ecf8427e), which most security software whitelists. It is expected that, once the public cryptologic community is able, many top-tier threat actors will deploy malware that hashes to the same value as an empty file. The authors of the Flame malware are known to have used a hash collision attack beyond the public state of the art. (See, e.g., D. Fisher, Microsoft Details Flame Hash-Collision Attack, 6 Jun. 2012. [Online]. Available: https://threatpost.com/microsoft-details-flame-hash-collision-attack-060612/76658/. [Accessed 17 Aug. 2017]; A. Sotirov, Analyzing the MD5 collision in Flame, June 2012. [Online]. Available: https://trailofbits.files.wordpress.com/2012/06/flame-md5.pdf. [Accessed 17 Aug. 2017].)
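The empty-file hash value mentioned above can be reproduced with a short sketch. The `is_whitelisted` helper and the whitelist contents are hypothetical; the point illustrated is only that a lookup keyed on the digest cannot tell apart any two files that share that digest.

```python
import hashlib

# MD5 digest of zero bytes of input, i.e. of an empty file.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()
print(EMPTY_MD5)  # d41d8cd98f00b204e9800998ecf8427e


def is_whitelisted(digest: str, whitelist: set) -> bool:
    """Hypothetical check: only the digest is consulted, never the contents."""
    return digest in whitelist


# Assumed whitelist that contains the empty-file digest.
whitelist = {EMPTY_MD5}

# Any file whose MD5 equals EMPTY_MD5 would pass this check, whatever it contains.
empty_file_passes = is_whitelisted(hashlib.md5(b"").hexdigest(), whitelist)
other_file_passes = is_whitelisted(hashlib.md5(b"not empty").hexdigest(), whitelist)
```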
Another potential issue relates to analysis tarpits. The hashes of zero-byte files, and other analytically useless indicators, are frequently associated with threat actors, such as in CrowdStrike's FancyBear report on the Democratic National Committee compromise (see, e.g., Contagio, APT29 Russian APT including Fancy Bear, 31 Mar. 2017. [Online]. Available: http://contagiodump.blogspot.com/2017/03/part-ii-apt29-russian-apt-including.html. [Accessed 17 Aug. 2017]) and the United States Computer Emergency Readiness Team's (US-CERT) Grizzly Steppe report, which included TOR exit nodes as Indicators of Compromise (IOCs) (see, e.g., US-CERT, “GRIZZLY STEPPE – Russian Malicious Cyber Activity,” 6 Apr. 2017. [Online]. Available: https://www.us-cert.gov/security-publications/GRIZZLY-STEPPE-Russian-Malicious-Cyber-Activity. [Accessed 17 Aug. 2017]). These tarpits are an asymmetric advantage for the offense: because such indicators also appear in connection with many independent, unrelated parties, they cause false linkages in graph analysis and unpredictable results when incorporated into machine learning algorithms.
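The false-linkage effect can be sketched with a toy example. The incident names, the indicator strings other than the empty-file MD5, and the `linked_pairs` helper are all hypothetical; the sketch simply links any two reports that share an indicator, which is one simple form of graph analysis and not necessarily what any particular tool does.

```python
from collections import defaultdict
from itertools import combinations

# MD5 of a zero-byte file, used here as an analytically useless indicator.
EMPTY_MD5 = "d41d8cd98f00b204e9800998ecf8427e"


def linked_pairs(reports: dict, ignore: frozenset = frozenset()) -> set:
    """Return pairs of report names that share at least one indicator."""
    by_indicator = defaultdict(set)
    for name, indicators in reports.items():
        for indicator in indicators:
            if indicator not in ignore:
                by_indicator[indicator].add(name)
    pairs = set()
    for names in by_indicator.values():
        # Every pair of reports listing the same indicator becomes an edge.
        pairs.update(combinations(sorted(names), 2))
    return pairs


# Three unrelated, made-up incident reports.
reports = {
    "incident_a": {EMPTY_MD5, "hash_a1"},
    "incident_b": {EMPTY_MD5, "hash_b1"},
    "incident_c": {"hash_c1"},
}

naive = linked_pairs(reports)
filtered = linked_pairs(reports, ignore=frozenset({EMPTY_MD5}))
print(naive)
print(filtered)
```

In this toy data the only edge in the naive graph comes from the shared empty-file hash; excluding that indicator removes the edge.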
Another potential issue relates to situations where professional offense doesn't crash a target. The academic literature on exploitation generally focuses on gaining program control. (See, e.g., R. Buchanan, R. Roemer, S. Savage and H. Shacham, Return-oriented Programming: Exploitation without Code Injection, in Blackhat, 2008; S. El-Sherei, Return-Oriented-Programming (ROP FTW), [Online]. Available: https://www.exploit-db.com/docs/28479.pdf. [Accessed 17 Aug. 2017]; A. One, Smashing the stack for fun and profit, in Phrack, 1996.) In practice, the more challenging problem is what is known in the art as “continuation of execution”: making sure the exploited target continues operating normally such that there is no outward indication of compromise. The offense conducts significant operational fingerprinting of a target and adapts exploits and implants in real-time or in short order. Advanced threat groups are held to a 93-95% minimum success rate, with the preponderance of failures expected to be detected when fingerprinting a target so that a failing exploit or implant is never even attempted.
Another potential issue relates to pervasive encryption. The only functionality that must exist unencrypted is that needed to receive and load encrypted payloads. If Internet connectivity is assumed, even the payload decryption key can be stored remotely. Everything else (data gathering, effects, spreading, etc.) can be sent over the network, loaded directly into memory, and destroyed after use. (See, e.g., M. Polychronakis and M. Meier, Detection of Intrusions and Malware, and Vulnerability Assessment, Springer, 2017.) For instance, the Gauss encrypted modules have evaded analysis since 2012. (See, e.g., K. Zetter, Researchers seek help cracking Gauss mystery payload, 14 Aug. 2012. [Online]. Available: https://www.wired.com/2012/08/gauss-mystery-payload. [Accessed 17 Aug. 2017].)
Needs exist for improvement in detecting threats and malicious activity in computer networks. These and other needs are addressed by one or more aspects of the present invention.