This disclosure relates to identifying malicious activity.
Malicious code typically attempts to exploit security loopholes on various devices connected to the Internet in an effort to, for example, replicate itself, steal information, or gather information for an attack. In order to identify vulnerable machines, such malicious code can select a target Internet protocol (IP) address at random or can scan through one or more ranges of IP addresses.
The malicious code can issue connection requests using various protocols in an attempt to identify the particular services provided by the target IP address. Having identified those services, the malicious code can attempt to exploit them using attack techniques specific to each service and its version.
When selecting targets in this way in an attempt to gain access to networks, the malicious code often attempts to access a “darknet.” A darknet can be defined as a set of IP addresses that are either unassigned or unused. Darknets typically receive traffic for only one of three reasons: accident or mistake, backscatter, or malicious scanning.
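The definition above can be illustrated with a minimal sketch that tests whether a destination address falls inside a darknet. The block list and the function name `is_darknet_address` are hypothetical. The two ranges are reserved documentation blocks standing in for address space that an operator knows to be unassigned or unused.

```python
import ipaddress

# Hypothetical darknet blocks: reserved documentation ranges used here
# only as stand-ins for unassigned or unused address space.
DARKNET_BLOCKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_darknet_address(address: str) -> bool:
    """Return True if the address lies inside any configured darknet block."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in DARKNET_BLOCKS)
```

Under these assumptions, any request whose destination passes this check is traffic that no legitimate service was expected to receive.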
Accidental requests typically account for only a small percentage of requests to darknets. Humans generally use uniform resource locators (URLs) rather than IP addresses when accessing web-based resources. Therefore, an accidental darknet request would occur only if a human mistyped an IP address or if the URL used had an incorrect Domain Name System (DNS) entry that pointed to a darknet address.
Backscatter generally indicates malicious activity that uses spoofing techniques (attempts to obscure the source of a request). For example, a denial of service attack could be conducted from a randomly spoofed source address using Transmission Control Protocol (TCP). When the host at the targeted IP address responds to the initial request in an attempt to complete a three-way handshake, the acknowledgement it sends to the spoofed source IP address would be deemed backscatter. However, backscatter would be expected to constitute a small portion of overall traffic, because spoofing is limited to attacks, such as denial of service, that do not require a reliable communication channel to be established. Thus, spoofed source addresses tend to be used in attacks leveraging unreliable protocols such as User Datagram Protocol (UDP) or Internet Control Message Protocol (ICMP), which would not generate backscatter.
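The TCP example above suggests one rough way to separate backscatter from scanning among packets arriving at a darknet address. The following is a minimal sketch, not a complete classifier. The `TcpPacket` type, the function name, and the three labels are hypothetical. It assumes that a packet with both SYN and ACK set is a handshake reply to a spoofed request, and that a bare SYN is a new connection attempt.

```python
from dataclasses import dataclass

# Standard TCP header flag bits.
SYN = 0x02
ACK = 0x10


@dataclass
class TcpPacket:
    """Hypothetical minimal view of a TCP packet seen at a darknet address."""
    src: str
    dst: str
    flags: int


def classify_darknet_tcp(packet: TcpPacket) -> str:
    """Label a packet as 'backscatter', 'scan', or 'other' from its flags."""
    syn = bool(packet.flags & SYN)
    ack = bool(packet.flags & ACK)
    if syn and ack:
        # Reply to a handshake the darknet address never started:
        # consistent with someone spoofing this address as a source.
        return "backscatter"
    if syn:
        # Unsolicited connection attempt, as a scanner would send.
        return "scan"
    return "other"
```

Accidental requests would also appear as bare SYN packets, so the "scan" label in this sketch is only a first approximation.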
Automated scanning, such as that used by malicious code, is the origin of the majority of traffic to darknets. The malicious code responsible for such scanning commonly selects target IP addresses and address ranges at random. Such scanning typically does not attempt to avoid darknet addresses.
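Because a scanner that picks targets at random tends to touch many different darknet addresses, one simple heuristic is to count distinct darknet destinations per source. The sketch below is illustrative only. The function name `likely_scanners`, the event format, and the default threshold of three targets are assumptions, not values taken from this disclosure.

```python
from collections import defaultdict


def likely_scanners(events, min_targets=3):
    """Return sources that contacted at least min_targets distinct darknet addresses.

    events is an iterable of (source_ip, darknet_destination_ip) pairs.
    The threshold is a hypothetical tuning parameter.
    """
    targets = defaultdict(set)
    for src, dst in events:
        targets[src].add(dst)
    return {src for src, dsts in targets.items() if len(dsts) >= min_targets}
```

A source that repeatedly hits a single darknet address is not flagged, which fits the earlier point that a mistyped address or a stale DNS entry usually leads to one wrong destination rather than many.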
In an attempt to identify potentially malicious activity, researchers accumulate darknet address ranges and deploy one or more machines assigned darknet addresses to collect traffic destined for the darknet. However, such solutions are often impractical because of the expenditure required to obtain and manage large blocks of darknet addresses. Moreover, because IP addresses are a finite resource, efforts have been made to discourage the non-use of large blocks of IP addresses. For example, the American Registry for Internet Numbers (ARIN) has drafted a proposal to require entities with IP address blocks larger than a specified size to have greater than 50% utilization or risk having all or a portion of their block rescinded.