Individuals and organizations seek to protect their computing resources from attackers and malicious software. To achieve that goal, security vendors may attempt to identify samples of software and categorize those samples as malicious. These security vendors may index the different samples according to unique identifying features, such as hashes.
Unfortunately, attackers have responded to these security measures by modifying the software, through polymorphism, in a manner that avoids detection (e.g., by altering the hash) while preserving the software's malicious functionality. The security vendors may attempt to keep up with these modifications by manually coding each detected variant as malware. Nevertheless, these attempts to manually code each detected variant as malware are inefficient, not comprehensive, and prone to human error. Accordingly, the instant disclosure identifies and addresses a need for additional and improved systems and methods for identifying variants of samples based on similarity analysis.
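
By way of illustration only, the following minimal Python sketch shows why a hash-based index may fail to recognize a variant: appending a single byte to a sample changes its SHA-256 digest, even though the two samples remain almost identical in content. The byte strings are hypothetical stand-ins for real samples, and the similarity score from the standard library's difflib module is an assumed placeholder for a similarity measure, not the analysis described in the instant disclosure.

```python
import hashlib
from difflib import SequenceMatcher

# Hypothetical stand-ins for two software samples; real samples would be file contents.
original = b"\x90\x90PAYLOAD-ROUTINE\x00" * 4
variant = original + b"\x00"  # the variant differs only by one appended padding byte


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a sample as a hex string."""
    return hashlib.sha256(data).hexdigest()


# An index keyed on exact hashes treats the two samples as unrelated.
hashes_match = sha256_hex(original) == sha256_hex(variant)

# Assumed placeholder similarity measure: ratio of matching bytes (0.0 to 1.0).
similarity = SequenceMatcher(None, original, variant).ratio()

print("hashes match:", hashes_match)
print("byte-level similarity:", round(similarity, 3))
```

In this sketch the exact-match lookup misses the variant while the similarity score stays close to 1.0, which is the gap that similarity-based identification of variants is intended to close.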