Web search has become an indispensable means of obtaining information. However, by crafting specific search queries, attackers can extract from search engines information that reveals the existence and location of security flaws, such as misconfigured servers, password files, and vulnerable software. The same technique can be used to harvest other sensitive data, such as email addresses. As such, the amount of malicious Web search traffic has been increasing. Search bots submit malicious searches to identify victims for spreading infections, supporting phishing attacks, determining compromised legitimate domains, spamming, and launching Denial of Service (DoS) attacks. Some of these search bots are stealthy.
Although approaches exist for detecting such attacks, their correlation with Web search is not well understood. Determining this correlation is challenging for two reasons: search logs contain massive amounts of mixed data from normal users and attackers, and most of the malicious queries used by attackers are previously unknown and can change frequently.