As more and more people rely on the wealth of information available online, increased exposure on the Web may yield significant financial gains for individuals or organizations. The practice of Web spamming has grown along with the significance of Web presence. In general, Web spamming may be broadly defined as actions intended to acquire or assert a presence on the Web that is either undeserved by the webpage or undesired by the viewer. Web spamming is a serious problem for Web users because they may be unaware of the practice and tend to trust search results based on the general reputation of the search engine used.
One significant category of Web spamming includes actions intended to mislead search engines into ranking some webpages higher than they deserve. Because spamming techniques can promote a page's rank in the results of Web search engines, and a higher rank can be translated into revenue or interest, web spam targeting search engines has become increasingly widespread in today's web engineering and is now one of the greatest challenges for search engines.
Although search engines use many techniques to combat web spam, they can only remove detected spam pages from the search results returned to a user who has performed a search. Users can still come across spam pages without going through a search engine. For example, a user may reach a spam page by following a link in a previously visited webpage, an e-mail message, or a document. The user may also go directly to a website by entering a URL in a web browser. Under such circumstances, existing anti-spam technologies centered on search engines may not help a user who is not accessing the website through a search engine.