This disclosure generally relates to advertisements presented by an online system, and particularly to identifying malicious content in advertisements that may be presented by the online system.
An online system allows its users to connect to and interact with other online system users and with objects on the online system. The online system may also present advertisements to its users. Presenting advertisements allows the online system to obtain revenue from advertisers, while allowing the advertisers to present advertisements for products or services to online system users.
However, certain advertisements provided to an online system for presentation may include malicious content, inserted by the advertiser or by another entity. To protect its users, an online system often uses one or more methods to identify advertisements including malicious or potentially malicious content and to prevent the identified advertisements from being presented to online system users. Conventional methods for identifying malicious content in an advertisement entail either manually reviewing the advertisement's content to determine if it includes malicious text or content, or analyzing the advertisement's content using one or more automated systems that identify misspellings or grammatical errors in the advertisement's text. However, reviewing large volumes of advertisements using conventional methods may be cumbersome and inefficient. Further, malicious advertisers have developed methods for circumventing conventional automated systems by using characters from different Unicode blocks or ranges to generate grammatically correct text in an advertisement.
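As an illustration of the circumvention described above, the following minimal Python sketch shows how one character drawn from a different Unicode block can produce a word that looks correct to a reader yet no longer matches its plain Latin spelling. It is not the method of this disclosure. The function names scripts_in_word and mixed_script_words are hypothetical, and using the first word of each character's Unicode name as a script label is a simplifying assumption, not a full implementation of the Unicode script property.

```python
import unicodedata


def scripts_in_word(word):
    """Return rough script labels (e.g. 'LATIN', 'CYRILLIC') for the letters in a word.

    Assumption: the first word of a character's Unicode name is used as a
    stand-in for its script, which is only an approximation.
    """
    labels = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                labels.add(name.split()[0])
    return labels


def mixed_script_words(text):
    """Return the whitespace-separated words whose letters span more than one label."""
    return [word for word in text.split() if len(scripts_in_word(word)) > 1]


# Hypothetical ad text: the second letter of "payment" is CYRILLIC SMALL LETTER A (U+0430).
spoofed = "Free p\u0430yment bonus"
genuine = "Free payment bonus"

# The two strings may render identically, yet they are different code point sequences,
# so a check keyed to the Latin spelling would not match the spoofed word.
strings_differ = spoofed != genuine
flagged = mixed_script_words(spoofed)
```

Under these assumptions, a word that mixes script labels is treated as a signal for further review. Legitimate text can also mix scripts, so the sketch would produce false positives in practice.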