Electronic publishing, together with the provision of access to content, has been one of the driving forces behind the explosive growth of the Internet. Two examples of such electronic publishing and data access are Internet-based commerce listings (e.g., classified advertisements, online auctions), which allow users to publish information regarding products and services for sale, and web-based e-mail (e.g., HOTMAIL™ and YAHOO! MAIL), which allows people to send electronic communications to other users.
In order to increase the richness of the presentation of information accessible and communicated via the Internet, a number of descriptor languages have emerged to support the authoring of content. The most prominent of these are the so-called markup languages (e.g., HyperText Markup Language (HTML) and eXtensible Markup Language (XML)). These markup languages allow active content to be included within published content or communicated data that is rendered by a browser.
While active content has the potential to enrich the Internet experience, it also presents a number of security problems and vulnerabilities. For example, unscrupulous and malicious users are able to include malicious data (e.g., content) within active content of a web page. Such malicious data may, for example, take the form of a virus that infects the computer system on which a user renders the web page, or of code that harvests private user information. Combating “malicious” data presents significant technical challenges to the operators of web-based services. For example, a web-based e-mail service provider may be challenged to exclude malicious data from e-mail communications. Similarly, the operator of a web-based commerce system may be challenged to ensure that listings available from the commerce service provider's web site do not contain malicious data. The technical challenges increase as the volume of communications processed by a particular web site increases.