1. Field of the Invention
The present invention relates generally to identifying content provided over a network. In particular, the present invention is directed toward determining a type of content embedded within a page of the World Wide Web.
2. Description of the Related Art
Content viewed over the World Wide Web often involves more than simply plain text. Today's web surfers are able to listen to music, view movies and animations, bank online, and play games. In some instances, surfers view this content by following a link directly to the content. Perhaps more commonly, the content is embedded within a web page provided by a web server to a web client, and referenced using HTML tags. These embedding tags, such as the <embed> and <object> tags, inform the web client about the type of content that is embedded, which typically signals the web client to use a particular plug-in application to display the content. In the case of an <embed> tag, the content type is specified by a MIME type, which is typically associated at the client side with a particular application that handles that type of content. In the case of an <object> tag, a class ID is typically also provided. The class ID typically uniquely identifies a particular version of a particular application that should be used to play the object.
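By way of illustration only, the following minimal Python sketch shows how the type information declared by such embedding tags might be read out of a page. It uses the standard library `html.parser` module; the class name `EmbedTagCollector`, the function name `declared_embeds`, and the dictionary keys are hypothetical and are not drawn from any particular web client.

```python
from html.parser import HTMLParser


class EmbedTagCollector(HTMLParser):
    """Collects the type information declared on <embed> and <object> tags."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in ("embed", "object"):
            attributes = dict(attrs)
            self.found.append({
                "tag": tag,
                # <embed> usually names its content with src, <object> with data.
                "source": attributes.get("src") or attributes.get("data"),
                # MIME type declared by the page author; None if absent.
                "declared_type": attributes.get("type"),
                # Class ID, normally present only on <object> tags.
                "classid": attributes.get("classid"),
            })


def declared_embeds(html_text):
    """Return one dictionary per embedding tag found in html_text."""
    collector = EmbedTagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.found
```

Note that the sketch reports only what the page declares. It does not inspect the embedded content itself, so a declaration that is wrong is returned unchanged.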
Because the plug-in application is chosen according to the tag, the plug-in specified may not be the plug-in most appropriate for the content to be viewed. This might happen, for example, because of programmer error or because content revisions are not correctly propagated to all documents. Under these circumstances, the content is not viewable, and the end user is typically presented with an error message or undecipherable characters.
Conventional methods exist for determining the content type of a web page returned by the server. For example, in Microsoft's Internet Explorer, MIME type determination occurs through a FindMimeFromData method that contains hard-coded tests for a variety of MIME types. The method scans through the buffer contents and identifies a MIME type that is either known, unknown, or ambiguous. Although the method can be used to determine the content type of a whole page, it does not address the problem of identifying the type of content embedded within a page.
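The general idea of testing buffer contents against hard-coded signatures can be sketched as follows. This is a simplified illustration in Python and does not reproduce the actual logic of FindMimeFromData; the table `SIGNATURES` and the function `sniff_mime` are hypothetical names, and the sketch covers only a handful of well-known leading byte sequences.

```python
# Leading byte sequences ("magic numbers") for a few well-known formats.
# Illustrative table only; a real client tests many more types.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def sniff_mime(buffer):
    """Return a MIME type inferred from the first bytes of buffer.

    Returns None when no signature matches, i.e. the type is unknown
    to this sketch.
    """
    for magic, mime_type in SIGNATURES:
        if buffer.startswith(magic):
            return mime_type
    return None
```

For example, `sniff_mime(b"GIF89a...")` yields `"image/gif"`, while a buffer of ordinary text yields `None`. The sketch is applied to a single buffer, which mirrors the limitation noted above: it says nothing about content that a page merely references through an embedding tag.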
Accordingly, there is a need for a system and method for more reliably identifying types of content received over the World Wide Web.