1. Field of the Invention
The present invention generally relates to the field of accessing and processing digital video on a network such as the Internet. More particularly, the invention relates to innovative techniques to solve the problem of finding video content on the Internet.
2. Description of the Related Technology
A number of techniques have evolved in recent years as the Internet has grown in size and sophistication, including:
The use of web servers and HTML delivery to web browsers.
The use of the application-server model for connecting database information with web pages and interactive interfaces for end users.
The use of dynamically generated HTML that pulls information from a database to dynamically format HTML for delivery to the end user.
The use of a template language to merge database output with pre-formatted HTML presentations.
The use of ‘cookies’ to track individual user preferences as they interact with the web pages and applications.
These and other related web technologies and techniques are in commonplace use and readily accessible on the Internet.
In addition to the technologies described above, video indexing technology has also emerged, herein referred to as ‘video logging’. Video logging is a process that incorporates both automated indexing and manual annotation facilities to create a rich, fine-grained (in a temporal sense) index into a body of video content. The index typically consists of a combination of visual and textual indices that permit time-based searching of video content. The index may incorporate spoken text, speaker identifications, facial identifications, on-screen text, and additional annotations, keywords, and descriptions that may be applied by a human user executing the video logging application. The Virage VideoLogger® is one example of this type of video logging technology that is commercially available.
The delivery of coded media on the Internet requires the encoding of video content into one or more coded video formats and efficient delivery of that content to the end users. Common coding formats presently in use include RealVideo, Microsoft Windows Media, QuickTime, and MPEG. The video logging technology may help to orchestrate the encoding of one or more of these formats while the video is being indexed to ensure that the video index is time-synchronized with the encoded content. The final delivery of coded media content to an end user is typically accomplished with a wide variety of video serving mechanisms and infrastructure. These mechanisms may include basic video servers (such as those from Real, Microsoft, and Apple), caching appliances (such as those from CacheFlow, Network Appliance, Inktomi, and Cisco), and content delivery networks (herein “CDNs”, such as those from Akamai, Digital Island, iBeam, and Adero). These types of video serving mechanisms deliver media content to the end user.
Coded media such as video, Flash™, SMIL, and similar formats (collectively referred to as ‘video’) is available on the World Wide Web in large quantities. Video content is available ‘on demand’ from archives, and is ‘webcast’ in a live manner similar to broadcasts. While there are some efforts to provide a “TV Guide” for live webcast video (such as Yack and ChannelSeek), there are unfortunately very few indexes of archived video content. The only ones that exist are highly localized (they only index one site). End users have no central search and access mechanism like those that exist for web-based text content using traditional search engines. Moreover, the content is rapidly changing and growing, and this makes it impossible for individuals to remain abreast of the content available at any given time.
What would be desired is the ability to automatically discover and index video content existing on web pages. This discovery and indexing process is called ‘web crawling’ or ‘spidering’. The fundamental concept of spidering is to traverse a set of hyperlinked documents (web pages) by following the hyperlinks from one page to the next. Existing spidering technologies are intended to generate an index of the text content found on the pages by parsing the HTML. However, web pages contain many forms of content other than text. They also contain rich media such as images, video, and animated graphics (e.g., SMIL, Flash or Shockwave presentations). These types of content are embedded in HTML statements or sophisticated blocks of scripting language (such as JavaScript or VBScript). Existing spiders identify these types of content and skip over them. It would be advantageous to locate and identify rich content in order to index it.
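The traversal described above may be illustrated with a minimal sketch in Python. It is not the disclosed implementation. The extension list, the names `LinkExtractor`, `is_media`, and `crawl`, and the in-memory `PAGES` table standing in for network fetches are all assumptions made for illustration. The sketch follows hyperlinks breadth-first and, rather than skipping rich media references, records them.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import posixpath

# Assumed, illustrative set of extensions treated as video or container files.
MEDIA_EXTENSIONS = {".rm", ".ram", ".asf", ".asx", ".wmv", ".mov",
                    ".mpg", ".mpeg", ".smil", ".swf"}


class LinkExtractor(HTMLParser):
    """Collects href/src attribute values from every tag on a page."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.append(value)


def is_media(url):
    """True when the URL path ends in one of the assumed media extensions."""
    ext = posixpath.splitext(urlparse(url).path)[1].lower()
    return ext in MEDIA_EXTENSIONS


def crawl(start_url, fetch):
    """Breadth-first traversal; fetch(url) returns HTML text or None."""
    seen, media = {start_url}, []
    queue = deque([start_url])
    while queue:
        page_url = queue.popleft()
        html = fetch(page_url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for raw in parser.urls:
            url = urljoin(page_url, raw)  # resolve relative links
            if url in seen:
                continue
            seen.add(url)
            if is_media(url):
                media.append(url)   # record rich media instead of skipping it
            else:
                queue.append(url)   # ordinary page: keep following links
    return media


# Hypothetical pages standing in for real network fetches.
PAGES = {
    "http://example.com/": '<a href="/news.html">News</a> <a href="clip.mov">Clip</a>',
    "http://example.com/news.html": '<embed src="/media/story.asx"> <a href="/">Home</a>',
}

found = crawl("http://example.com/", PAGES.get)
print(found)
```

A sketch of this kind only sees URLs that appear literally in `href` or `src` attributes; references produced by scripts are outside its reach.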
Identifying a video URL for indexing may be fairly easy in some cases if the video content is a simple file linked in a basic HTML “HREF” statement. However, most video content is exposed on web pages in a more complex manner using scripting languages and meta-container files (like “.asx” and “.ram”) to make the presentation of the video interactive, to specify a play-list of individual videos, or to offer multiple choices of bit-rates or formats. Thus, the URL for the content is not explicit, but must be evaluated by executing the scripting language or parsing the container file, much as a web browser application would. Even then, it is necessary to identify the multiple versions of a piece of content so that it is only indexed one time. Thus, it would be desirable to parse out blocks of script and execute them, and also to use the context of the script, video URLs, and surrounding HTML to group versions (varying by bit-rate and/or coding format) of the same content together.
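The container-file parsing and version grouping described above may be sketched as follows in Python. This is an illustrative sketch under stated assumptions, not the disclosed method. The function names, the sample file contents, and in particular the grouping heuristic (stripping a trailing bit-rate token such as “_56k” and the file extension from the file name) are hypothetical. The “.asx” sample is assumed to carry its URLs in `REF` elements with `HREF` attributes, and the “.ram” sample is assumed to list one URL per line with “#” marking a comment. Script execution is not covered.

```python
import posixpath
import re
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse


class AsxRefParser(HTMLParser):
    """Lenient, case-insensitive reader for REF HREF entries in an .asx file."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "ref":
            href = dict(attrs).get("href")
            if href:
                self.refs.append(href)


def parse_asx(text):
    parser = AsxRefParser()
    parser.feed(text)
    return parser.refs


def parse_ram(text):
    """Assumed .ram layout: one URL per line, '#' lines are comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def content_key(url):
    """Hypothetical heuristic: file name minus extension and bit-rate token."""
    stem = posixpath.splitext(posixpath.basename(urlparse(url).path))[0]
    return re.sub(r"[_-]\d+k(?:bps)?$", "", stem.lower())


def group_versions(urls):
    """Group URLs that appear to be versions of the same content."""
    groups = defaultdict(list)
    for url in urls:
        groups[content_key(url)].append(url)
    return dict(groups)


# Hypothetical container files.
ASX = ('<ASX VERSION="3.0">'
       '<ENTRY><REF HREF="mms://media.example.com/launch_56k.asf"/></ENTRY>'
       '<ENTRY><REF HREF="mms://media.example.com/launch_300k.asf"/></ENTRY>'
       '</ASX>')
RAM = ("# comment line\n"
       "rtsp://media.example.com/launch_56k.rm\n"
       "\n"
       "rtsp://media.example.com/interview.rm\n")

groups = group_versions(parse_asx(ASX) + parse_ram(RAM))
for key, versions in groups.items():
    print(key, versions)
```

In this sketch the three “launch” URLs fall into one group, so the content would be indexed once despite differing in bit-rate and coding format. A file-name heuristic alone is fragile; the surrounding HTML and script context mentioned above would be needed to group versions whose names share no common stem.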