1. Field of the Invention
The present invention relates generally to digital capture devices, and more particularly, to digital video encoders and other media capture devices.
2. Description of the Related Technology
Today's broadcast corporations, advertising agencies, consumer products and services companies, and other businesses have demanding media asset management needs. These organizations have been simultaneously empowered by the growth in tools and infrastructure for creating, storing and transporting media-rich files and challenged by the problem of managing the media assets that they have amassed and come to rely upon for their core businesses. The sheer volume of information available over the World Wide Web and corporate networks continues to grow at an accelerating pace. Because media assets are so crucial to these companies, they have a pressing need for an intelligent and efficient way to catalog, browse, search and manage their media assets. Prior attempts at a content management solution have yielded point solutions or proprietary applications. These applications have not leveraged the technologies already deployed by many organizations, such as industry-standard browsers and Web servers.
A system is needed that would automatically watch, listen to and read a video stream so as to intelligently extract information, termed metadata, about the content of the video stream in real time. This information would become the foundation of a rich, frame-accurate index that would provide immediate, non-linear access to any segment of the video. Such a logging process would result in the transformation of an opaque video tape or file, with little more than a label or file name to describe it, into a highly leverageable asset available to an entire organization via the Internet. What was once a time-consuming process to find the right piece of footage would be performed instantly and effortlessly by groups of users wishing to quickly and efficiently deploy video across a range of business processes. Television and film production, Web publishing, distance learning, media asset management and corporate communications would all benefit from such technology.
The distinction between still devices and motion devices is becoming blurred as many of these devices can perform both functions, or combine audio capture with still image capture. The capture of digital content is expanding rapidly due to the proliferation of digital still cameras, digital video cameras, and digital television broadcasts. Users of this equipment generally also use digital production and authoring equipment. Storing, retrieving, and manipulating the digital content represent a significant problem in these environments. The use of various forms of metadata (data about the digital content) has emerged as a way to organize the digital content in databases and other storage means such that a specific piece of content may be easily found and used.
Digital media asset management systems (DMMSs) from several vendors are being used to perform the storage and management function in digital production environments. Examples include Cinebase, WebWare, EDS/MediaVault, Thomson Teams, and others. Each of these systems exploits metadata to allow constrained searches for specific digital content. The metadata is generated during a logging process when the digital content is entered into the DMMS. Metadata generally falls into two broad categories:
    Collateral metadata: information such as date, time, camera properties, and user labels or annotations;
    Content-based metadata: information extracted automatically by analyzing the audiovisual signal and extracting properties from it, such as keyframes, speech-to-text, speaker ID, visual properties, face identification/recognition, and optical character recognition (OCR).
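The two metadata categories described above can be illustrated with a minimal sketch of a record layout and a constrained search. Every class, field, and function name here is hypothetical and chosen only for illustration; nothing in it reflects the schema of any DMMS product named above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CollateralMetadata:
    """Information supplied alongside the content (hypothetical fields)."""
    date: str
    time: str
    camera_properties: Dict[str, str] = field(default_factory=dict)
    annotations: List[str] = field(default_factory=list)


@dataclass
class ContentMetadata:
    """Information extracted from the audiovisual signal (hypothetical fields)."""
    keyframe_timecodes: List[str] = field(default_factory=list)
    transcript: str = ""            # e.g. speech-to-text output
    speaker_ids: List[str] = field(default_factory=list)
    ocr_text: List[str] = field(default_factory=list)


@dataclass
class AssetRecord:
    """One logged piece of digital content with both kinds of metadata."""
    asset_id: str
    collateral: CollateralMetadata
    content: ContentMetadata


def constrained_search(records: List[AssetRecord],
                       date: Optional[str] = None,
                       spoken_word: Optional[str] = None) -> List[str]:
    """Return IDs of assets that satisfy every constraint that was supplied."""
    results = []
    for record in records:
        # Collateral constraint: exact match on the recording date.
        if date is not None and record.collateral.date != date:
            continue
        # Content-based constraint: case-insensitive match in the transcript.
        if (spoken_word is not None
                and spoken_word.lower() not in record.content.transcript.lower()):
            continue
        results.append(record.asset_id)
    return results
```

A search may combine constraints from both categories, for example restricting by recording date and by a word found in the transcript; an asset is returned only if it satisfies all of the constraints given.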
Products such as the Virage VideoLogger perform the capture and logging of both of these types of metadata. The VideoLogger interfaces with the DMMS to provide the metadata to the storage system for later use in search and retrieval operations. These types of systems can operate with digital or analog sources of audiovisual content.
Digital encoding devices convert an analog video signal into a digital representation of the video, usually encoded with a compression algorithm. The analog signal can be any of NTSC, S-Video, PAL, or analog component video (Y, B-Y, R-Y). Typical encoders support one or more of these analog standards for input. Analog signals such as these are generated by tape decks, laser disc players, satellite feeds, and analog cameras.
Examples of commercial digital encoding devices on the market today include:
    MPEG encoders from Optibase, Minerva, Telemedia, Lucent, Innovacom, and others;
    H.263 encoders for teleconferencing from vendors such as Picture-Tel;
    RealVideo encoders from Real Networks operating in conjunction with video capture boards from Osprey (a ViewCast Corp. division);
    Microsoft NetShow encoders operating in conjunction with a variety of basic video capture boards supported by the Windows platform.
Digital video encoders fall into two broad categories from an architectural standpoint:
    1. Pure hardware encoders, typified by MPEG products based on MPEG chipsets, such as those from C-Cubed, Inc.
    2. Combination hardware and software systems using basic capture hardware to digitize the analog signal, and then a software process to encode and compress the digital information. This is the typical architecture used by RealVideo and NetShow encoders. The software compression may be microcoded on a programmable capture board (such as the Osprey-1000) or may be running on a general purpose CPU (such as NetShow compression running on a Windows workstation).
For both types of digital video encoders, what is desired is the ability to extend the computations being performed to include metadata capture.
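As a rough illustration of what extending the encoder's computations could look like in the software-encoding case, the sketch below runs metadata analyzers over each digitized frame in the same loop that compresses it. This is an assumption-laden toy, not a description of any product named above: the function names are hypothetical, frames are modeled as raw byte strings, `zlib.compress` stands in for a real video codec, and the "mean level" analyzer stands in for real content analysis such as keyframe detection.

```python
import zlib
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Frame = bytes  # assumption: one digitized frame as raw sample bytes


def mean_level(frame: Frame) -> Optional[float]:
    """Toy analyzer: average sample value of a frame, or None if empty."""
    if not frame:
        return None
    return sum(frame) / len(frame)


def encode_with_metadata(
    frames: Iterable[Frame],
    compress: Callable[[Frame], bytes],
    analyzers: Dict[str, Callable[[Frame], Optional[float]]],
) -> Tuple[List[bytes], List[Tuple[int, str, float]]]:
    """Compress each frame and, in the same pass, collect per-frame metadata.

    Returns the encoded frames and a list of (frame_index, analyzer_name,
    value) entries, so that each metadata value stays tied to its frame.
    """
    encoded: List[bytes] = []
    metadata: List[Tuple[int, str, float]] = []
    for index, frame in enumerate(frames):
        encoded.append(compress(frame))          # the existing encode step
        for name, analyze in analyzers.items():  # the added metadata step
            value = analyze(frame)
            if value is not None:
                metadata.append((index, name, value))
    return encoded, metadata


if __name__ == "__main__":
    frames = [bytes([0, 0, 0, 0]), bytes([10, 20, 30, 40])]
    encoded, metadata = encode_with_metadata(
        frames, zlib.compress, {"mean_level": mean_level}
    )
    print(len(encoded), metadata)
```

Keying each metadata entry by frame index is one simple way to keep the extracted information aligned with the encoded stream; a hardware encoder would need a different integration point, which this sketch does not attempt to model.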