1. Field of Invention
This invention relates to real-time storyboarding using a graphical user interface to automatically parse a video data signal and browse within the parsed video data signal. Specifically, this invention is directed toward systems and methods that generate a real-time storyboard on a distributed network, such as the World Wide Web (WWW), and a graphical user interface tool for fast video analysis of both compressed and uncompressed video images for automatic parsing and browsing.
2. Description of Related Art
A “document” is no longer merely a conventional paper product. Rather, a “document” now encompasses electronic multimedia files which can include audio, video and animations, in addition to text and images. Nevertheless, people still prefer to print or have a hard copy of the multimedia document for various reasons, including portability and ease of reading. For space-dependent information, such as text and images, printing is easy.
Video is becoming an important element in many applications, such as multimedia, news broadcasting, video conferencing and education. A plethora of scholars, including political scientists, physicians and historians, study video or multimedia documents as a primary source of educational or research material. By using traditional techniques, such as video recorders, one is able to view the material of interest, or fast forward and/or rewind to sections deemed important. However, since video content is generally extremely vague, multimedia and video cannot be handled as efficiently as text. For example, most multimedia and video application systems rely on interactive user input to compile the necessary representative static data.
However, to easily scan the content of a document containing audio/video or animations, or print portions of the document containing audio/video or animations, the dynamic information must first be converted into a static counterpart. By performing a real-time dynamic-to-static conversion on the video or multimedia document, the methods and systems of this invention enable printing and/or viewing through a distributed network, such as the World Wide Web (WWW), whether or not the original source contains command information pertaining to the significant or representative frames of the document. The command information, which is embedded during production, specifically indicates that one or more frames are representative of a particular segment of the document.
In one example, a corporation desires to show a video to its employees that contains the chief executive officer's report of the previous quarter, questions and answers and some of the company's new products. Traditionally, this is achieved by collocating the employees in a conference room and showing them the video, or performing a multicast throughout the company. Another way to show the report would be to convert the video into a format which can be displayed as a video on an intranet or the Internet, such as in a web page, thereby allowing employees to view it at their discretion. However, this would require tremendous bandwidth and storage capabilities.
Alternatively, by processing the video or multimedia document, the systems and methods of this invention summarize the original video, i.e., the dynamic information, by placing representative static images, and if appropriate, associated text, into a web document for viewing. This overcomes the storage and bandwidth problems previously mentioned and also solves the problem of scanning or printing a dynamic document. Since the dynamic media is converted into static media before being presented, the static media can then be printed during a presentation using commonly used and known techniques.
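As a purely illustrative sketch of the dynamic-to-static idea described above, the following Python fragment assembles representative still images and their associated text into a single web document. The function name build_storyboard_html, the (image path, caption) pair format and the HTML layout are assumptions made for this example only; the text above does not specify how the web document is constructed.

```python
from html import escape
from typing import Iterable, Tuple


def build_storyboard_html(title: str, entries: Iterable[Tuple[str, str]]) -> str:
    """Return a static HTML page with one figure per representative image.

    `entries` is an assumed input format: (image_path, caption) pairs,
    one pair per representative frame of the original video.
    """
    figures = []
    for image_path, caption in entries:
        # Escape all user-supplied text so captions cannot break the markup.
        figures.append(
            '<figure><img src="{}" alt="{}"><figcaption>{}</figcaption></figure>'.format(
                escape(image_path, quote=True),
                escape(caption, quote=True),
                escape(caption),
            )
        )
    return (
        "<!DOCTYPE html>\n"
        "<html><head><title>{}</title></head>\n"
        "<body>\n{}\n</body></html>"
    ).format(escape(title), "\n".join(figures))


if __name__ == "__main__":
    # Hypothetical file names and captions, for illustration only.
    page = build_storyboard_html(
        "Quarterly report",
        [("frame0.png", "Report of the previous quarter"),
         ("frame1.png", "Questions and answers")],
    )
    print(page)
```

Because the resulting page holds only still images and text, it can be stored, transmitted and printed like any other static web document.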
Once a video or multimedia document has been disassembled into key frames and placed on a distributed network or into a web document, a user is able to further browse the details of each segment represented by the key frame.
This invention provides systems and methods for real-time storyboarding on a distributed network.
This invention separately provides a graphical user interface that allows both automatic parsing and browsing of video sequences from the key frames.
This invention separately provides methods and systems for automatic video parsing of a video and/or for browsing through the video using a graphical user interface.
This invention separately provides for real-time dynamic-to-static conversion of video documents.
This invention also provides systems and methods that allow for printing and/or viewing static documents through a distributed network, such as the World Wide Web, when the original source is a video or multimedia document.
This invention separately provides systems and methods that reduce the dependency on humans to create visual aids representing meaningful segments of a video or multimedia document.
This invention separately provides systems and methods that eliminate required interactive components for translating a parsed incoming video data signal into meaningful segments.
By using statistical methods based on frame and histogram differencing, key frames can be extracted. The extracted key frames associated with each segment can then be used for fast browsing or for retrieving the actual video or multimedia clip represented by that key frame. For example, a first image, e.g., captured frame, of a segment could be shown. Through a graphical user interface, the user could elect to play the remainder of the segment, or skip forward to the next significant, or key, frame.
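The key frame extraction and browsing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the specific statistical method of the invention: frames are assumed to be flattened sequences of 8-bit grayscale pixel values, a frame is treated as a key frame when its normalized histogram differs from that of the preceding frame by more than a fixed threshold, and the names histogram, histogram_difference, extract_key_frames and next_key_frame, as well as the bin count and threshold values, are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence

# Assumed representation: a frame is a flat sequence of 8-bit grayscale pixels.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return the normalized intensity histogram of a frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame) or 1  # avoid division by zero for an empty frame
    return [count / total for count in counts]


def histogram_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Half the L1 distance between two normalized histograms (0.0 to 1.0)."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def extract_key_frames(frames: Sequence[Frame],
                       threshold: float = 0.4,
                       bins: int = 16) -> List[int]:
    """Return indices of frames that start a new segment.

    The first frame is always a key frame; each later frame is compared
    with its immediate predecessor (an assumption of this sketch).
    """
    key_frames: List[int] = []
    previous = None
    for index, frame in enumerate(frames):
        current = histogram(frame, bins)
        if previous is None or histogram_difference(previous, current) > threshold:
            key_frames.append(index)
        previous = current
    return key_frames


def next_key_frame(key_frames: Sequence[int], current_index: int) -> Optional[int]:
    """Return the first key frame after current_index, or None if there is none."""
    position = bisect_right(key_frames, current_index)
    return key_frames[position] if position < len(key_frames) else None


if __name__ == "__main__":
    # Three dark frames followed by two bright frames: one abrupt change.
    frames = [[10] * 64] * 3 + [[200] * 64] * 2
    keys = extract_key_frames(frames)
    print(keys)
    print(next_key_frame(keys, 1))
```

In this sketch, a graphical user interface could display the frame at each returned index and call next_key_frame when the user elects to skip forward rather than play the remainder of the segment.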