1. Field of the Invention
The present invention relates to a system and method for navigating through source content. The source content preferably takes the form of video and/or audio material.
2. Description of the Prior Art
The reviewing of source content, for example for editing or archival purposes, is traditionally a labour intensive task. Typically, the source content will be loaded into a device for reproducing the source content, for example a video recorder or a disk player, and a user will then interact with that device via an editing or archival system, which would typically be a computer-based system.
Prior to editing the source content, or preparing the source content for archival, the user typically has to spend a significant amount of time reviewing the source content to familiarise himself/herself with the material to be edited or archived (including learning what material there is, and where that material is located within the source content), and the time spent performing this initial familiarisation with source content clearly impacts on the efficiency of the editing or archival process. During this initial familiarisation step, and indeed during subsequent steps, the user may wish to move around between particular parts of the source content, and it has been found that significant time is spent in accurately and efficiently finding particular features within the source content.
Viewed from a first aspect, the present invention provides a system for navigating through source content to identify a desired feature within the source content, semantic source metadata being associated with portions of the source content, the system comprising: an input interface for receiving a request for the desired feature; a processing unit for generating from the request a search request specifying the desired feature with reference to semantic metadata, and for causing the search request to be processed to identify semantic source metadata indicative of the desired feature; and an output interface for outputting as the desired feature a representation of the portion of the source content associated with the identified semantic source metadata.
There is currently much interest in the generation of metadata, metadata being data which describes content. Indeed many standards are being developed for metadata, for example by SMPTE-EBU (Society of Motion Picture and Television Engineers-European Broadcasting Union) and by MPEG-7 (Moving Picture Experts Group, which is an ISO/IEC standards body, SC29/WG11).
A certain amount of metadata can be added automatically at source, for example good shot markers, Rec marks (indicating where recording starts/stops), GPS location, Time and Date, UMID (Unique Material Identifier), Camera settings, focus, zoom, etc. Further metadata can be manually associated with the source content after it has been created, for example Cameraman, Journalist Notes, Style comments, Suggestions, Annotations, Location/Assignment comments, Shot identification such as Intro, Finish Commentary, Voice Over, etc. In addition, there is much development in progress directed to the automatic extraction of metadata from the content, for example by using recognition tools, for example face and feature detection tools, speech recognition tools, etc., to identify features within the content, and thereby enable appropriate metadata to be added.
It will be appreciated that at least some of the above-mentioned examples of metadata provide contextual/descriptive information about the actual content of the source content, such metadata being referred to herein as "semantic metadata". Examples of semantic metadata would be shot identification information, good shot markers, an "interview question" identifier to identify the start of interview questions, and any metadata used to identify face shots, speech, etc.
The present invention takes advantage of the current proliferation in semantic metadata associated with source content (hereafter referred to as semantic source metadata) to enable more efficient and accurate navigation through source content to locate features of interest to the user.
More particularly, in accordance with the present invention, the system is arranged to receive a request for a desired feature within the source content. The desired feature may be couched in absolute terms, for example the request may be for the start of speech, for the portion of the source content marked as an introduction, etc. Alternatively, the request for the desired feature may be couched in relative terms, for example a request for the next face, next transition, etc.
The processing unit in accordance with the present invention is arranged to generate from the request a search request specifying the desired feature with reference to semantic metadata. Hence, the search request will typically identify one or more items of semantic metadata which might be used to locate the desired feature requested. For example, if the request was for a particular interview question such as "interview question 2", appropriate semantic metadata would be interview question flags used to identify the beginning of particular interview questions within the source content. Similarly, if the request is for the next face, then appropriate semantic metadata would be metadata identifying a portion of the source content as containing a face.
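By way of illustration only, the generation of a search request from a user request might be sketched as follows. The vocabulary table, the tag names and the SearchRequest structure are assumptions made for this example and are not defined by the invention; the sketch merely shows a request being resolved into an item of semantic metadata, together with any relative term or ordinal.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabulary: phrase entered by the user -> semantic metadata tag.
FEATURE_TAGS = {
    "face": "face",
    "transition": "transition",
    "speech": "speech",
    "introduction": "intro",
    "interview question": "interview_question",
}


@dataclass(frozen=True)
class SearchRequest:
    tag: str                         # semantic metadata item to search for
    direction: Optional[str] = None  # "next" or "previous" for relative requests
    ordinal: Optional[int] = None    # e.g. 2 for "interview question 2"


def build_search_request(request: str) -> SearchRequest:
    """Translate a user request into a search request on semantic metadata."""
    text = request.strip().lower()

    # A leading "next" or "previous" marks a request couched in relative terms.
    direction = None
    for word in ("next", "previous"):
        if text.startswith(word + " "):
            direction = word
            text = text[len(word):].strip()
            break

    # A trailing number selects a particular occurrence of the feature.
    ordinal = None
    match = re.fullmatch(r"(.+?)\s+(\d+)", text)
    if match:
        text, ordinal = match.group(1), int(match.group(2))

    if text not in FEATURE_TAGS:
        raise ValueError(f"no semantic metadata known for {request!r}")
    return SearchRequest(FEATURE_TAGS[text], direction, ordinal)
```

In this sketch a request outside the assumed vocabulary is rejected rather than guessed at; a practical system might instead offer the user the closest known features.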
Once the search request has been generated, the processing unit is arranged to cause the search request to be processed to identify semantic source metadata within the source content indicative of the desired feature. This will typically involve a search through the semantic source metadata to identify semantic source metadata matching that specified in the search request. Once the search request has been processed, the system of the present invention is arranged to output as the desired feature a representation of the portion of the source content associated with the semantic source metadata identified during processing of the search request.
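A minimal sketch of the processing of a search request is given below, on the assumption that each item of semantic source metadata carries a tag and the frame range of the associated portion of the source content. The MetadataItem structure and the process_search function are hypothetical names introduced for this example only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class MetadataItem:
    tag: str    # e.g. "face" or "interview_question"
    start: int  # first frame of the associated portion of source content
    end: int    # last frame of the associated portion


def process_search(items: Iterable[MetadataItem], tag: str, current: int = 0,
                   direction: Optional[str] = None,
                   ordinal: Optional[int] = None) -> Optional[MetadataItem]:
    """Return the metadata item matching the search request, or None."""
    matches = sorted((m for m in items if m.tag == tag), key=lambda m: m.start)
    if direction == "next":
        later = [m for m in matches if m.start > current]
        return later[0] if later else None
    if direction == "previous":
        earlier = [m for m in matches if m.start < current]
        return earlier[-1] if earlier else None
    if ordinal is not None:
        # Absolute request for the n-th occurrence, counted from 1.
        return matches[ordinal - 1] if 1 <= ordinal <= len(matches) else None
    # Absolute request with no ordinal: the first occurrence.
    return matches[0] if matches else None
```

The item returned identifies the portion of the source content (here, a frame range) whose representation would then be passed to the output interface.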
The semantic source metadata may be stored on a variety of different storage media, and may be stored with, or separately to, the source content. In a first embodiment of the present invention, the semantic source metadata is stored in a randomly-accessible storage medium, and the processing unit is arranged to cause the search request to be applied to the semantic source metadata as stored in the randomly-accessible storage medium. By storing the semantic source metadata in a randomly-accessible storage medium, it is possible to search through the semantic source metadata in a non-linear fashion, which assists in increasing the speed of the search process.
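The benefit of a randomly-accessible storage medium can be illustrated by the following sketch, which assumes that the semantic source metadata has been loaded into memory as (tag, frame) pairs. The MetadataIndex class is a hypothetical construct for this example; it keeps a sorted list of positions per tag so that a relative request is answered by binary search rather than by a linear pass.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import Iterable, Optional, Tuple


class MetadataIndex:
    """Per-tag sorted positions, searchable in a non-linear fashion."""

    def __init__(self, entries: Iterable[Tuple[str, int]]) -> None:
        self._by_tag = defaultdict(list)
        for tag, frame in entries:
            self._by_tag[tag].append(frame)
        for frames in self._by_tag.values():
            frames.sort()

    def next_after(self, tag: str, current: int) -> Optional[int]:
        """First position of `tag` strictly after `current`, or None."""
        frames = self._by_tag.get(tag, [])
        i = bisect_right(frames, current)
        return frames[i] if i < len(frames) else None

    def previous_before(self, tag: str, current: int) -> Optional[int]:
        """Last position of `tag` strictly before `current`, or None."""
        frames = self._by_tag.get(tag, [])
        i = bisect_left(frames, current)
        return frames[i - 1] if i > 0 else None
```

Each lookup in this sketch takes time logarithmic in the number of entries for the tag concerned, whereas metadata held only on a linear medium would have to be read through in sequence.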
In preferred embodiments, the source content comprises video material, and at least some of the semantic source metadata identify video features within the source content. In such embodiments, an iconic representation of the video features may be stored with the associated semantic source metadata, in which event the representation output by the output interface can be arranged to comprise the iconic representation associated with the semantic source metadata identified during processing of the search request. By this approach, an indication of the portion of the source content identified as a result of the request for the desired feature can be output to a user without having to scan through the source content to retrieve that portion of source content. This can significantly increase the speed of the navigation process, particularly when the source content is stored in a linear format, for example on a digital tape.
Further, in preferred embodiments, the source content comprises audio material, and at least some of the semantic source metadata identify audio features within the source content. As with video material, an iconic representation of the audio features may be stored with the associated semantic source metadata and the representation output by the output interface as a result of the processing of the search request may comprise the iconic representation associated with the identified semantic source metadata. As an example, an iconic representation of the audio may be provided by a wave form picture, which to the trained eye provides some information about the content of the audio material.
As mentioned earlier, the semantic source metadata may be stored with the source content. However, in a first embodiment, the semantic source metadata is stored separately to the source content, and is associated with the source content via time codes.
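As an illustrative sketch of association via time codes, the following assumes time codes of the form HH:MM:SS:FF, counted without drop-frame compensation at 25 frames per second. The frame rate, the record layout and the function names are all assumptions made for this example.

```python
from typing import Dict, Tuple

FPS = 25  # assumed frame rate; non-drop-frame counting is also assumed


def timecode_to_frames(timecode: str, fps: int = FPS) -> int:
    """Convert an HH:MM:SS:FF time code to an absolute frame number."""
    hh, mm, ss, ff = (int(part) for part in timecode.split(":"))
    if not (0 <= mm < 60 and 0 <= ss < 60 and 0 <= ff < fps):
        raise ValueError(f"invalid time code: {timecode!r}")
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


def frames_to_timecode(frames: int, fps: int = FPS) -> str:
    """Convert an absolute frame number back to an HH:MM:SS:FF time code."""
    seconds, ff = divmod(frames, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def portion_for(record: Dict[str, str], fps: int = FPS) -> Tuple[int, int]:
    """Resolve a separately stored metadata record to a frame range."""
    return (timecode_to_frames(record["in"], fps),
            timecode_to_frames(record["out"], fps))


# Hypothetical metadata record held apart from the source content itself.
intro = {"tag": "intro", "in": "00:00:10:00", "out": "00:00:25:12"}
```

Because the record holds only time codes, it can be searched without loading the source content; the frame range is needed only when the identified portion is to be retrieved.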
In an alternative embodiment, the semantic source metadata is stored with the source content on a storage medium, and when the processing unit causes the search request to be processed to identify semantic source metadata indicative of the desired feature, the output interface is arranged to output the associated portion of the source content as the desired feature. In this embodiment, when applying the search request to the semantic source metadata, the system will automatically scan through both the semantic source metadata and the associated source content, and accordingly will readily be able to output the associated portion of the source content to indicate the result of the search request.
In preferred embodiments, the source content to be navigated through already contains appropriate semantic source metadata to facilitate effective navigation. However, it is also possible that the system may be connected to appropriate recognition tools to enable generation of relevant semantic source metadata "on the fly". Accordingly, in one embodiment, dependent on the search request, the processing unit is arranged to access a predetermined recognition tool to cause the recognition tool to review the source content and to generate semantic source metadata required to process the search request. Accordingly, as an example, if the request received was for the next face, a face recognition tool may be activated to scan through the source content from the current location to generate semantic source metadata for the location in the source content where the next face appears. Here, it is apparent that the result of the search is identified at the same time as the semantic source metadata is generated.
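The "on the fly" case might be sketched as follows. The recognition tool is represented here by a simple callable that is applied to one frame at a time, and frames are represented by descriptive strings; both are stand-ins assumed for this example, and a real recognition tool would operate on picture or audio data. The sketch also assumes that stored metadata for a tag, where present beyond the current location, is complete.

```python
from typing import Callable, Dict, List, Optional, Sequence


def find_next(tag: str, frames: Sequence[str], current: int,
              metadata: Dict[str, List[int]],
              recognisers: Dict[str, Callable[[str], bool]]) -> Optional[int]:
    """Locate the next occurrence of `tag` after frame index `current`."""
    # Prefer semantic source metadata that already exists.
    known = sorted(f for f in metadata.get(tag, []) if f > current)
    if known:
        return known[0]

    # Otherwise fall back to a recognition tool, if one is available.
    tool = recognisers.get(tag)
    if tool is None:
        return None
    for index in range(current + 1, len(frames)):
        if tool(frames[index]):
            # The metadata is generated at the moment the result is found.
            metadata.setdefault(tag, []).append(index)
            return index
    return None
```

Metadata generated in this way is retained in the sketch, so that a later request for the same feature can be answered without running the recognition tool again.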
In a first preferred embodiment, the input interface is connectable to a drawing tool, and the request for the desired feature is entered by a user via the drawing tool. It will be appreciated that the desired features could be identified by defining appropriate gestures to be entered via the drawing tool. In an alternative embodiment, the input interface is connectable to a jog-shuttle device, and the request for the desired feature is entered by a user via the jog-shuttle device. For example, individual buttons on the jog-shuttle device may be used to identify features such as face, transition, speech, and then the jog-shuttle element can be used to specify relative information, such as "next" or "previous".
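Purely as an example of the jog-shuttle arrangement, the following sketch maps a button and a movement of the jog-shuttle element to a request. The button numbering and the sign convention for the movement are assumptions made for this example and do not correspond to any particular device.

```python
from typing import Optional

# Hypothetical assignment of jog-shuttle buttons to features.
FEATURE_BUTTONS = {1: "face", 2: "transition", 3: "speech"}


def jog_request(button: int, jog_delta: int) -> Optional[str]:
    """Form a relative request from a button and a jog-shuttle movement.

    A positive `jog_delta` is taken to mean forward movement ("next") and a
    negative one backward movement ("previous"); zero means no request.
    """
    if button not in FEATURE_BUTTONS:
        raise ValueError(f"button {button} is not assigned to a feature")
    if jog_delta == 0:
        return None
    direction = "next" if jog_delta > 0 else "previous"
    return f"{direction} {FEATURE_BUTTONS[button]}"
```

Under these assumptions, pressing the button assigned to "face" while turning the element forward yields the request "next face", which the processing unit would then convert into a search request.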
In preferred embodiments, the system for navigating through source content is embodied in an editing system. However, it will be appreciated that the system could also be used in other environments, for example archiving systems.
Viewed from a second aspect, the present invention provides a method of operating a system to navigate through source content to identify a desired feature within the source content, semantic source metadata being associated with portions of the source content, the method comprising the steps of: (a) receiving a request for the desired feature; (b) generating from the request a search request specifying the desired feature with reference to semantic metadata; (c) processing the search request to identify semantic source metadata indicative of the desired feature; and (d) outputting as the desired feature a representation of the portion of the source content associated with the identified semantic source metadata.
Viewed from a third aspect, the present invention provides a computer program for operating a system to navigate through source content to identify a desired feature within the source content, semantic source metadata being associated with portions of the source content, the computer program being configured in operation to cause the system to perform the steps of: (a) generating from a request for the desired feature a search request specifying the desired feature with reference to semantic metadata; (b) causing the search request to be processed to identify semantic source metadata indicative of the desired feature; and (c) retrieving a representation of the portion of the source content associated with the identified semantic source metadata to be output as the desired feature.