1. Field of Invention
This invention relates to a method and apparatus for reviewing an aural and/or visual and/or other representation of a media file. Specifically, the invention relates to using media content features to allow a user to more easily review a media file.
2. Description of Related Art
Text documents often have many cues, such as headings, paragraphs, punctuation, etc., that allow a reader to quickly determine the beginning and end of different sections of the document and to aid the reader in finding areas of interest. However, video and audio browsing systems typically do not provide information to the user regarding simple features, like section beginning and end points, much less more complicated information, like the name of the speaker on a video clip. Such browsing systems typically offer only standard “VCR-type” playback control options, like play, stop, rewind, and fast forward. As anyone who has tried to find a specific video clip on a conventional video tape using a standard VCR will understand, it is often difficult to locate portions of interest in a video using the standard playback controls.
Many techniques exist for extracting information that represents the feature content of a media file. In this application, the term media or media file is used to represent any data stream that contains information regarding video or other image information, audio information, text information and/or other information. A feature of a media file is a property of the video, audio and/or text information in the media file, such as video or audio format, or information relating to the content of the media file, such as the identity of a speaker depicted in a video sequence, occurrences of applause, video shot boundaries, or motion depicted in a video sequence. For example, Pfeiffer et al., “Automatic Audio Content Analysis,” ACM MULTIMEDIA 96, Boston, MA, 1996, pp. 21-30; Wilcox et al., “Segmentation of Speech Using Speaker Identification,” Proc. ICASSP 94, vol. S1, Apr. 1994, pp. 161-164; and Foote, “Rapid Speaker ID Using Discrete MMI Feature Quantisation,” Expert Systems with Applications, vol. 13, no. 4, 1997, pp. 293-289, describe various methods for identifying audio features, such as music, human speech, and speaker identity. Regarding video data, Boreczky et al., “Comparison of Video Shot Boundary Detection Techniques,” Proc. SPIE Conf. On Storage and Retrieval for Still Image and Video Databases IV, San Jose, CA, vol. 2670, Feb. 1996, pp. 170-179, and Zhang et al., “Automatic Partitioning of Full-motion Video,” Multimedia Systems, vol. 1, no. 1, 1993, pp. 10-28, disclose methods for identifying shot boundaries (radical changes in video content) and motion. As described in these and other similar references, features in a media file can be identified automatically using any of a number of different techniques.
Providing feature information in a media browsing system can be very useful for a user when identifying areas of interest in a media file, controlling media playback, editing a media file, or performing other operations with a media file. For example, graphically identifying areas in a media file where a particular speaker is shown on a video clip can allow a user to quickly locate and play back those portions that contain the speaker.
Providing feature information to the user based on automatically identified features also eliminates the need for a user to manually index or otherwise mark significant portions of the media file for later retrieval. Thus, the invention can use existing methods for automatically identifying features in a media file to generate and provide feature information to a user to aid the user in browsing the media file.
The invention provides a media browser that uses media feature information as an aid in navigating, selecting, editing, and/or annotating a media file.
In one aspect of the invention, media features are selected by a user.
In one aspect of the invention, media browsing functions, such as play, rewind, stop, fast-forward, index, automatic slide show, and automatic preview, are controlled based on feature information.
In one aspect of the invention, feature information for a selected feature is mapped to a corresponding confidence score.
In one aspect of the invention, the media browser includes a feature indicator that provides information related to a corresponding selected feature based on a corresponding confidence score.
In one aspect of the invention, a feature indicator combines at least two confidence scores and provides information based on the combination.
In one aspect of the invention, a feature indicator provides information related to a confidence score based on a value of another confidence score.
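By way of illustration only, the two preceding aspects (combining at least two confidence scores, and providing one confidence score based on the value of another) might be sketched as follows. The specification does not prescribe a combination rule; the pointwise minimum, the gating threshold of 0.5, the function names `combine_scores` and `gate_scores`, and the sample score values are all assumptions introduced for this example.

```python
from typing import List, Sequence


def combine_scores(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Combine two confidence tracks sample by sample.

    Assumption: the pointwise minimum is used, so the combined score is
    high only where both features are likely present.
    """
    if len(a) != len(b):
        raise ValueError("confidence tracks must have the same length")
    return [min(x, y) for x, y in zip(a, b)]


def gate_scores(shown: Sequence[float], gate: Sequence[float],
                threshold: float = 0.5) -> List[float]:
    """Pass one confidence track through only where another track
    is at or above a threshold; elsewhere report 0.0.

    Assumption: the threshold value and the 0.0 fallback are illustrative.
    """
    if len(shown) != len(gate):
        raise ValueError("confidence tracks must have the same length")
    return [s if g >= threshold else 0.0 for s, g in zip(shown, gate)]


# Hypothetical per-second confidence scores for two features.
speaker = [0.9, 0.8, 0.2, 0.7]   # e.g., a particular speaker is present
speech = [0.6, 0.9, 0.9, 0.1]    # e.g., human speech is present

combined = combine_scores(speaker, speech)
gated = gate_scores(speaker, speech)
```

Other combinations, such as a product or a weighted sum, would serve equally well for the purpose of this sketch.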
The invention also provides a method for browsing a media file. A feature of the media file being browsed is selected and information related to a confidence score for at least one selected feature is provided. The confidence score relates to the existence of a corresponding selected feature in the media file. Based on the information related to the confidence score, a portion of the media file is selected.
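As a non-limiting sketch of the selection step in the method above, contiguous portions of the media file whose confidence score meets a threshold could be extracted as shown below. The threshold test, the fixed sampling interval `step_s`, the function name `select_portions`, and the sample scores are assumptions made for illustration and are not taken from the specification.

```python
from typing import List, Sequence, Tuple


def select_portions(scores: Sequence[float], threshold: float,
                    step_s: float = 1.0) -> List[Tuple[float, float]]:
    """Return (start, end) times, in seconds, of each contiguous run of
    samples whose confidence score is at or above the threshold.

    Assumption: scores are sampled at a fixed interval of step_s seconds.
    """
    portions: List[Tuple[float, float]] = []
    start = None  # index where the current run began, if any
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i                      # a run begins
        elif score < threshold and start is not None:
            portions.append((start * step_s, i * step_s))
            start = None                   # the run ends
    if start is not None:                  # a run reaches the end of the file
        portions.append((start * step_s, len(scores) * step_s))
    return portions


# Hypothetical confidence scores for one selected feature.
scores = [0.1, 0.7, 0.9, 0.3, 0.8, 0.8]
portions = select_portions(scores, threshold=0.5)
```

A browser could then, for example, restrict playback to the returned portions or use their start times as index points.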
In one aspect of the invention, a metadata value representing a time-wise evaluation of a feature in the media file is mapped to a corresponding confidence score.
In one aspect of the invention, mapping of a metadata value to a corresponding confidence score is non-linear.
In one aspect of the invention, mapping of a metadata value to a confidence score is dependent on a user-defined control value or values.
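One possible form of a non-linear mapping that depends on user-defined control values is sketched below. The specification does not state which function is used; the logistic (sigmoid) curve, the control values `midpoint` and `steepness`, and the function name `map_to_confidence` are assumptions chosen only to make the idea concrete.

```python
import math


def map_to_confidence(value: float, midpoint: float = 0.5,
                      steepness: float = 10.0) -> float:
    """Map a metadata value to a confidence score in the range (0, 1).

    Assumption: a logistic curve is used. 'midpoint' is the metadata value
    that maps to a confidence of 0.5, and 'steepness' controls how sharply
    the score rises around it. Both stand in for user-defined controls.
    """
    z = steepness * (value - midpoint)
    # Evaluate in a numerically stable way for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical metadata values sampled over time for one feature.
metadata = [0.1, 0.5, 0.9]
confidence = [map_to_confidence(v) for v in metadata]

# Raising the midpoint makes the same metadata value less convincing.
stricter = map_to_confidence(0.5, midpoint=0.8)
```

Under these assumptions, a user adjusting `midpoint` or `steepness` changes how readily a given metadata value is treated as evidence that the feature is present.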