1. Field of Invention
This invention relates to information systems. More particularly, the invention relates to multimedia search and indexing systems for automatic event selection for replay using audio cues and signal threshold levels.
2. Description of Prior Art
In managing intellectual property assets for maximum return, it is common in the media industry to re-purpose assets, particularly video and sound recording assets, in whole or in part, into other products. An example of a re-purposed asset would be a video recording of a sporting event shown on television, a portion of which is later included in a commercial, with multiple clips used for news or highlight recaps of the event as well as in a CD-ROM game. Given the need to maximize asset return, the content owner is faced with the problem of finding the desired sections of video or audio materials within a given asset or assets. This is the case whether the asset is stored in a computer system or on traditional analog media such as magnetic tape or film. The state of the art for identifying events for re-purposing is automatic scene change detection. This technology identifies the first frame of a scene that is dramatically different from the preceding scene. However, changes of scene may not be well correlated with the section of media that is desired for re-purposing. For example, in a fast-moving game like hockey, events such as a goal scored or missed, or a key player returning to the ice, may not constitute a change of scene.
What is needed is a mechanism for automating the selection of scenes of interest in an event in one context for re-purposing in another context, in which the selected events correlate with the scenes, sounds and context of another media product.
Prior art related to re-purposing intellectual property includes the following:
U.S. Pat. No. 5,713,021 issued Jan. 18, 1998 and filed Sep. 14, 1995, discloses a multimedia system which facilitates searching for a portion of sequential data. The system displays neighboring data depending on a requirement when displaying the portion of the data. A view object management means searches view objects stored in a view object storage means depending on a specification of features of a portion of that data. A display/reproduction means displays and reproduces a portion of data corresponding to the view searched by the view object means.
U.S. Pat. No. 5,613,032 issued Mar. 18, 1997, and filed Sep. 2, 1994, discloses a system for recording and playing back multimedia events which includes recording sources, a preprocessor, a delivery processor, and user control units. Recording and playing back a multimedia event entails capturing tracks of various aspects of the event; coding the tracks into digitized blocks; time stamping each block; compressing and pre-processing each track as instructed in a source mapping table; transmitting tracks of the multimedia event to the user as requested; and adjusting the delivery track based upon relative time information associated with the new position established after a search through a track of the multimedia event.
U.S. Pat. No. 5,621,658 issued Apr. 15, 1997, and filed Jul. 13, 1993, discloses communicating an electronic action from a data processing system via an audio device. At the sending data processing system, an action is converted to a pre-determined audio pattern. The action may be combined with text converted into an audio message and contained in an electronic mail object. The audio patterns are then communicated to the audio device over telephone lines or other communication means. At the receiving end, the audio device records the object. A user can provide the recorded object to a data processing system which then executes the action and converts the text audio patterns back to text. In addition, the action can be converted to text and displayed on the data processing system.
None of the prior art discloses re-purposing intellectual property, e.g., video and sound, in which certain events or sound in one context are automatically selected for use in or with another context, where the selected events correlate with the scenes and sounds in or with the other context.
An object of the invention is a system and method for selecting scenes of interest in an event in one context for incorporation in, or with, another context, as a new or modified product.
Another object is a system and method for automatically selecting and correlating scenes of interest in one context, for incorporation in or with another context, as a new or modified product using audio cues for such selection and correlation.
Another object is a system and method for automatically selecting and correlating scenes of interest in one context using audio cues and signal level thresholds for incorporation of the selected scenes in other contexts as a new or modified product.
Another object is a system and method for logically combining different audio cues in selecting scenes of interest in one context for use in different contexts.
Another object is a system and method for creating an Edit Decision List identifying scenes of interest selected in one context for use in another context using audio cues and signal thresholds.
Another object is a system and method for establishing “start” and “stop” times in an Edit Decision List for selection of scenes of interest in one context to be used in different contexts.
These and other objects, features and advantages are achieved in a multimedia search and indexing system which automatically selects events or scenes of interest from any media (video, films, sound) for replay in whole, or in part, in other contexts, as a new or modified product. The entire audio track of a recorded event in video, film, sound, etc., is analyzed to determine audio levels or cues within a set of frequency ranges of interest. The frequency ranges indicate different sounds, e.g. a referee whistle; loud shouting or clapping; a bell sound, etc., each sound having a distinctive frequency and indicative of a scene of interest which correlates with a highlight in an event when occurring at a defined audio clip level. Alternatively, the sound level may drop dramatically as indicative of a scene of interest. Multiple frequency ranges may be analyzed for audio cues in refining the identification of a scene of interest. An Edit Decision List (EDL) of scenes of interest is generated from the analysis of the audio track in which the frequency ranges and audio levels match the criteria for a scene of interest. The list includes “start” and “stop” times related to the time codes in the track of the media for locating the scenes of interest as a visual clip. The visual clips are reviewed and accepted or rejected for re-purposing. Once selected, the visual clips are edited using industry standard audio and video editing techniques.
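By way of illustration only, the analysis summarized above may be sketched in Python as follows. The sketch is not the disclosed implementation. It rests on several assumptions not found in this specification: the audio track is available as a list of samples; the level within a frequency range is estimated with the Goertzel algorithm over fixed 0.1 second windows; and time codes are written as non-drop-frame HH:MM:SS:FF at 30 frames per second. The names band_level, find_scenes and to_timecode, the cue tuple layout, and all numeric values are hypothetical.

```python
import math

def band_level(window, sample_rate, low_hz, high_hz, probes=3):
    """Peak Goertzel power at a few probe frequencies across [low_hz, high_hz].

    Assumption: probing a handful of frequencies is a stand-in for a real
    band-pass level measurement. `probes` must be at least 2.
    """
    n = len(window)
    best = 0.0
    for i in range(probes):
        freq = low_hz + (high_hz - low_hz) * i / (probes - 1)
        k = round(n * freq / sample_rate)          # nearest DFT bin
        coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
        s1 = s2 = 0.0
        for x in window:                           # Goertzel recurrence
            s1, s2 = x + coeff * s1 - s2, s1
        power = (s1 * s1 + s2 * s2 - coeff * s1 * s2) / (n * n)
        best = max(best, power)
    return best

def find_scenes(samples, sample_rate, cues, window_s=0.1, combine=all):
    """Return (start, stop) pairs in seconds where the audio cues are met.

    `cues` is a list of hypothetical (low_hz, high_hz, threshold) tuples.
    `combine` is `all` (logical AND of cues) or `any` (logical OR).
    """
    win = round(sample_rate * window_s)
    n_windows = len(samples) // win
    scenes, start = [], None
    for w in range(n_windows):
        chunk = samples[w * win:(w + 1) * win]
        hit = combine(band_level(chunk, sample_rate, lo, hi) >= thr
                      for lo, hi, thr in cues)
        if hit and start is None:
            start = w * win / sample_rate          # scene begins
        elif not hit and start is not None:
            scenes.append((start, w * win / sample_rate))
            start = None
    if start is not None:                          # scene runs to end of track
        scenes.append((start, n_windows * win / sample_rate))
    return scenes

def to_timecode(seconds, fps=30):
    """Format seconds as non-drop-frame HH:MM:SS:FF (an assumed layout)."""
    frames = round(seconds * fps)
    ss, ff = divmod(frames, fps)
    mm, ss = divmod(ss, 60)
    hh, mm = divmod(mm, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"

# Synthetic track: 3 s of silence with a 2 kHz "whistle" from 1.0 s to 1.5 s.
RATE = 8000
audio = [math.sin(2 * math.pi * 2000 * i / RATE) if RATE <= i < 1.5 * RATE else 0.0
         for i in range(3 * RATE)]
whistle = (1900.0, 2100.0, 0.1)                    # hypothetical cue
scenes = find_scenes(audio, RATE, [whistle])
edl = [(to_timecode(a), to_timecode(b)) for a, b in scenes]
for start_tc, stop_tc in edl:
    print(start_tc, stop_tc)
```

Passing combine=any instead of the default all flags a window when any one cue is met rather than every cue. A drop in sound level could be detected in the same way by reversing the comparison against the threshold.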
The foregoing objects, features and advantages will be further understood from a detailed description of a preferred embodiment taken in conjunction with the appended drawing, in which:
FIG. 1A is a block diagram of an illustrative system for multimedia searching and indexing using audio cues and signal level thresholds and incorporating principles of the present invention.
FIG. 1B is an alternative system for multimedia searching and indexing using audio cues and signal level thresholds.
FIG. 2 is a representation of a visual tape and accompanying sound track indicating events of interest for re-purposing in another context as a new or modified product.
FIG. 3 is a flow diagram of a selection process for scenes of interest in the visual media of FIG. 2 using the system of FIG. 1A or 1B.
FIG. 4 is a flow diagram of an audio analysis conducted in the process of FIG. 3.
FIG. 5 is a flow diagram for setting audio parameters for selection of scenes of interest in the process of FIG. 3.
FIG. 6 is a flow diagram for creating an Edit Decision List (EDL) in the process of FIG. 3.
FIG. 7 is a reproduction of an Edit Decision List (EDL).