1. Field of the Invention
The present invention relates to an apparatus and a method for detecting a gradual scene change, applied to an automatic video analysis device for building a digital video library. More particularly, it relates to a detecting apparatus which can accurately detect a gradual scene change and operate in real time, a method for performing the same, and a computer-readable medium on which a program for accomplishing the method is recorded.
2. Description of the Prior Art
Generally, a video library stores a large amount of image data, together with various related indexing data, as digital data. Users connect to the video library over a communication network such as the Internet, retrieve the desired image data through the indexing data, and then receive and use the retrieved image data. A video analysis device is a prerequisite for developing and maintaining the video library. The video analysis device should accurately detect a video scene change so that the resultant video clips are indexed and stored into a video database.
Several conventional technologies for detecting the video scene change have been disclosed, such as a twin comparison approach, a plateau detection of a delayed frame metric, an image variance valley detection and a video edit model approach. The efficiency of such technologies depends greatly on a specific parameter requiring close control, or on the selection of a threshold. Also, those technologies may fail to detect a gradual scene change such as a fade-in, a fade-out or a dissolve, and even when they detect the scene change they cannot accurately determine its duration.
Considering the above-mentioned problems, it is an object of the present invention to provide an apparatus for accurately detecting a gradual scene change and executing a real-time processing.
It is another object of the present invention to provide a method for detecting the gradual scene change and executing the real-time processing which is applied to the apparatus.
It is still another object of the present invention to provide a computer-readable recording medium storing a program that accomplishes the above method.
It is still another object of the present invention to provide an apparatus for greatly enhancing a performance of an auto video analysis device used for implementing a video library demanded by a multimedia service.
It is still another object of the present invention to provide a method for accomplishing an enhanced performance of the auto video analysis device used for implementing the video library demanded by the multimedia service.
Also, it is still another object of the present invention to provide a recording medium for storing a program that accomplishes the method for accomplishing the enhanced performance of the auto video analysis device.
To achieve the above objects of the present invention, there is provided a gradual scene change detector for detecting a gradual scene change, comprising: a video pre-processor for decoding an image sequence of an externally applied digital video signal and vectorizing the decoded image sequence; a video main processor for determining a state of the image sequence based on a distance between frames of the image sequence inputted from the video pre-processor so as to declare a temporal dissolve, and for detecting an initial frame position and a final frame position of the declared temporal dissolve; and a video post-processor for merging the temporal dissolves declared by the video main processor in accordance with the distance between them, and for declaring a dissolve based on the distance between the initial frame position and the final frame position and a duration.
In preferred embodiments, the video pre-processor comprises: a video decoder for decoding an image sequence of the digital video signal; and a video vectorizor for converting the decoded image sequence from the video decoder into a vector. The video vectorizor projects or sub-samples the decoded DC image sequence in a predetermined direction so as to compress the data, thereby vectorizing it.
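By way of illustration only, the two vectorizing options named above (projection and sub-sampling) may be sketched as follows. The image layout (a list of rows of luminance values), the use of an average as the projection, and the names project and subsample are assumptions made for this sketch and are not taken from the specification.

```python
from typing import List

# Assumed layout: a decoded DC image as a list of rows of luminance values.
Image = List[List[float]]


def project(image: Image, axis: str = "rows") -> List[float]:
    """Collapse a 2-D DC image to a vector by averaging along one direction."""
    if axis == "rows":
        # One value per row: average across the columns of that row.
        return [sum(row) / len(row) for row in image]
    if axis == "cols":
        # One value per column: average down the rows of that column.
        return [sum(col) / len(col) for col in zip(*image)]
    raise ValueError("axis must be 'rows' or 'cols'")


def subsample(image: Image, step: int = 2) -> List[float]:
    """Keep every step-th pixel in both directions and flatten the result."""
    return [value for row in image[::step] for value in row[::step]]
```

Either function reduces the amount of data handed to the later stages; which direction or step is appropriate is left open here.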
The video main processor comprises: a linear image predictor for predicting a linear image based on the vectorized image sequence from the video pre-processor; a first frame distance measurement device for measuring a distance between image frames based on a reference image from the linear image predictor, so as to produce a first measured distance; a second frame distance measurement device for measuring a distance between image frames based on the linear predicted image from the linear image predictor, so as to produce a second measured distance; a subtractor for producing a difference between the first measured distance and the second measured distance; a signal converter for converting the difference from the subtractor in accordance with whether or not a rapid scene change has occurred; an accumulator for accumulating the difference of the subtractor applied from the signal converter; and a dissolve declaring/frame detecting device for declaring the temporal dissolve based on an accumulated value from the accumulator, and for detecting the initial frame position and the final frame position of the declared temporal dissolve.
If any rapid scene change is detected, the signal converter converts the difference of the subtractor into "0"; if not, the signal converter transfers the difference of the subtractor to the accumulator.
The linear image predictor comprises: a first delay element for delaying the vectorized DC image sequence from the video pre-processor by a predetermined time; a second delay element for re-delaying the delayed image from the first delay element; an adder for adding the vectorized DC image and the re-delayed image from the second delay element; and a multiplier for multiplying the output value from the adder and a coefficient (e.g. '1/2'), so as to produce a linear predicted image.
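By way of illustration only, the linear image predictor described above may be sketched as follows. The sketch assumes that each delay element delays by exactly one frame and that the once-delayed image serves as the reference image; the name linear_predict is hypothetical.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Vector = List[float]


def linear_predict(
    frames: Iterable[Vector], coeff: float = 0.5
) -> Iterator[Tuple[Vector, Vector]]:
    """Yield (reference, predicted) pairs for each frame that has two neighbours.

    reference is x(i-1), the output of the first delay element.
    predicted is coeff * (x(i) + x(i-2)), the adder output times the coefficient.
    A one-frame delay per element is an assumption of this sketch.
    """
    d1: Optional[Vector] = None  # output of the first delay element, x(i-1)
    d2: Optional[Vector] = None  # output of the second delay element, x(i-2)
    for x in frames:
        if d1 is not None and d2 is not None:
            predicted = [coeff * (a + b) for a, b in zip(x, d2)]
            yield d1, predicted
        d2, d1 = d1, x  # shift the delay line by one frame
```

With a coefficient of 1/2, a frame lying exactly midway between its two neighbours, as in a linear cross-fade, is predicted exactly, so the predicted image coincides with the reference image.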
The first frame distance measurement device comprises: a histogram information extractor for extracting histogram information based on the delayed reference image from the linear image predictor; a delay element for delaying the extracted histogram information from the histogram information extractor by a predetermined time; and a vector distance measurement device for measuring a vector distance between the extracted histogram information from the histogram information extractor and the delayed histogram information.
The second frame distance measurement device comprises: a histogram information extractor for extracting histogram information based on the linear predicted image from the linear image predictor; and a vector distance measurement device for measuring a vector distance based on the extracted histogram information from the histogram information extractor and the extracted histogram information of the first frame distance measurement device.
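By way of illustration only, the two frame distance measurements and the subtractor may be sketched as follows. The specification does not fix the histogram binning or the vector distance, so the equal-width bins over an assumed range of 0 to 256, the sum of absolute differences, and the names histogram, l1_distance and distance_difference are all assumptions of this sketch.

```python
from typing import List, Sequence


def histogram(vec: Sequence[float], bins: int = 8,
              lo: float = 0.0, hi: float = 256.0) -> List[int]:
    """Count the values of a vectorized image into equal-width bins (assumed)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in vec:
        k = int((v - lo) / width)
        counts[min(max(k, 0), bins - 1)] += 1  # clamp out-of-range values
    return counts


def l1_distance(h1: Sequence[int], h2: Sequence[int]) -> int:
    """Sum of absolute bin differences; the choice of metric is an assumption."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def distance_difference(prev_ref: Sequence[float], ref: Sequence[float],
                        predicted: Sequence[float], bins: int = 8) -> int:
    """Subtractor output: first measured distance minus second measured distance."""
    h_prev = histogram(prev_ref, bins)   # delayed histogram information
    h_ref = histogram(ref, bins)         # histogram of the reference image
    h_pred = histogram(predicted, bins)  # histogram of the linear predicted image
    first = l1_distance(h_ref, h_prev)
    second = l1_distance(h_pred, h_ref)
    return first - second
```

Under these assumptions the difference is large when consecutive reference images differ while the linear prediction still matches the reference image.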
The accumulator comprises: a discriminator for discriminating an output signal from the signal converter to '0' or '1'; an adder for adding the output signal of the signal converter and a feedback accumulated value (D(i-1)); a multiplier for multiplying the output signal of the discriminator and the added value of the adder so as to produce an accumulated value (D(i)); and a delay element for delaying the accumulated value from the multiplier by a predetermined time so as to feed the delayed value back to the adder.
The temporal dissolve declaration of the dissolve declaring/frame detecting device is performed by: finding an accumulated value such that the highest value of the accumulated value (D(i)) within a duration longer than '0' is higher than a predetermined duration threshold value (ThCLD); if the duration of the found accumulated value (D(i)) is longer than a predetermined continuation threshold value (Thcon), declaring the accumulated value as a temporal dissolve; and otherwise, making no declaration.
The video post-processor comprises: a dissolve merging processor for confirming whether the distance between the declared temporal dissolves from the video main processor is smaller than a predetermined merging threshold value, and if so, for merging the dissolves having the smaller distance into one; and a dissolve declaration processor for confirming whether the distance between the initial frame position and the final frame position is higher than a predetermined distance threshold value so as to declare a dissolve.
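By way of illustration only, the signal converter, the accumulator and the temporal dissolve declaration may be sketched as follows. The specification does not state the rule by which the discriminator chooses '0' or '1'; this sketch assumes '1' for a positive input and '0' otherwise. It further reads the declaration rule as: take each run of frames over which D(i) stays above zero, and declare it a temporal dissolve when its peak exceeds ThCLD and its length exceeds Thcon. That reading, and all function names, are assumptions.

```python
from typing import List, Sequence, Tuple


def signal_convert(diff: float, rapid_change: bool) -> float:
    """Pass the subtractor difference through, or force 0 on a rapid scene change."""
    return 0.0 if rapid_change else diff


def accumulate(signals: Sequence[float]) -> List[float]:
    """D(i) = gate(i) * (s(i) + D(i-1)), with gate(i) = 1 if s(i) > 0 else 0 (assumed)."""
    out: List[float] = []
    prev = 0.0  # feedback value D(i-1)
    for s in signals:
        gate = 1 if s > 0 else 0  # discriminator output (assumed rule)
        prev = gate * (s + prev)  # multiplier applied to the adder output
        out.append(prev)
    return out


def temporal_dissolves(acc: Sequence[float], th_cld: float,
                       th_con: int) -> List[Tuple[int, int]]:
    """Return (initial, final) frame indices of runs that qualify (assumed reading)."""
    values = list(acc)
    found: List[Tuple[int, int]] = []
    start = None
    for i, d in enumerate(values + [0.0]):  # trailing 0 closes an open run
        if d > 0 and start is None:
            start = i
        elif d <= 0 and start is not None:
            end = i - 1
            if max(values[start:end + 1]) > th_cld and (end - start + 1) > th_con:
                found.append((start, end))
            start = None
    return found
```

Because the gate multiplies the whole sum, a single non-positive input resets the accumulated value to zero, which is what separates one candidate run from the next in this sketch.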
The dissolve merging processor confirms whether the distance between the declared temporal dissolves from the video main processor is smaller than the predetermined merging threshold value (Thlink); if so, the dissolve merging processor merges the dissolves having the smaller distance into one dissolve; otherwise, the dissolve merging processor transfers the declared temporal dissolve as it is to the dissolve declaration processor. The dissolve declaration processor confirms whether the distance between the initial frame position and the final frame position received from the dissolve merging processor is higher than the predetermined distance threshold value (Thdist); if so, the dissolve declaration processor declares a dissolve when the distance between the initial frame position and the final frame position applied thereto is also higher than the predetermined duration threshold value (Thdur).
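By way of illustration only, the dissolve merging processor and the dissolve declaration processor may be sketched as follows. The sketch assumes that the gap between two temporal dissolves is measured from the final frame of one to the initial frame of the next, that the distance test compares the content of the initial and final frames through a caller-supplied frame_distance function, and that the duration test compares their positions. These readings and all names are assumptions.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (initial frame position, final frame position)


def merge_spans(spans: List[Span], th_link: int) -> List[Span]:
    """Merge temporal dissolves whose gap is smaller than th_link."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start - merged[-1][1] < th_link:
            # Close neighbours become one dissolve spanning both.
            merged[-1] = (merged[-1][0], max(end, merged[-1][1]))
        else:
            merged.append((start, end))
    return merged


def declare_dissolves(spans: List[Span],
                      frame_distance: Callable[[int, int], float],
                      th_dist: float, th_dur: int) -> List[Span]:
    """Keep spans passing both the distance test and the duration test."""
    return [
        (s, e) for s, e in spans
        if frame_distance(s, e) > th_dist and (e - s) > th_dur
    ]
```

A possible use is to pass the output of merge_spans directly to declare_dissolves, with frame_distance built from whatever frame distance measure the main processor already computes.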
According to another aspect of this invention, there is provided a method for detecting a gradual scene change comprising: a first step of decoding an image sequence of an externally applied digital video signal for vectorization; a second step of discriminating the state of the image sequence based on the distance between frames in the vectorized image sequence so as to declare a temporal dissolve, and detecting an initial frame position and a final frame position of the declared temporal dissolve; and a third step of merging the declared temporal dissolves in accordance with the distance between them and declaring a dissolve in accordance with the distance between the initial and the final frames and the duration. The third step may further comprise: a fourth step of confirming whether the distance between the declared temporal dissolves is smaller than a predetermined merging threshold value (Thlink); a fifth step of merging the dissolves having the smaller distance into one if the result of the fourth step is positive, and otherwise transferring the declared temporal dissolve as it is; a sixth step of confirming whether the distance between the initial frame and the final frame of the transferred dissolve is higher than a predetermined distance threshold value (Thres) and the distance between the initial frame position and the final frame position of the transferred dissolve is higher than a predetermined duration threshold value; and a seventh step of declaring a dissolve if the result of the sixth step is positive.
According to still another aspect of this invention, there is provided a computer program device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform a method comprising: a first function of decoding an image sequence of an externally applied digital video signal for vectorization; a second function of discriminating a state of the image sequence based on a distance between frames in the vectorized image sequence so as to declare a temporal dissolve, and detecting an initial frame position and a final frame position of the declared temporal dissolve; and a third function of merging the declared temporal dissolves in accordance with the distance between the declared temporal dissolves and declaring the dissolve in accordance with a distance and a duration between the initial frame and the final frame.