The medium of digital video communication is widely used in many applications. Due to the rich information content of video data, queries can be specified not only by video titles, video descriptions, and alpha-numeric attributes of video data, but also by the video contents. Therefore, video index construction for supporting powerful query capabilities is an important research issue for video database systems.
Video segmentation is a fundamental step toward video index construction. Video sequences may be segmented according to so-called "shot changes", which are often used for video browsing. A "shot" is made up of a sequence of video frames which represents a continuous action in time and space. Therefore, the contents of the frames belonging to the same shot are similar. A shot change is defined as a discontinuity between two shots. A measurement of the similarity (or dissimilarity) of consecutive frames may therefore be used for shot change detection.
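By way of illustration only, the idea of detecting a shot change from the dissimilarity of consecutive frames can be sketched as follows. The sketch assumes that each frame is a flat sequence of grayscale pixel values, that dissimilarity is the mean absolute pixel difference, and that a fixed threshold is used. The function names and the threshold value are hypothetical and are not taken from any of the cited references.

```python
from typing import List, Sequence


def mean_abs_difference(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def detect_shot_changes(frames: Sequence[Sequence[int]],
                        threshold: float = 30.0) -> List[int]:
    """Return the indices i at which frame i starts a new shot.

    A shot change is reported when the dissimilarity between frame i-1
    and frame i exceeds the (assumed, illustrative) threshold.
    """
    changes = []
    for i in range(1, len(frames)):
        if mean_abs_difference(frames[i - 1], frames[i]) > threshold:
            changes.append(i)
    return changes


# Two dark frames followed by two bright frames.
frames = [[10] * 4, [12] * 4, [200] * 4, [198] * 4]
shot_starts = detect_shot_changes(frames)
```

In this toy sequence the only large difference lies between the second and third frames, so a single shot change is reported at index 2. A fixed threshold is the simplest choice; in practice it would have to be tuned to the material.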
In the prior art, many varied approaches have been explored in the development of indexing techniques. U.S. Pat. No. 5,212,547, dated May 18, 1993, entitled "Image Processing Device and Method for Sensing Moving Objects and Rangefinder Employing the Same", teaches finding objects in motion in a frame. The video data of a frame is subtracted from the average value of the video data of the frame. This technique does not involve shot change detection, however.
In U.S. Pat. No. 5,327,232, dated Jul. 5, 1994, entitled "Method and Apparatus for Detecting Motion Vectors", the objective is to detect the motion vector of the content of a frame, using an image block matching method. Again, this technique does not utilize shot change detection.
In U.S. Pat. No. 5,488,425, dated Jan. 30, 1996, entitled "Apparatus for Storing Video Information by Recognizing Video Frames", the objective is to select one frame and detect similar frames subsequent to it. Shot change detection is not an objective of this invention.
In U.S. Pat. No. 5,179,449, dated Jan. 12, 1993, entitled "Scene Boundary Detecting Apparatus", the data to be processed is the original video data, rather than compressed video frames. As a result, the speed of processing is relatively slow.
In the paper entitled "A Feature-Based Algorithm for Detecting and Classifying Scene Breaks", by Ramin Zabih, et al. [ACMMM95], the occurrence of a shot change is detected by observing changes in the positions of the lines in adjacent frames. Thus, the major feature of this method is the use of image analysis, whereby lines in a frame are detected for determination of a shot change. Since the data being processed is the original video data, the speed of processing is relatively slow.
In the paper entitled "Feature Management for Large Video Databases", by Farshid Arman, et al. [SPIE93], the authors deal with DCT-based compressed video data, where multiple DCT parameters are used to determine a shot change. For consecutive frames, an inner product is calculated from the DCT parameters of the blocks in the same position. The greater the difference in the frames, the larger the inner product will be. This method is capable of determining a shot change in a timely manner, since it does not analyze the original image data of the frame. However, when the inner product falls within a gray area, so that it is difficult to determine whether or not there is a shot change, the frames must be decompressed and analyzed using original image data. Thus, the processing speed is compromised.
In the paper entitled "Projection Detecting Filter for Video Cut Detection", by Kiyotaka Otsuji and Y. Tonomura [ACMMM93], the authors propose a filtering process whereby the frame variations that are not caused by a shot change are reduced to a minimum, so that determining the variations caused by a shot change is simplified. However, this paper does not consider the use of compressed video data.
In the paper entitled "Knowledge Guided Parsing in Video Databases", by Deborah Swanberg et al. [SPIE93], the authors propose the use of a color histogram difference method to determine a shot change. The distribution of the content of a frame is predicted according to different kinds of video data, so as to locate the position of a shot change with accuracy, and to decide the classification of the shot at the same time. The disadvantage of this method is that a knowledge base of the video data must be defined on a case-by-case basis.
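For illustration, a generic color histogram difference between two frames can be sketched as follows. This is not the method of the cited paper, which additionally relies on a knowledge base; it shows only the histogram comparison itself. The sketch assumes that each frame is a sequence of (R, G, B) tuples with 8-bit channels, that each channel is quantized into a small number of levels, and that the difference is the sum of absolute bin differences normalized to the range 0 to 1. All names and parameters are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]


def color_histogram(frame: Sequence[Pixel], levels: int = 4) -> Counter:
    """Count pixels per quantized (R, G, B) bin.

    `levels` is the number of bins per channel; this sketch assumes
    that it divides 256 evenly.
    """
    step = 256 // levels
    return Counter((r // step, g // step, b // step) for r, g, b in frame)


def histogram_difference(frame_a: Sequence[Pixel],
                         frame_b: Sequence[Pixel],
                         levels: int = 4) -> float:
    """Normalized histogram difference: 0.0 for identical color
    distributions, 1.0 for completely disjoint ones."""
    total = len(frame_a) + len(frame_b)
    if total == 0:
        raise ValueError("frames must not both be empty")
    hist_a = color_histogram(frame_a, levels)
    hist_b = color_histogram(frame_b, levels)
    # A Counter returns 0 for bins that are absent from a histogram.
    diff = sum(abs(hist_a[k] - hist_b[k]) for k in set(hist_a) | set(hist_b))
    return diff / total


red = [(255, 0, 0)] * 4
blue = [(0, 0, 255)] * 4
mixed = [(255, 0, 0)] * 2 + [(0, 0, 255)] * 2
```

A shot change would then be declared when the difference between consecutive frames exceeds a chosen threshold. Because a histogram discards pixel positions, this measure is less sensitive to motion within a shot than a pixel-by-pixel comparison.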
Therefore, it is an object of the present invention to overcome the disadvantages of the prior art.