This invention relates to a method of conducting fast motion searching in advanced video signal coding systems and, more particularly, to methods of searching using a subset of multiple reference frames and/or a subset of multiple image block modes.
A video information format provides visual information suitable to activate a television screen, or be stored on a video tape. Generally, video data is organized in a hierarchical order. A video sequence is divided into groups of frames, and each group can be composed of a series of single frames. Each frame is roughly equivalent to a still picture, with the still pictures being updated often enough to simulate a presentation of continuous motion. A frame is further divided into macroblocks. In H.26P and MPEG-X standards (Moving Picture Experts Group), a macroblock is made up of 16×16 pixels, depending on the video format. A macroblock always has an integer number of blocks, such as an 8×8 pixel coding unit.
Video compression is a critical component for any application that requires transmission or storage of video data. Compression techniques compensate for motion by reusing information stored in previous frames, a technique that exploits what is referred to as temporal redundancy. Compression also occurs by transforming data in the spatial domain to the frequency domain.
Motion compensation is a fundamental technique used in video compression such as defined by the Moving Picture Experts Group (MPEG) and International Telecommunications Union (ITU) standards. Motion estimation is perhaps the most demanding task of a video encoder. Many algorithms and techniques have been proposed in the past for conducting fast motion searches. However, these methods apply various fast search strategies for certain single block modes, such as an 8×8 block mode, within only a single reference frame. None of the prior art methods known to Applicant have considered conducting a fast search with multiple reference frames and multiple image block modes, which is becoming one of the latest techniques in video coding. For example, in the ongoing ITU-T H.26L video-coding standard, up to seven block modes are considered. Moreover, there is no theoretical limit on the number of reference frames that may be considered during a motion search. These latest techniques improve video coding efficiency by providing better motion compensation. However, these techniques also increase the computational burden significantly, especially for the motion search.
In particular, traditional fast motion search techniques use a single reference frame and a single coding mode. The reference frame that is used and the coding mode are each specified before the search is conducted. Direct application of this method to a search of multiple reference frames and multiple modes multiplies the complexity of the search by the product of the number of reference frames and the number of modes. For example, a motion search with seven block modes and five reference frames, which is a typical configuration in H.26L, requires thirty-five traditional motion searches. Even if fast search algorithms are employed for each of the frame searches, the complexity is multiplied by thirty-five.
The method of the present invention simplifies the motion search by reducing the number of frames and modes searched without a significant loss in coding performance. The invention provides a fast motion search method based on a reference-frame prediction and a block-mode prediction so that the motion search of each image block is not required to search all of the reference frames and all of the block modes. In particular, a reference frame prediction fp, spaced from the current frame by “p” number of frames, can be determined by:
p = min(n − 1, p0 + max(a, b, c, d));
wherein p0 is a pre-chosen positive integer (i.e., an addition factor), n is the total number of reference frames, A, B, C and D are image blocks adjacent to searched block E, and a, b, c and d identify the reference frames fa, fb, fc and fd from which the reference image blocks for image blocks A, B, C and D, respectively, have been chosen. The search is conducted within frames f0 to fp, which is a subset of all the n reference frames, so that the total computational burden is significantly decreased with respect to prior art searches.
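By way of illustration only, the reference-frame prediction may be sketched in Python as follows. The function names, the default value of p0, the zero-based frame indexing (f0 taken as the nearest previous frame), and the fallback used when no neighboring block has yet been coded are assumptions made for this sketch and are not taken from the description.

```python
def predict_reference_depth(neighbor_refs, n, p0=1):
    """Return p, the index of the farthest reference frame to search.

    neighbor_refs: frame indices (a, b, c, d) of the reference frames
                   chosen for the neighboring blocks A, B, C and D.
    n:             total number of reference frames (f0 .. f(n-1)).
    p0:            pre-chosen positive integer (the addition factor).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if p0 < 1:
        raise ValueError("p0 must be a positive integer")
    if not neighbor_refs:
        # Assumption (not stated in the text): with no coded neighbors,
        # fall back to searching every reference frame.
        return n - 1
    return min(n - 1, p0 + max(neighbor_refs))


def frames_to_search(neighbor_refs, n, p0=1):
    """Return the indices of the frames f0 .. fp that are searched."""
    p = predict_reference_depth(neighbor_refs, n, p0)
    return list(range(p + 1))


# Hypothetical neighbors referencing frames 0, 1, 0 and 2, with n = 5:
# p = min(4, 1 + 2) = 3, so frames f0 to f3 are searched.
print(frames_to_search([0, 1, 0, 2], n=5))  # [0, 1, 2, 3]
```

In this hypothetical case four of the five reference frames are searched. When the neighboring blocks all reference the nearest frame, only f0 and f1 are searched for p0 = 1.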
For an image block being coded, such as block E, the block mode selection can be based on the block modes in the neighboring blocks, A, B, C and D, which have been coded in the modes of mA, mB, mC and mD. The frequency of each image block mode Fm is the number of times the block mode m is used for all the blocks in the previous w frames and for the blocks in the current frame that have been coded. The mode frequency prediction is then made based on the frequencies of the block modes:
F0 = α·min(FmA, FmB, FmC, FmD);
wherein α is a positive parameter less than 1.0 (i.e., a multiplication factor). The block mode selection can then be conducted using the mode-frequency prediction. Each mode m among all the M possible modes will be considered if Fm is greater than or equal to F0. If Fm is less than F0 then that particular mode m will be skipped during the motion search.
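By way of illustration only, the mode-frequency prediction and the resulting block mode selection may be sketched in Python as follows. The function names, the use of a Counter to hold the frequencies Fm, the default value of the multiplication factor, and the sample frequencies are assumptions made for this sketch and are not taken from the description.

```python
from collections import Counter


def mode_frequency_threshold(mode_freq, neighbor_modes, alpha=0.5):
    """Return F0: alpha times the minimum of FmA, FmB, FmC and FmD."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be positive and less than 1.0")
    return alpha * min(mode_freq[m] for m in neighbor_modes)


def modes_to_search(all_modes, mode_freq, neighbor_modes, alpha=0.5):
    """Keep each mode m with Fm >= F0; skip each mode with Fm < F0."""
    f0 = mode_frequency_threshold(mode_freq, neighbor_modes, alpha)
    return [m for m in all_modes if mode_freq[m] >= f0]


# Hypothetical frequencies Fm for seven block modes, as counted over
# the previous w frames and the already-coded blocks of the current frame.
mode_freq = Counter({1: 40, 2: 10, 3: 6, 4: 25, 5: 2, 6: 1})  # F7 = 0
neighbor_modes = [1, 4, 2, 1]  # mA, mB, mC, mD

# F0 = 0.5 * min(40, 25, 10, 40) = 5.0, so modes 5, 6 and 7 are skipped.
print(modes_to_search(range(1, 8), mode_freq, neighbor_modes))  # [1, 2, 3, 4]
```

Because the multiplication factor is less than 1.0, the modes used by the neighboring blocks themselves always satisfy the test and are therefore always retained.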
In particular, the invention comprises: in a digital video system where a video sequence is represented by a series of frames, including a current frame and multiple previous reference frames positioned rearwardly in time with respect to said current frame, each separated by a predetermined time interval, the frames being divided into a plurality of blocks with predetermined positions, with each block including a predetermined matrix of pixel data, a method of efficiently estimating a change in position of an image represented by a matrix of pixel data in an image block in the current frame from corresponding matrices of pixel data in a previous frame of said series of reference frames, by determining the location of an optimal reference block within said series of reference frames, wherein said optimal reference block corresponds to said image block, the method comprising the steps of: selecting an image block in the current frame; selecting a number of reference frames; selecting a number of blocks adjacent to said image block in the current frame; selecting a value for an addition factor; for each of said selected blocks adjacent to said image block in the current frame, determining a reference image block in one of said number of reference frames; calculating a subset of frames of said number of reference frames in which to search for said optimal reference block, wherein said subset of frames comprises multiple frames positioned rearwardly in time from said current frame, wherein the calculation comprises choosing the minimum of either the number of reference frames minus one, or the addition factor plus the maximum of the number of frames counted rearwardly in time from said current frame to reach the frame containing the reference image block in said one of said number of reference frames for each of said reference image blocks; and searching the subset of frames for said optimal reference block.
The invention further comprises: in a digital video system where a video sequence is represented by a series of frames, including a current frame and multiple previous reference frames positioned rearwardly in time with respect to said current frame, each separated by a predetermined time interval, the frames being divided into a plurality of blocks with predetermined positions, with each block including a predetermined matrix of pixel data, a method of efficiently estimating a change in position of an image represented by a matrix of pixel data in an image block in the current frame from corresponding matrices of pixel data in a previous frame of said series of reference frames, by determining the location of an optimal reference block within said series of reference frames, wherein said optimal reference block corresponds to said image block, the method comprising the steps of: selecting an image block in the current frame; selecting a number of reference frames; selecting a number of blocks adjacent to said image block in the current frame; selecting a number of image block modes; determining the mode of each of said selected number of blocks adjacent to said image block in the current frame; determining the frequency of each image block mode within said number of reference frames; selecting a multiplication factor; calculating a mode-frequency prediction factor by multiplying the multiplication factor by the minimum one of the frequency of each image block mode; calculating a subset of modes of said number of image block modes in which to search for said optimal reference block, wherein said subset of modes comprises each of the modes of said number of image block modes when the frequency of each of said modes of said number of image block modes is greater than or equal to said mode-frequency prediction factor, and wherein said subset of modes excludes a particular mode of said number of image block modes when the frequency of the particular mode is less than said mode-frequency prediction factor; and searching the subset of modes of said number of image block modes for said optimal reference block.
The invention also comprises: in a digital video system where a video sequence is represented by a series of frames, including a current frame and multiple previous reference frames positioned rearwardly in time with respect to said current frame, each separated by a predetermined time interval, the frames being divided into a plurality of blocks with predetermined positions, with each block including a predetermined matrix of pixel data, a method of efficiently estimating a change in position of an image represented by a matrix of pixel data in an image block in the current frame from corresponding matrices of pixel data in a previous frame of said series of reference frames, by determining the location of an optimal reference block within said series of reference frames, wherein said optimal reference block corresponds to said image block, the method comprising the steps of: selecting an image block in the current frame; selecting a number of reference frames; selecting a number of blocks adjacent to said image block in the current frame; selecting a value for an addition factor; for each of said selected blocks adjacent to said image block in the current frame, determining a reference image block in one of said number of reference frames; calculating a subset of frames of said number of reference frames in which to search for said optimal reference block, wherein said subset of frames comprises multiple frames positioned rearwardly in time from said current frame, wherein the calculation comprises choosing the minimum of either the number of reference frames minus one, or the addition factor plus the maximum of the number of frames counted rearwardly in time from said current frame to reach the frame containing the reference image block in said one of said number of reference frames for each of said reference image blocks; selecting a number of image block modes; determining the mode of each of said selected number of blocks adjacent to said image block in the current frame; determining the frequency of each image block mode; selecting a multiplication factor; calculating a mode-frequency prediction factor by multiplying the multiplication factor by the minimum one of the frequency of each image block mode; calculating a subset of modes of said number of image block modes in which to search for said optimal reference block, wherein said subset of modes comprises each of the modes of said number of image block modes when the frequency of each of said modes of said number of image block modes is greater than or equal to said mode-frequency prediction factor, and wherein said subset of modes excludes a particular mode of said number of image block modes when the frequency of the particular mode is less than said mode-frequency prediction factor; and searching the subset of frames of said number of reference frames and searching the subset of modes of said number of image block modes for said optimal reference block.
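By way of illustration only, the combined search over the subset of frames and the subset of modes may be sketched in Python as follows. The function name and the sample values are assumptions made for this sketch; the value p and the list of retained modes are taken as already computed by the reference-frame prediction and the mode-frequency prediction, respectively.

```python
from itertools import product


def candidate_searches(p, modes):
    """Return every (frame index, block mode) pair left to search.

    p:     index of the farthest reference frame kept, so that
           frames f0 .. fp are searched.
    modes: block modes retained by the mode-frequency prediction.
    """
    return list(product(range(p + 1), modes))


# Hypothetical case: five reference frames and seven block modes, of
# which frames f0 to f3 (p = 3) and modes 1 to 4 are retained.
full = candidate_searches(4, range(1, 8))
reduced = candidate_searches(3, [1, 2, 3, 4])
print(len(full), len(reduced))  # 35 16
```

Under these hypothetical values, sixteen traditional motion searches are conducted for the image block in place of thirty-five.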
Accordingly, an object of the invention is to provide a method of conducting a fast motion search in advanced video coding.
Another object of the invention is to provide a method of conducting reference frame prediction and/or block mode prediction in a fast motion search.
A further object of the invention is to provide a method of conducting a search including a subset of multiple reference frames and/or a subset of multiple image block modes.