1. Field of the Invention
The present invention relates to two-dimensional caching in computer memory devices, and, in particular, to processes, apparatuses, and systems for using two-dimensional caching to perform motion estimation in video processing.
2. Description of the Related Art
In video processing, motion estimation is often employed to exploit the temporal redundancy in sequential frames of video signals. Motion estimation determines the location of portions of one video frame that correspond to similar portions of another video frame. Motion estimation is used in video encoding to reduce the number of bits used to represent the video signals for efficient storage and/or transmission.
Referring now to FIG. 1, there is shown a current frame 10 and a previous frame 12 for a conventional motion estimation scheme. In conventional motion estimation, the video signals of the current frame 10 are compared to the video signals of the previous frame 12 to determine the location of portions of the previous frame that correspond to portions of the current frame. For purposes of this application, video signals represent, for example, the amplitudes of video images at particular locations and, in that case, are equivalent to pixel values. The current frame is divided into reference blocks such as reference block 14. The video signals of each reference block are then compared to the video signals of a corresponding, but larger, search region of the previous frame 12 to determine the block of the previous frame 12 that most closely matches the reference block of the current frame 10, using selected comparison criteria. For example, the video signals of reference block 14 of the current frame 10 may be compared to the video signals of search region 16 of the previous frame 12.
Referring now to FIGS. 2 and 3, there are shown graphical representations of reference block 14 of current frame 10 of FIG. 1 and of the corresponding search region 16 of previous frame 12 of FIG. 1, respectively. As shown in FIGS. 1 and 2, the upper left corner of reference block 14 is located at pixel (i,j) of current frame 10. Reference block 14 has a height of h rows and a width of w columns.
Typical motion estimation schemes such as the H.261 (P×64) and ISO/IEC 11172-2 (MPEG) standards set limits on the maximum vertical distance d_i and maximum horizontal distance d_j between a reference block of the current frame and the "most closely matching" block of the previous frame. Thus, the upper left corner (i-d_i, j-d_j) of search region 16 of FIG. 3 corresponds to the maximum motion in the negative vertical (i.e., up) and negative horizontal (i.e., left) directions. Similarly, the lower right corner (i+h+d_i, j+w+d_j) of search region 16 corresponds to the maximum motion in the positive vertical (i.e., down) and positive horizontal (i.e., right) directions. With these limits on maximum vertical and horizontal distances, search region 16 contains all of the possible locations of the block of the previous frame 12 that most closely matches the reference block 14 of the current frame 10.
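The corner coordinates described above can be sketched as a small helper. This is a minimal illustration only: the function name `search_region_bounds` is hypothetical, and the lower right corner is treated as an exclusive bound, an assumption chosen so that the region spans h+2d_i rows and w+2d_j columns.

```python
def search_region_bounds(i, j, h, w, d_i, d_j):
    """Return ((top, left), (bottom, right)) of the search region.

    (i, j) is the upper left corner of the reference block, h and w are its
    height and width, and d_i and d_j are the maximum vertical and horizontal
    distances. The lower right corner is treated as exclusive (an assumption
    made for this sketch). No clipping at the frame edges is applied.
    """
    upper_left = (i - d_i, j - d_j)
    lower_right = (i + h + d_i, j + w + d_j)
    return upper_left, lower_right
```

For instance, a block of 16 rows and 16 columns at pixel (8, 8) with maximum distances of eight pixels would give corners (0, 0) and (32, 32) under these assumptions.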
For example, the maximum horizontal and vertical distances between a reference block and the corresponding subregion may be eight pixels. If the current frame is divided into (16×16) reference blocks, then each (16×16) reference block is compared to a (32×32) search region of the previous frame, where the location of the center of the reference block corresponds to the location of the center of the search region, as shown in FIG. 1. Those skilled in the art will understand that reference blocks near the frame edges are typically compared to search regions that are smaller than (32×32) pixels.
In an exhaustive motion estimation search, (16×16) reference block 14 is compared to each of the 17^2, or 289, different (16×16) blocks of search region 16 to determine the block in search region 16 that best matches the reference block. For image encoding, a motion vector corresponding to the distance between the reference block of the current frame and the "best match" search block of the previous frame may then be used to encode the reference block of the current frame.
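The count of 289 candidate blocks can be checked with a short worked calculation. The symbol N is introduced here for illustration only; it denotes the number of candidate positions when the vertical offset ranges from -d_i to +d_i and the horizontal offset from -d_j to +d_j, with both maximum distances equal to eight pixels.

```latex
\[
N = (2d_i + 1)(2d_j + 1) = (2 \cdot 8 + 1)(2 \cdot 8 + 1) = 17^2 = 289
\]
```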
One conventional way to implement the exhaustive motion estimation search is to load the video signals corresponding to the (16×16) reference block into a first area (i.e., reference area) of computer memory and the video signals corresponding to the entire (32×32) search region into a second area (i.e., search area) of the computer memory. The video signals in the reference area are then compared to the 289 different sets of (16×16) video signals in the search area, one set at a time. The sequence of comparisons may, for example, follow a raster scan pattern (i.e., moving from left to right along rows and from top to bottom along columns of the search area).
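The conventional search just described might be sketched as follows. This is an illustrative sketch, not the method of any particular standard: the comparison criterion is assumed to be the sum of absolute differences (the text leaves the criterion open), the function names `sad` and `exhaustive_search` are hypothetical, and the motion vector is assumed to be the (vertical, horizontal) displacement of the best match relative to the centered position.

```python
def sad(ref, search, top, left):
    """Sum of absolute differences between ref and the same-sized block of
    search whose upper left corner is at (top, left)."""
    return sum(
        abs(ref[r][c] - search[top + r][left + c])
        for r in range(len(ref))
        for c in range(len(ref[0]))
    )


def exhaustive_search(ref, search, d_i, d_j):
    """Compare ref against every candidate block of search in raster order.

    Returns (motion_vector, best_cost, candidates_tested). The motion vector
    is (vertical, horizontal), measured from the centered position, so a
    candidate at (top, left) maps to (top - d_i, left - d_j).
    """
    h, w = len(ref), len(ref[0])
    best_vector, best_cost = None, None
    tested = 0
    # Raster scan: left to right along each row, rows from top to bottom.
    for top in range(len(search) - h + 1):
        for left in range(len(search[0]) - w + 1):
            cost = sad(ref, search, top, left)
            tested += 1
            # Strict comparison keeps the first of any equally good matches.
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (top - d_i, left - d_j), cost
    return best_vector, best_cost, tested
```

With a (16×16) reference area and a (32×32) search area held as lists of rows, the loops visit 17 row offsets and 17 column offsets, which matches the 289 comparisons noted above.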
For 8-bit video signals, such a search mechanism requires (16×16) or 256 bytes for the reference area of the computer memory to hold the reference block and (32×32) or 1024 bytes for the search area of the computer memory to hold the entire search region. When the motion estimation search is complete for that reference block and that search region, the video signals for the next (16×16) reference block are loaded into the reference area and the video signals for the corresponding (32×32) search region are loaded into the search area. Those skilled in the art will understand that such motion estimation processing uses two-dimensional caching (i.e., accessing two-dimensional blocks of signals for processing).
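The memory figures above follow from simple arithmetic, sketched here for arbitrary block sizes and search ranges. The function name `buffer_bytes` and the parameter `bytes_per_signal` are hypothetical names introduced for this illustration; one byte per signal corresponds to the 8-bit case in the text.

```python
def buffer_bytes(h, w, d_i, d_j, bytes_per_signal=1):
    """Return (reference_area_bytes, search_area_bytes).

    The reference area holds an h-by-w block; the search area holds the full
    (h + 2*d_i)-by-(w + 2*d_j) search region.
    """
    reference_area = h * w * bytes_per_signal
    search_area = (h + 2 * d_i) * (w + 2 * d_j) * bytes_per_signal
    return reference_area, search_area
```

For (16×16) blocks and maximum distances of eight pixels this yields 256 and 1024 bytes. If both areas are reloaded in full for each reference block, as in the conventional scheme described, that would amount to 1280 bytes transferred per reference block.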
In order to perform motion estimation efficiently, it is desirable to reduce the size of the areas of computer memory used. This is especially true where the computer memory is on-chip memory, although it may also apply when an external memory device is used. In addition, it is desirable to reduce the amount of signal transfer into the computer memory during the motion estimation processing. It is also desirable to spread the signal transfer over the processing sequence to provide more uniform signal transfer. These two factors (i.e., limited on-chip memory and limited I/O bandwidth) limit the range of motion estimation that can be processed in real time with a given hardware design.
To increase the motion estimation range with a given processor complexity (i.e., with a given transistor budget), what is needed is a motion estimation scheme that reduces the memory usage, reduces the signal transfer rate, and uses a more uniform signal transfer.
It is accordingly an object of this invention to overcome the disadvantages and drawbacks of the known art and to provide processes, apparatuses, and systems for performing motion estimation with reduced computer memory usage, reduced signal transfer, and more uniform signal transfer.
Those skilled in the art will understand that these goals apply to other computer processes, in addition to motion estimation in video processing, that use two-dimensional caching. It is therefore a general object of this invention to provide improved processes, apparatuses, and systems for performing computer processing using two-dimensional caching.
Further objects and advantages of this invention will become apparent from the detailed description of a preferred embodiment which follows.