1. Field of the Invention
The present invention relates to a cache memory system for a motion estimation circuit used in video processing or video compression applications.
2. Description of the Related Art
Video compression, as performed by MPEG (Moving Picture Experts Group) standards and other similar systems, is used prior to storage or transmission of video sequences to reduce the data volume or data rate involved. Generally, it has been found that when there is little motion between successive frames, there is a high degree of temporal redundancy between these frames. As such, it is inefficient to store or transmit an entire data block of each frame to reliably recreate the image at the decoder. Instead, the encoder needs only to describe or encode the changes or motion of objects between successive frames. Often this involves motion estimation between portions of successive frames of video. In this way, the efficiency of the transmission or storage system can be greatly improved by reducing the amount of data to be processed.
Motion estimation is a method of predicting a current frame from a reference frame. A reference frame is any frame other than the current frame, and motion estimation can be used to exploit temporal redundancy between the frames. One of the most common approaches is block-based motion estimation. In this scheme, a frame is divided into blocks of pixels, each block referred to as a “macroblock.” Each pixel has an associated co-ordinate within the frame, as well as an integral value representing luminosity content at that co-ordinate. Each macroblock has an associated co-ordinate, which is usually that of the top-leftmost pixel of the macroblock.
To estimate motion, each macroblock in the current frame (hereinafter called “reference macroblock”) is compared against macroblocks in a region of a reference frame (hereinafter called “search area”). The difference between the co-ordinate of the reference macroblock and the co-ordinate of the macroblock in the search area that best matches the reference macroblock gives the motion vector. Determining the best match usually involves the comparison of a further metric, commonly being the sum of absolute differences between pixels in the reference macroblock and the corresponding pixels in the matched macroblock.
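The block matching described above can be illustrated with a minimal sketch of an exhaustive (full) search that uses the sum of absolute differences as the matching metric. The function names `sad` and `full_search`, the square search window of plus or minus `search_range` pixels, the representation of frames as lists of rows of luminance values, and the sign convention of the returned vector (matched co-ordinate minus reference macroblock co-ordinate) are assumptions made for illustration only; they are not taken from any particular encoder.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left pixel is (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, n, search_range):
    """Compare the reference macroblock at (cx, cy) in the current frame
    against every candidate block in the search area of the reference frame.

    Returns ((mv_x, mv_y), best_sad), or None if no candidate fits in the frame.
    Assumed convention: motion vector = matched position - macroblock position.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for oy in range(-search_range, search_range + 1):
        for ox in range(-search_range, search_range + 1):
            rx, ry = cx + ox, cy + oy
            # Skip candidates that fall partly outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, ox, oy)
    if best is None:
        return None
    return (best[1], best[2]), best[0]
```

Every candidate evaluation reads the pixels of both blocks, which is why the search area and the reference macroblock are natural candidates for caching.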
Cache memory is commonly employed to store the search area and reference macroblock to reduce memory access bandwidth. Memory access bandwidth can be further reduced by ensuring a sequential relationship in search areas of sequentially adjacent reference macroblocks. One way of achieving this is to have the same search area offset for reference macroblocks in the same row (also called a slice). The non-overlapping region of search areas corresponding to two adjacent reference macroblocks in the same slice has exactly the same width as one macroblock and the same height as the search area. Except at the first reference macroblock of each slice, the method described above requires only one macroblock column to be loaded into the search area cache for motion estimation of successive reference macroblocks in the same slice. Generally, if the search area size and processing time for motion estimation of every reference macroblock is the same, when processing the last reference macroblock of a current slice, the entire search area of the first reference macroblock of the next slice would have to be loaded to cache, instead of just one macroblock column. This increases memory access bandwidth and requires the cache to be double-buffered.
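The loading pattern described above can be illustrated with a small model that counts how many macroblock columns must be fetched for each reference macroblock in one slice. The function name `columns_loaded_per_macroblock`, the measurement of the search area width in whole macroblock columns, and the omission of clipping at frame edges are simplifying assumptions made for illustration only.

```python
from collections import deque


def columns_loaded_per_macroblock(slice_len, search_width_mb):
    """Model a search area cache that holds `search_width_mb` macroblock
    columns while the reference macroblocks of one slice are processed
    left to right with the same search area offset.

    Returns a list giving the number of columns fetched for each macroblock.
    """
    # The cache starts empty at the beginning of the slice; the oldest
    # column is evicted automatically once the cache is full.
    cache = deque(maxlen=search_width_mb)
    loads = []
    for mb in range(slice_len):
        # Columns covered by the search area of macroblock `mb`
        # (hypothetical indexing: the window slides by one column per macroblock).
        needed = range(mb, mb + search_width_mb)
        new_cols = [c for c in needed if c not in cache]
        cache.extend(new_cols)
        loads.append(len(new_cols))
    return loads
```

In this model, a slice of four macroblocks with a search area three columns wide gives `[3, 1, 1, 1]`: a full load for the first macroblock and one column for each of the others. Because the model starts each slice with an empty cache, calling it again for the next slice reproduces the full reload at the slice boundary.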
U.S. Pat. No. 5,696,698, which is incorporated herein by reference in its entirety, describes one such device for addressing a cache memory of a motion picture compression circuit, in which banks of memory are arranged to store the search area, whereby successive motion estimation requires only partial loading of the required search area when the next reference macroblock has a sequential adjacent relationship with respect to the current reference macroblock.
It is found that object motion typically has a wider horizontal range than vertical range. Furthermore, efficiency is increased if forward/backward as well as foreground/background motions are detected in certain cases. This involves performing motion estimation on two search areas for each reference macroblock. Cache memory, which is needed to minimize memory access bandwidth, is costly, and it is desirable to provide cache memory as efficiently as possible.
It is difficult to use a simple cache device or method such as described in U.S. Pat. No. 5,696,698 to support two search areas simultaneously. In particular, the two search areas do not necessarily have any relationship in terms of reference frame source or position.