1. Field of the Invention
The present invention relates generally to compression of digital visual images, and more particularly to an architecture for real-time encoding of a video sequence.
2. Discussion of the Prior Art
Within the past decade, the advent of world-wide electronic communications systems has enhanced the way in which people can send and receive information. In particular, the capabilities of real-time video and audio systems have greatly improved in recent years. In order to provide services such as video-on-demand and video conferencing to subscribers, an enormous amount of network bandwidth is required. In fact, network bandwidth is often the main inhibitor of the effectiveness of such systems.
In order to overcome the constraints imposed by networks, compression systems have emerged. These systems reduce the amount of video and audio data which must be transmitted by removing redundancy in the picture sequence. At the receiving end, the picture sequence is uncompressed and may be displayed in real-time. One example of an emerging video compression standard is the Moving Picture Experts Group ("MPEG") standard. Within the MPEG standard, video compression is defined both within a given picture and between pictures. Video compression within a picture is accomplished by conversion of the digital image from the spatial domain to the frequency domain by a discrete cosine transform, followed by quantization and variable length coding. Video compression between pictures is accomplished via a process referred to as motion estimation and compensation. Motion estimation covers a set of techniques used to extract the motion information from a video sequence. The process of motion estimation effectively reduces the temporal redundancy in successive video frames by exploiting the temporal correlation (similarities) that often exists between successive frames.
The MPEG syntax specifies how to represent the motion information: one or two motion vectors per 16×16 sub-block of the frame, depending on the type of motion compensation: forward predicted, backward predicted, or average. The MPEG draft, however, does not specify how such vectors are to be computed. Because of the block-based motion representation, block-matching techniques are likely to be used. Block matching generally involves determining the direction of translatory motion of the blocks of pixels from one frame to the next by computing a motion vector. The motion vector is obtained by minimizing a cost function measuring the mismatch between a block in a current frame and multiple predictor candidates (16×16 pixel blocks) from one or more reference frames.
The predictor candidates are contained within a user-specified search window in the reference video frame. The extent of the search window and the cost function are left entirely to the implementation. Exhaustive searches, in which all possible motion vectors are considered, are known to give good results, but at the expense of very high complexity for large ranges. Ignoring large search ranges, however, can significantly degrade the end result of the search in certain situations. Consider fast moving objects, such as those in race car sequences, where the translatory motion (displacement) of the objects from frame to frame is large. Block matching techniques using conventionally sized search windows would fail to capture the object in a reference frame because the object's displacement would place it outside the bounds of the search window in the reference frame. Block matching would yield an inferior motion vector in such a case.
A second situation in which the restrictions of a conventionally sized search window become apparent is that of encoding NTSC format video frames to HDTV format. In this case, the video frames of the HDTV format are extended by a factor of two in both the horizontal and vertical dimensions. The search window size, however, is fixed and as a consequence certain objects would necessarily fall outside the window in an HDTV reference frame as a result of the dimensionality differences. Therefore, there exists a need for an expanded search window, which does not require an inordinate amount of processing time to cover a larger search area in a reference frame thereby yielding more optimal motion vector results.
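By way of illustration only, the block-matching search discussed above may be sketched as follows. The sketch assumes a sum of absolute differences as the cost function (the cost function is left to the implementation) and uses a 4×4 block rather than a 16×16 block for brevity. The function names `sad` and `full_search`, and the small synthetic frames, are hypothetical and are not taken from any encoder.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, cx, cy, n, search_x, search_y):
    """Exhaustive search over displacements dx in [-search_x, search_x] and
    dy in [-search_y, search_y]. Returns (best_cost, (dx, dy)), or None if
    no candidate lies inside the reference frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_y, search_y + 1):
        for dx in range(-search_x, search_x + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


# Synthetic 24 x 24 frames (assumed data): one 4 x 4 object that sits at
# x=10 in the reference frame and at x=6 in the current frame, i.e. a
# horizontal displacement of 4 pixels between the two frames.
N = 4
ref = [[0] * 24 for _ in range(24)]
cur = [[0] * 24 for _ in range(24)]
for j in range(N):
    for i in range(N):
        ref[8 + j][10 + i] = 100 + 4 * j + i
        cur[8 + j][6 + i] = 100 + 4 * j + i

narrow = full_search(cur, ref, 6, 8, N, 2, 2)  # window too small for the motion
wide = full_search(cur, ref, 6, 8, N, 4, 4)    # window large enough
print("narrow window:", narrow)
print("wide window:  ", wide)
```

With the narrow window the true predictor lies outside the searched area, so only an imperfect match with a nonzero cost is found; the wide window reaches the exact match at a displacement of (4, 0). The number of candidates evaluated grows with the product of the horizontal and vertical ranges, which is the complexity concern noted above.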
It is therefore an object of the present invention to provide an apparatus which will expand the size of the search window utilized in the motion estimation function of a full function MPEG encoder beyond its design limitations through the use of multiple identical encoders.
It is another object of the present invention to minimize the number of Input/Output ports required to support the apparatus.
It is yet another object of the present invention to minimize the power consumption associated with the apparatus.
It is still a further object of the present invention to maximize the transparency of the additional functional units required to support the apparatus.
In order to attain the above objects, according to the present invention, there is provided an apparatus which utilizes multiple identical MPEG semiconductor IC encoders (hereinafter encoder) coupled in a series configuration.
In one aspect of the present invention the search window is extended in the horizontal or vertical direction by using two MPEG encoders coupled in series.
In a further aspect of the present invention, the search window is extended in both the horizontal and vertical directions by using four MPEG encoders coupled in series.
The allowable encoder configurations are one encoder, two encoders for extended horizontal searching, two encoders for extended vertical searching, and four encoders for extended horizontal and vertical searching. For each multiple encoder configuration (e.g. 2 or 4 encoders in series), one encoder is designated as a master encoder, with binary address 001, and provides all of the functionality required in a conventional single encoder configuration. Each additional encoder is referred to as a slave encoder and is separately identifiable via a three bit binary address. The binary address is required because the extended search window is subdivided in the multiple encoder configuration, and each subdivision must be allocated to the proper encoder for analysis.
In a typical operation a user would define, via file creation or similar means, 1) a search window size (e.g. by defining a horizontal and vertical pixel width), and 2) the number of encoders to use. When either a horizontal or vertical window extension is requested, beyond the extendible bounds of a single encoder configuration, two encoders are required to satisfy the request. Extending the window in both directions simultaneously requires four encoders.
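The relationship between a requested window size and the number of encoders may be sketched as follows. This is a minimal illustration only: the function name `select_configuration` and the single-encoder limits `max_h` and `max_v` are hypothetical parameters, since no numeric limits are stated here.

```python
def select_configuration(req_h, req_v, max_h, max_v):
    """Return (configuration_name, encoder_count) for a requested search
    window of req_h x req_v pixels, given hypothetical single-encoder
    limits max_h and max_v."""
    extend_h = req_h > max_h  # horizontal extension requested
    extend_v = req_v > max_v  # vertical extension requested
    if extend_h and extend_v:
        return ("both", 4)
    if extend_h:
        return ("horizontal", 2)
    if extend_v:
        return ("vertical", 2)
    return ("single", 1)


# Example with assumed single-encoder limits of 32 x 32 pixels.
for request in [(16, 16), (48, 16), (16, 48), (48, 48)]:
    print(request, select_configuration(*request, 32, 32))
```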
Regardless of which dimension or degree of extension is desired by a user, all searches are performed in parallel in the multiple encoder configurations by dividing the expanded search window in proportion to the number of encoders in use. As a result, overall encoder performance is not affected by searching a much larger search window. In addition, the utilization of multiple encoders remains transparent to all units on both the master encoder and the slave encoder(s).
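The division of the expanded search window among encoders may be sketched as follows. The sketch assumes that the window is cut into equal contiguous sub-windows along each extended dimension and that the per-encoder results are merged by taking the lowest cost; the actual subdivision and merging performed by the encoders may differ. All names (`split_range`, `partition_window`, `search_subwindow`, `parallel_search`) are hypothetical, and a thread pool merely stands in for the separate encoder ICs.

```python
from concurrent.futures import ThreadPoolExecutor

# Sub-window counts (horizontal, vertical) per configuration.
PARTS = {"single": (1, 1), "horizontal": (2, 1), "vertical": (1, 2), "both": (2, 2)}


def split_range(lo, hi, parts):
    """Split the inclusive integer range [lo, hi] into `parts` contiguous,
    near-equal inclusive sub-ranges."""
    total = hi - lo + 1
    bounds = [lo + (total * k) // parts for k in range(parts + 1)]
    return [(bounds[k], bounds[k + 1] - 1) for k in range(parts)]


def partition_window(search_x, search_y, config):
    """Return one (dx_min, dx_max, dy_min, dy_max) sub-window per encoder.
    The first entry is taken to belong to the master encoder."""
    parts_x, parts_y = PARTS[config]
    xs = split_range(-search_x, search_x, parts_x)
    ys = split_range(-search_y, search_y, parts_y)
    return [(x0, x1, y0, y1) for (y0, y1) in ys for (x0, x1) in xs]


def search_subwindow(cost_fn, sub):
    """Exhaustively search one sub-window; returns (cost, (dx, dy))."""
    x0, x1, y0, y1 = sub
    return min(
        (cost_fn(dx, dy), (dx, dy))
        for dy in range(y0, y1 + 1)
        for dx in range(x0, x1 + 1)
    )


def parallel_search(cost_fn, search_x, search_y, config):
    """Search all sub-windows concurrently and keep the overall best match."""
    subs = partition_window(search_x, search_y, config)
    with ThreadPoolExecutor(max_workers=len(subs)) as pool:
        results = list(pool.map(lambda s: search_subwindow(cost_fn, s), subs))
    return min(results)


def example_cost(dx, dy):
    """Toy mismatch measure whose minimum is at displacement (5, -3)."""
    return abs(dx - 5) + abs(dy + 3)


for config in PARTS:
    print(config, partition_window(8, 8, config), parallel_search(example_cost, 8, 8, config))
```

Because each sub-window is searched independently, every configuration returns the same best displacement as a single search over the whole window, while each encoder examines only its own share of the candidates.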
The various features of novelty which characterize the invention are pointed out with particularity in the claims annexed to and forming a part of the disclosure. For a better understanding of the invention, its operating advantages, and specific objects attained by its use, reference should be made to the drawings and descriptive matter in which there are illustrated and described preferred embodiments of the invention.