It is known in the prior art to encode and transmit multimedia content for distribution within a network. For example, video content may be encoded as MPEG video, wherein pixel domain data is converted into a frequency domain representation, quantized, entropy encoded, and placed into an MPEG stream format. The MPEG stream can then be transmitted to a client device, where it is decoded and returned to the spatial/pixel domain for display on a display device.
The encoding of the video may be spatial, temporal or a combination of both. Spatial encoding generally refers to the process of intra-frame encoding, wherein spatial redundancy of information is exploited to reduce the number of bits that represent a spatial location. Spatial data is converted into a frequency domain over a small region. In general, for small regions it is expected that the data will not change drastically, and therefore much of the information will be stored at the DC and low frequency components, with the higher frequency components being at or near zero. Thus, the lack of high frequency information in a small area is used to reduce the representative data size. Data may also be compressed using temporal redundancy. One method for exploiting temporal redundancy is through the calculation of motion vectors. Motion vectors establish how objects or pixels move between frames of video. For example, a ball may move between a first frame and a second frame by a number of pixels in a given direction. Once a motion vector is calculated, the information about the spatial relocation of the ball from the first frame to the second frame can be used to reduce the amount of information that is used to represent the motion in an encoded video sequence. Note that in practical applications the motion vector is rarely a perfect match, and an additional residual is sometimes used to compensate for the imperfect temporal reference.
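By way of illustration only, the concentration of information at the DC and low frequency components can be sketched with a one-dimensional transform over a single smooth row of eight samples. The sample values, the quantizer step of 16, and the function names dct_ii and quantize are assumptions chosen for this example; practical codecs apply a two-dimensional transform with codec-specific quantization.

```python
import math

def dct_ii(samples):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        # The DC term (k == 0) uses a different normalization factor.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(s * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, s in enumerate(samples))
        coeffs.append(scale * total)
    return coeffs

def quantize(coeffs, step):
    """Uniform quantization: divide by the step size and round."""
    return [int(round(c / step)) for c in coeffs]

# A smooth row of pixel values (illustrative data, slowly increasing).
row = [100, 102, 104, 106, 108, 110, 112, 114]
coeffs = dct_ii(row)
levels = quantize(coeffs, step=16)

print("coefficients:", [round(c, 2) for c in coeffs])
print("quantized:   ", levels)
```

For such a smooth row, only the first one or two quantized levels are nonzero, so an entropy coder has very little to represent.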
Motion vector calculation is a time consuming and processor intensive step in compressing video content. Typically, a motion search algorithm is employed to attempt to match elements within the video frames and to define motion vectors that point to the new location to which objects or portions of objects have moved. This motion search algorithm compares macroblocks (i.e., tries to find for each macroblock the optimal representation of that macroblock in past and future reference frames according to a certain criterion), and determines the vector that represents that temporal relation. The motion vector is subsequently used in the compression process (i.e., to minimize the residual that needs to be compressed). It would be beneficial if a mechanism existed that assists in the determination of these motion vectors.
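The cost of such a search can be appreciated from a minimal sketch of an exhaustive (full-search) block match. The sketch assumes the sum of absolute differences (SAD) as the matching criterion, a 4×4 block, a search range of ±4 pixels, and a synthetic "ball" pattern; the names make_frame, sad and full_search are hypothetical. The vector here is expressed as the displacement from the block in the current frame to its best match in the reference frame, which is one of several possible conventions.

```python
def make_frame(width, height, ball_x, ball_y):
    """Black frame with a textured 4x4 'ball' at (ball_x, ball_y)."""
    frame = [[0] * width for _ in range(height)]
    for y in range(4):
        for x in range(4):
            frame[ball_y + y][ball_x + x] = 50 + 10 * y + x
    return frame

def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a displaced candidate."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def full_search(cur, ref, bx, by, size, search_range):
    """Test every displacement in the window; return (cost, dx, dy) of the best."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# The ball moves 3 pixels right and 1 pixel down between the two frames.
ref = make_frame(16, 16, 4, 4)
cur = make_frame(16, 16, 7, 5)
result = full_search(cur, ref, bx=7, by=5, size=4, search_range=4)
print(result)  # (0, -3, -1): perfect match, 3 pixels left and 1 up in the reference
```

Even in this small example, a single block requires up to 81 candidate comparisons of 16 pixels each, which indicates why the search dominates encoding time at practical frame sizes and search ranges.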
As appreciated by those skilled in the art, another expensive component of the encoding process for more advanced codecs is the process of finding the optimal macroblock type, the partitioning of the macroblock and the weighting properties of the slice. H.264, for example, has four 16×16, nine 8×8 and nine 4×4 luma intra prediction modes and four 8×8 chroma intra prediction modes, and inter macroblocks can be partitioned from as coarse as 16×16 to as fine grained as 4×4. In addition, it is possible to assign a weight and an offset to the temporal references. A mechanism that defines or assists in finding these parameters would directly improve scalability.
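The nature of this parameter search can be sketched, under simplifying assumptions, as a trial of each candidate prediction mode followed by selection of the cheapest. The sketch below considers only three of the nine 4×4 luma intra prediction modes (vertical, horizontal and DC), assumes that both the top and left neighboring samples are available, and uses SAD as the cost; the names predict and choose_mode and the sample values are hypothetical. Practical encoders typically weigh distortion against bit rate rather than using SAD alone.

```python
def predict(mode, top, left):
    """Build a 4x4 prediction from the neighboring samples above and to the left."""
    if mode == "vertical":
        # Each column repeats the sample directly above it.
        return [list(top) for _ in range(4)]
    if mode == "horizontal":
        # Each row repeats the sample directly to its left.
        return [[left[y]] * 4 for y in range(4)]
    if mode == "dc":
        # Rounded mean of the eight neighboring samples.
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("unknown mode: " + mode)

def choose_mode(block, top, left, modes=("vertical", "horizontal", "dc")):
    """Try every candidate mode and keep the one with the lowest SAD."""
    best_mode, best_cost = None, None
    for mode in modes:
        pred = predict(mode, top, left)
        cost = sum(abs(block[y][x] - pred[y][x])
                   for y in range(4) for x in range(4))
        if best_cost is None or cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost

# A block of vertical stripes that continues the row of samples above it.
top = [10, 20, 30, 40]
left = [25, 25, 25, 25]
block = [[10, 20, 30, 40] for _ in range(4)]

decision = choose_mode(block, top, left)
print(decision)  # ('vertical', 0)
```

Repeating such a trial over every mode, every block size and every partitioning is what makes the decision expensive, and any mechanism that narrows the candidate set in advance reduces that cost proportionally.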