A video decoder receives encoded video data and decodes and/or decompresses it. The decoded video data comprises a series of pictures, each of which comprises a two-dimensional grid of pixels. A display device displays the pixels of each picture in real time at a constant rate. In contrast, the rate of decoding can vary considerably for different video data. Accordingly, the video decoder writes the decoded pictures to a frame buffer.
A display engine is synchronized with the display device and, among other things, provides the appropriate pixels from the frame buffer to the display device for display. The location of the appropriate pixels in the frame buffer depends on the manner in which the video decoder writes the pictures to the frame buffer.
The manner in which the video decoder writes a picture to the frame buffer is characterized by the packing of the luma and chroma pixels, whether the picture is stored linearly, and the spatial relationship between the luma and chroma pixels. These characteristics are usually determined by the original format of the source video data.
The luma and chroma pixels of a picture can either be stored together or separately. The chroma pixels include chroma red difference pixels Cr, and chroma blue difference pixels Cb. In macroblock format, the luma Y pixels are stored in one array, while both chroma pixels Cr/Cb are stored together in another array. In planar format, the luma pixels Y are stored in one array, the chroma Cr pixels are stored in a second array, and the chroma Cb pixels are stored in a third array. In packed YUV format, the luma pixels and both the chroma Cr/Cb pixels are stored together in a single array.
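The three storage arrangements above can be compared by listing the arrays each one uses. The following is a minimal sketch, not taken from any particular decoder: the function name `array_sizes`, the 8-bit sample size, and the 4:2:2 sampling (one Cb and one Cr sample for every two luma samples horizontally) are all assumptions made for illustration.

```python
def array_sizes(width, height, storage):
    """Return {array_name: size_in_bytes} for one stored picture.

    Assumptions (illustrative only): 8-bit samples and 4:2:2 sampling,
    i.e. one Cb and one Cr sample per two luma samples horizontally.
    """
    if width % 2:
        raise ValueError("width must be even for 4:2:2 sampling")
    luma = width * height
    chroma_each = (width // 2) * height
    if storage == "macroblock":
        # Luma in one array, Cr and Cb together in a second array.
        return {"Y": luma, "CbCr": 2 * chroma_each}
    if storage == "planar":
        # Luma, Cr, and Cb each in their own array.
        return {"Y": luma, "Cr": chroma_each, "Cb": chroma_each}
    if storage == "packed":
        # Luma and both chroma components together in a single array.
        return {"YCbCr": luma + 2 * chroma_each}
    raise ValueError(f"unknown storage: {storage}")
```

Under these assumptions the total storage is the same in all three cases; only the number of arrays, and therefore the number of base addresses the display engine must know, differs.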
In the packed YUV format, every other luma Y pixel is co-located in the horizontal direction with a pair of chroma pixels Cr and Cb. A picture in the packed YUV format can be divided into units of four pixels, each unit capable of being stored in a 32-bit word. The four pixels comprise two adjacent luma Y pixels and the chroma pixels Cr and Cb co-located with one of the luma Y pixels. The luma Y pixels and the chroma pixels Cr/Cb can be packed in any one of several pixel orders. Examples of pixel orders in which the luma Y pixels and chroma pixels Cr/Cb can be packed include Cb0/Y0/Cr0/Y1, Cr0/Y0/Cb0/Y1, Y0/Cb0/Y1/Cr0, and Y0/Cr0/Y1/Cb0. Additionally, in big endian order, the four bytes are stored in the 32-bit word as byte0/byte1/byte2/byte3. In little endian order, the four bytes are stored as byte3/byte2/byte1/byte0. Whether bytes are stored in big endian byte order or little endian byte order depends on the hardware characteristics of the frame buffer memory.
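The combined effect of pixel order and byte order can be sketched as follows. This is an illustrative example only: the function `pack_unit`, the order names in `PIXEL_ORDERS`, and the reading of byte0 as the most significant byte of the 32-bit word are assumptions, not details given above.

```python
import struct

# The four pixel orders listed above, keyed by hypothetical names.
PIXEL_ORDERS = {
    "CbYCrY": ("Cb0", "Y0", "Cr0", "Y1"),
    "CrYCbY": ("Cr0", "Y0", "Cb0", "Y1"),
    "YCbYCr": ("Y0", "Cb0", "Y1", "Cr0"),
    "YCrYCb": ("Y0", "Cr0", "Y1", "Cb0"),
}

def pack_unit(y0, y1, cb0, cr0, order, little_endian=False):
    """Pack one four-sample unit into the 4 bytes of a 32-bit word."""
    samples = {"Y0": y0, "Y1": y1, "Cb0": cb0, "Cr0": cr0}
    if any(not 0 <= v <= 255 for v in samples.values()):
        raise ValueError("samples must be 8-bit values")
    # byte0..byte3 follow the chosen pixel order.
    b = [samples[name] for name in PIXEL_ORDERS[order]]
    # Assumption: byte0 is the most significant byte of the word.
    word = (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3]
    # Big endian stores byte0 first; little endian stores byte3 first.
    return struct.pack("<I" if little_endian else ">I", word)
```

With four pixel orders and two byte orders, a display engine reading packed data would have to distinguish eight possible byte arrangements for the same unit.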
The video decoder does not necessarily store the picture in a linear manner. In the planar and packed YUV formats, the video decoder stores pictures in linear format, i.e., in left-to-right and top-to-bottom order in memory. However, in MPEG, DV-25, and TM5, pictures are stored in the frame buffer in a macroblock format. In the macroblock format, the pixels of the picture are divided into two-dimensional blocks. The video decoder stores the two-dimensional blocks in consecutive memory locations.
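The difference between the two arrangements shows up in how a pixel coordinate maps to a memory offset. The sketch below is a simplified illustration: the 16x16 block size, the raster ordering of blocks, the row-by-row ordering inside a block, and the function names are all assumptions, since actual decoders may tile memory differently.

```python
def linear_offset(x, y, width):
    """Offset of pixel (x, y) when rows are stored left to right, top to bottom."""
    return y * width + x

def macroblock_offset(x, y, width, block_w=16, block_h=16):
    """Offset of pixel (x, y) when each block occupies consecutive locations.

    Assumptions: blocks are stored in raster order, and pixels inside
    a block are stored row by row.
    """
    if width % block_w:
        raise ValueError("width must be a multiple of the block width")
    blocks_per_row = width // block_w
    block_index = (y // block_h) * blocks_per_row + (x // block_w)
    within_block = (y % block_h) * block_w + (x % block_w)
    return block_index * block_w * block_h + within_block
```

In the linear case, horizontally adjacent pixels of a line are always adjacent in memory. In the macroblock case, a display line crosses a block boundary every `block_w` pixels, so the pixels of one line are scattered across several blocks.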
Additionally, the spatial relationship of chroma pixels to luma pixels can differ among the many standards. Standards defining the spatial relationship of the chroma pixels to luma pixels include MPEG 4:2:0, MPEG 4:2:2, DV-25 4:2:0, and DV-25 4:1:1, to name a few. Where the standards for the display and the decoded video data differ, chroma pixels for the display can be interpolated from two or more chroma pixels in the decoded video data. The standard for the decoded video data is heavily dependent on the format of the source video data.
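Interpolating a display chroma pixel from two decoded chroma pixels can be sketched as a weighted average. This is a generic illustration, not the conversion defined by any of the standards named above: the function `interpolate_chroma` is hypothetical, and the weight that applies in practice depends on where each standard sites its chroma samples.

```python
def interpolate_chroma(c0, c1, num, den):
    """Weighted average of two chroma samples with round-to-nearest.

    The weight of c1 is num/den and the weight of c0 is (den - num)/den.
    Suitable weights depend on the source and display standards and are
    not specified here.
    """
    if den <= 0 or not 0 <= num <= den:
        raise ValueError("weight num/den must lie in [0, 1]")
    return ((den - num) * c0 + num * c1 + den // 2) // den
```

For example, `interpolate_chroma(100, 200, 1, 4)` weights the first sample by 3/4 and the second by 1/4 and returns 125.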
Conventionally, after each horizontal synchronization pulse, the host processor calculates the address of the first pixels of a line and the parameters for chroma format conversion. The host processor then programs the display engine with these values.
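The per-line work performed by the host in this conventional scheme can be sketched as below. The sketch assumes a simple case that is not stated above: linear storage, 8-bit samples, a luma array at `luma_base`, and an interleaved Cb/Cr array at `chroma_base` with one chroma row shared by every two luma rows (4:2:0). The function name and parameters are hypothetical, and the chroma conversion parameters are omitted.

```python
def line_programming(luma_base, chroma_base, width, height):
    """Yield (line, luma_addr, chroma_addr), one tuple per display line.

    Each yielded tuple stands for one programming of the display engine,
    performed after one horizontal synchronization pulse.
    """
    if width % 2:
        raise ValueError("width must be even")
    for line in range(height):
        luma_addr = luma_base + line * width
        # Assumption: 4:2:0, so two luma rows share one chroma row, and
        # an interleaved Cb/Cr row occupies `width` bytes.
        chroma_addr = chroma_base + (line // 2) * width
        yield line, luma_addr, chroma_addr
```

Under these assumptions, a picture of 480 lines requires the host to compute and program 480 sets of values per displayed picture.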
Programming the display engine at each horizontal synchronization pulse consumes considerable bandwidth from the host processor.
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with embodiments presented in the remainder of the present application with references to the drawings.