Moving Picture Experts Group (MPEG) is an International Organization for Standardization (ISO) standard for compressing video data. Video compression is important in making video data files, such as full-length movies, more manageable for storage (e.g., in optical storage media), processing, and transmission. In general, MPEG compression is achieved by eliminating redundant and irrelevant information. Because video images typically consist of smooth regions of color across the screen, video information generally varies little in space and time. As such, a significant part of the video information in an image is predictable and therefore redundant. Hence, a first objective in MPEG compression is to remove the redundant information, leaving only the true or unpredictable information. Irrelevant video image information, on the other hand, is information that cannot be seen by the human eye under certain reasonable viewing conditions. For example, the human eye is less sensitive to noise at high spatial frequencies than to noise at low spatial frequencies, and less sensitive to loss of detail immediately before and after a scene change. Accordingly, the second objective in MPEG compression is to remove irrelevant information. The combination of redundant information removal and irrelevant information removal allows for highly compressed video data files.
MPEG compression incorporates various well-known techniques to achieve the above objectives, including motion-compensated prediction, Discrete Cosine Transform (DCT), quantization, and Variable-Length Coding (VLC). DCT is an algorithm that converts pixel data into sets of spatial frequencies with associated coefficients. The DCT coefficients are non-uniformly distributed: most of the non-zero DCT coefficients of an image tend to be located in a general area. VLC exploits this distribution characteristic to distinguish non-zero DCT coefficients from zero DCT coefficients. In so doing, redundant/predictable information can be removed. Additionally, because DCT decomposes the video image into spatial frequencies, the DCT coefficients associated with higher frequencies can be coded with less precision than those associated with lower frequencies, thereby allowing irrelevant information to be removed. Hence, quantization may be generalized as a step that weights the DCT coefficients based on the amount of noise that the human eye can tolerate at each spatial frequency so that a reduced set of coefficients can be generated.
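The effect of DCT followed by frequency-weighted quantization can be illustrated with a minimal sketch. The step-size rule used here (`base_step + slope * (u + v)`) and the function names are hypothetical and chosen only for illustration; they are not a quantization matrix or routine taken from any MPEG standard.

```python
import math

N = 8  # block size commonly used by DCT-based video codecs


def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of an N x N block of pixel values."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def quantize(coeffs, base_step=8, slope=4):
    """Divide each coefficient by a step that grows with spatial frequency.

    The step rule is an assumption made for this sketch: higher
    frequencies get a coarser step, so they are coded less precisely.
    """
    return [[int(round(coeffs[u][v] / (base_step + slope * (u + v))))
             for v in range(N)] for u in range(N)]


# A smooth block: a gentle ramp, typical of a flat image region.
block = [[100 + 2 * y for y in range(N)] for x in range(N)]
levels = quantize(dct_2d(block))
nonzero = sum(1 for row in levels for c in row if c != 0)
print(nonzero, "of", N * N, "quantized coefficients are non-zero")
```

For a smooth block such as this one, only a few low-frequency coefficients survive quantization, which is the sparse distribution that VLC then exploits.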
Compressed video data is vulnerable to transmission errors. MPEG-4 offers error resilience tools to localize the effects of errors, re-establish synchronization, and recover erroneous data. The end result is more reliable data transmission. These tools include data partitioning, packetization, and reversible VLC (RVLC). Data partitioning is designed to localize and isolate the effects of errors by separating and partitioning motion and shape data from texture data in a video packet. The data partition mode utilizes DC markers (for intra-frames) and motion markers (for inter-frames) to achieve these objectives. The data partition mode also involves a different way to code the coefficients. A video packet is made up of one or several macroblocks. A frame (a.k.a. Video Object Plane, or VOP, in MPEG-4 terminology) may consist of zero, one, or several packets. Each packet starts with markers and the packet header. The data in each packet are encoded independently of other packets. Data partition mode in MPEG-4 requires the data in any packet to be divided into three parts. Each part consists of bitstream components from all macroblocks in the packet. In data partition mode, the packet size (i.e., the number of data bits in the packet) is limited to 2048 bits for a simple profile level-1 video bitstream, 4096 bits for a simple profile level-2 video bitstream, and 8192 bits for a simple profile level-3 video bitstream.
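The packet size limits stated above can be captured in a small lookup, together with one possible way of grouping macroblocks into packets that respect the limit. This is a sketch only: the greedy grouping rule, the `overhead_bits` parameter, and the function name are assumptions introduced for illustration and are not prescribed by the text.

```python
# Packet size limits (in bits) for data partition mode, per simple profile level.
MAX_PACKET_BITS = {1: 2048, 2: 4096, 3: 8192}


def group_macroblocks(mb_bits, level, overhead_bits=0):
    """Greedily group macroblock indices into packets within the level's limit.

    mb_bits       -- coded size in bits of each macroblock, in order
    level         -- simple profile level (1, 2, or 3)
    overhead_bits -- hypothetical per-packet cost of markers and header
    """
    limit = MAX_PACKET_BITS[level]
    packets, current, used = [], [], overhead_bits
    for index, bits in enumerate(mb_bits):
        if overhead_bits + bits > limit:
            raise ValueError(f"macroblock {index} cannot fit in any packet")
        if current and used + bits > limit:
            packets.append(current)          # close the full packet
            current, used = [], overhead_bits
        current.append(index)
        used += bits
    if current:
        packets.append(current)
    return packets


print(group_macroblocks([900, 900, 900, 300], level=1))
print(group_macroblocks([900, 900, 900, 300], level=2))
```

At level 1 the four macroblocks in the example need two packets, while at level 2 they fit in one.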
Video packetization mode utilizes a Resynchronization Marker (RSM) and a Header Extension Code (HEC) before the first macroblock during encoding. When data is corrupted or damaged, the non-recoverable data can be localized during the decoding process and discarded until the next RSM. In the event the VOP code is corrupted, HEC provides additional information that enables the decoder to determine to which VOP a resync packet belongs. RVLC mode requires texture data to be decodable in both the forward and reverse directions, thereby enabling the decoder to better localize an error between two RSMs. This is achieved through the use of a prefix property (the same as in regular VLC) and a suffix property.
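The prefix and suffix properties can be made concrete with a short sketch: a code is decodable forward if no codeword is a prefix of another, and decodable backward if no codeword is a suffix of another. The four-entry codebook below is a toy example invented for this illustration; it is not the RVLC table defined by MPEG-4.

```python
def is_prefix_free(codes):
    """True if no codeword is a prefix of a different codeword."""
    return not any(a != b and b.startswith(a) for a in codes for b in codes)


def is_suffix_free(codes):
    """True if no codeword is a suffix of a different codeword."""
    return not any(a != b and b.endswith(a) for a in codes for b in codes)


def decode_forward(bits, codebook):
    """Decode a bit string left to right using a prefix-free codebook."""
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in codebook:
            symbols.append(codebook[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return symbols


def decode_backward(bits, codebook):
    """Decode right to left; this relies on the suffix property."""
    reversed_book = {code[::-1]: sym for code, sym in codebook.items()}
    return decode_forward(bits[::-1], reversed_book)[::-1]


# Toy reversible codebook (hypothetical, for illustration only).
RVLC_TOY = {"0": "a", "101": "b", "111": "c", "1001": "d"}
# An ordinary VLC that is prefix-free but not suffix-free.
VLC_TOY = {"0": "a", "10": "b", "11": "c"}

bits = "0" + "101" + "111" + "1001" + "0"
print(decode_forward(bits, RVLC_TOY))
print(decode_backward(bits, RVLC_TOY))
print(is_suffix_free(VLC_TOY))
```

Because the toy RVLC codebook satisfies both properties, decoding from either end yields the same symbol sequence, so a decoder can work inward from both RSMs toward the damaged region.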
Under MPEG-4, there are different bit packing formats for output VLC data. In the bypass mode, data is encoded only at the macroblock layer. Hence, data is formatted such that a macroblock header precedes the macroblock data. FIG. 1A illustrates as an example the bypass bit packing format, wherein MB0hdr is the header associated with macroblock 0, MB0data is the data associated with macroblock 0, MB1hdr is the header associated with macroblock 1, MB1data is the data associated with macroblock 1, and so on. In the VLC mode with no data partition, data is formatted as illustrated in FIG. 1B. As shown in FIG. 1B, a frame header is at the beginning, followed by MB0hdr1, the header associated with macroblock 0; MB0hdr2, the motion vector data associated with macroblock 0 (which is needed if an inter-macroblock is involved); and MB0data, the data/texture associated with macroblock 0. The pattern repeats for subsequent macroblocks. At some point in the bitstream data (e.g., after macroblock 7), a new data packet begins with a packet header, which is followed by MB8hdr1, the header associated with macroblock 8; MB8hdr2, the motion vector data associated with macroblock 8 (which is needed if an inter-macroblock is involved); and MB8data, the data/texture associated with macroblock 8. The pattern described above is then repeated.
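The two orderings described for FIG. 1A and FIG. 1B can be sketched as lists of component labels. The label strings follow the names used in the description; the fixed `mbs_per_packet` parameter is a simplifying assumption for this sketch, since an actual encoder would start a new packet based on the number of bits produced rather than on a fixed macroblock count.

```python
def bypass_layout(num_mbs):
    """Bypass mode: each macroblock header precedes its macroblock data."""
    layout = []
    for mb in range(num_mbs):
        layout += [f"MB{mb}hdr", f"MB{mb}data"]
    return layout


def vlc_layout(num_mbs, mbs_per_packet):
    """VLC mode without data partition.

    A frame header comes first; every later packet starts with a packet
    header. mbs_per_packet is a hypothetical fixed packet length.
    """
    layout = ["FrameHdr"]
    for mb in range(num_mbs):
        if mb > 0 and mb % mbs_per_packet == 0:
            layout.append("PacketHdr")
        layout += [f"MB{mb}hdr1", f"MB{mb}hdr2", f"MB{mb}data"]
    return layout


print(bypass_layout(2))
print(vlc_layout(10, mbs_per_packet=8)[22:29])
```

With eight macroblocks per packet, the packet header appears after the components of macroblock 7 and before MB8hdr1, matching the example in the description.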
In the VLC mode with data partition, data can be formatted in three different ways, as illustrated in FIGS. 1C-1E. In the first format, which is designed to accommodate an intra-macroblock in an I-frame, the six DC coefficients for the different blocks in a macroblock are included together with header data 1. More particularly, as shown in FIG. 1C, a frame header is at the beginning, followed by header data 1 with the DC coefficients, header data 2 (the motion vector data associated with the present macroblock, which is needed if an inter-macroblock is involved), and finally the texture (macroblock) data. The pattern repeats for subsequent macroblocks. Since data partition is involved, a DC marker is typically inserted between header data 1 with the DC coefficients and header data 2 if the macroblock type is intra. A motion marker is inserted if the macroblock type is inter. At some point in the bitstream data (e.g., after macroblock 3), a new data packet begins with a packet header, which is followed by header data 1 with the DC coefficients, header data 2, and the macroblock data. The pattern described above is then repeated.
In the second format, which is designed to accommodate an intra-macroblock in a P-frame, the six DC coefficients for the different blocks in a macroblock are included together with header data 2. More particularly, as shown in FIG. 1D, a frame header is at the beginning, followed by header data 1, header data 2 with the DC coefficients, and finally the texture (macroblock) data. The pattern repeats for subsequent macroblocks. Header data 2 is the motion vector data associated with the present macroblock (which is needed if an inter-macroblock is involved). Since data partition is involved, a motion marker is typically inserted between header data 2 with the DC coefficients and the texture data if the macroblock type is inter. A DC marker is inserted if the macroblock type is intra. At some point in the bitstream data (e.g., after macroblock 3), a new data packet begins with a packet header, which is followed by header data 1, header data 2 with the DC coefficients, and the macroblock data.
In the third format, which is designed to accommodate an inter-macroblock in a P-frame, the six DC coefficients for the different blocks in a macroblock are included together with the texture (macroblock) data. More particularly, as shown in FIG. 1E, a frame header is at the beginning, followed by header data 1, header data 2, and the texture (macroblock) data with the DC coefficients. The pattern repeats for subsequent macroblocks. Header data 2 is the motion vector data associated with the present macroblock (which is needed if an inter-macroblock is involved). Since data partition is involved, a motion marker is typically inserted between the texture data with the DC coefficients and the next section of the sequence if the macroblock type is inter. A DC marker is inserted if the macroblock type is intra. At some point in the bitstream data (e.g., after macroblock 3), a new data packet begins with a packet header, which is followed by header data 1, header data 2, and the macroblock data with the DC coefficients. The bit packing formats for the RVLC mode with and without data partition are identical to those described earlier for the VLC mode with and without data partition, respectively.
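The three data-partitioned formats differ mainly in which part carries the DC coefficients. The sketch below is one reading of the description rather than a rendering of the MPEG-4 bitstream syntax. It assumes that each part gathers the components of all macroblocks in the packet, and that the marker directly follows the part carrying the DC coefficients. The format keys and label strings are hypothetical names introduced for this example.

```python
# Index of the part that carries the DC coefficients in each format
# (0 = header data 1, 1 = header data 2, 2 = texture data).
DC_PART = {"I-frame intra": 0, "P-frame intra": 1, "P-frame inter": 2}
PART_NAMES = ("hdr1", "hdr2", "texture")


def partitioned_packet(first_mb, num_mbs, fmt, first_in_frame=False):
    """List the components of one data-partitioned packet in output order.

    Assumption of this sketch: the marker is emitted right after the
    part that carries the DC coefficients, and its kind depends on
    whether the macroblock type is intra or inter.
    """
    marker = "MotionMarker" if fmt.endswith("inter") else "DCMarker"
    layout = ["FrameHdr" if first_in_frame else "PacketHdr"]
    for part, name in enumerate(PART_NAMES):
        carries_dc = part == DC_PART[fmt]
        for mb in range(first_mb, first_mb + num_mbs):
            layout.append(f"MB{mb}{name}" + ("+DC" if carries_dc else ""))
        if carries_dc:
            layout.append(marker)
    return layout


for fmt in DC_PART:
    print(fmt, partitioned_packet(0, 2, fmt, first_in_frame=True))
```

Keeping the format-specific detail in a single table means the same assembly loop serves all three formats, which is also why a packer that supports them must be able to reorder components that the encoder produces in a different sequence.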
Wireless data transmission standards such as H.263 have bit packing formats substantially similar to those described above for the VLC and RVLC modes with data partition. The only difference is that packet headers for the H.263 standard are shorter than those for MPEG-4.
Conventionally, to perform bit packing in different formats such as those described earlier, a VLC memory is required to store the output VLC data components (e.g., header data, texture data, etc.). The VLC data components are stored in the VLC memory as they are received, with headers and motion vectors arriving out of sequence relative to the associated texture data. The memory interface unit selectively accesses the appropriate data stored in the VLC memory one component at a time and then writes it into a different location of the same VLC memory, which is used to stitch the data together according to the required format. Hence, this different memory location stores the data as it is being packed, assembled, and/or formatted at various phases of completion. At completion, the packed and/or formatted data is read and written to the desired destination (e.g., a memory). This approach is not desirable because of the large VLC memory required to store both the VLC data and the partially assembled/packed data at different stages, as well as the intensive processing power required to read and write data components from/to the VLC memory during the assembling/formatting process. Moreover, the additional read and write operations required for output at completion consume further valuable computing resources. Furthermore, the above approach requires a great deal of synchronization because the data components are generated and/or updated at different times.
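The cost of the conventional approach can be illustrated with a minimal software model that counts memory operations per component. The function name, the counting granularity (one operation per component rather than per memory word), and the byte values are assumptions made for this sketch; an actual hardware implementation would differ.

```python
def conventional_pack(components, order):
    """Model of the conventional two-stage packing flow.

    components -- dict of name -> bytes, as held in VLC memory
                  (stored in arrival order, not output order)
    order      -- component names in the sequence the format requires
    Returns (output, reads, writes, peak_bytes).
    """
    reads = writes = 0
    staging = bytearray()            # second region of the same VLC memory
    for name in order:
        chunk = components[name]     # read one component from VLC memory
        reads += 1
        staging += chunk             # write it into the staging region
        writes += 1
    output = bytes(staging)          # read the assembled data back ...
    reads += 1
    writes += 1                      # ... and write it to the destination
    peak_bytes = sum(len(c) for c in components.values()) + len(staging)
    return output, reads, writes, peak_bytes


# Texture data arrives before its headers (hypothetical byte values).
stored = {"MB0data": b"\x03", "MB0hdr1": b"\x01", "MB0hdr2": b"\x02"}
print(conventional_pack(stored, ["MB0hdr1", "MB0hdr2", "MB0data"]))
```

In this model the staging copy doubles the peak memory footprint, and every component is both read and written once before the final output transfer adds a further read and write.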
Thus, a need exists for a method and apparatus that pack VLC video data in different formats while requiring less memory, fewer processing resources, and less synchronization.