This invention relates to video compression, and more particularly to error recovery from macroblock location errors.
Multimedia-rich personal computers (PCs) and various other computing devices have delivered video feeds to users over the Internet. Processing of video bitstreams or feeds is among the most data-intensive of all common computing applications. Limited communication-line bandwidth constrains the quality of Internet video, which is often delivered in small on-screen windows with jerky movement.
To mitigate the problems of large video streams, various video-compression techniques have been deployed. Compression standards, such as those developed by the motion-picture-experts group (MPEG), have been widely adopted. These compression techniques are lossy techniques, since some of the picture information is discarded to increase the compression ratio. However, compression ratios of 99% or more have been achieved with minimal noticeable picture degradation.
Portable hand-held devices such as personal-digital-assistants and cellular telephones are widely seen today. Wireless services allow these devices to access data networks and even view portions of web pages. Currently the limited bandwidth of these wireless networks limits the web viewing experience to mostly text-based portions of web pages. However, future wireless networks are being planned that should have much higher data transmission rates, allowing graphics and even video to be transmitted to portable computing and communication devices.
Although proponents of these next-generation wireless networks believe that bandwidths will be high enough for high-quality video streams, the actual data rates delivered by wireless networks can be significantly lower than theoretical maximum rates, and can vary with conditions and local interference. Due to its high data requirements, video is likely to be the most sensitive service to any reduced data rates. Interference can cause intermittent dropped data over the wireless networks.
Next-generation compression standards have been developed for transmitting video over such wireless networks. The MPEG-4 standard provides a more robust compression technique for transmission over wireless networks. Recovery can occur when parts of the MPEG-4 bitstream are corrupted.
FIG. 1 shows that an image frame is divided into rows and columns of macroblocks. The MPEG standard uses a divide-and-conquer technique in which the video sequence is divided into individual image frames known as video object planes (VOPs), and each frame is divided into rows and columns of macroblocks. Each macroblock is a square of 16 by 16 pixels.
Various window sizes and image resolutions can be supported by MPEG standards. For example, one common format is an image frame of 176 by 144 pixels. The image frame is divided into 9 rows of macroblocks, with each row having 11 macroblocks. A total of 99 macroblocks are contained in each frame.
The macroblocks are arranged in a predetermined order, starting in the upper left with the first macroblock (MB#1). The second macroblock, MB#2, is to the right of MB#1 in the first row, followed by MB#3 to MB#11 in the first row. The second row contains MB#12 to MB#22. The last row contains MB#89 to MB#99. Of course, other image sizes and formats can have the macroblocks in rows of various lengths, and various numbers of rows.
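The macroblock arithmetic above can be checked with a short sketch. The function names `macroblock_grid` and `macroblock_position` are hypothetical, chosen only for illustration, and the sketch assumes frame dimensions that are exact multiples of 16 pixels, as in the 176 by 144 example.

```python
MB_SIZE = 16  # macroblock edge length in pixels, as stated in the text


def macroblock_grid(width, height):
    """Return (columns, rows, total) of macroblocks for a frame size.

    Assumption: both dimensions are exact multiples of 16 pixels.
    """
    if width % MB_SIZE or height % MB_SIZE:
        raise ValueError("frame size must be a multiple of 16 in this sketch")
    cols = width // MB_SIZE
    rows = height // MB_SIZE
    return cols, rows, cols * rows


def macroblock_position(mb_number, cols):
    """Map a 1-based macroblock number to a 1-based (row, column) pair."""
    row, col = divmod(mb_number - 1, cols)
    return row + 1, col + 1


cols, rows, total = macroblock_grid(176, 144)
print(cols, rows, total)               # 11 9 99
print(macroblock_position(12, cols))   # (2, 1): first macroblock of row 2
print(macroblock_position(99, cols))   # (9, 11): last macroblock of row 9
```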
When an image frame is encoded, each macroblock is encoded in order, starting with MB#1 in the first row, and continuing with MB#2 to MB#11 in the first row, then MB#12 to MB#22 in the second row, and on until the last row with MB#89 to MB#99. The macroblocks are arranged in the bitstream into one or more video packets (VP). Each video packet contains a header, allowing for some error recovery to occur. For example, the number of the next macroblock in the frame, which is the first macroblock in the new video packet, is included in the VP header.
FIG. 2 shows an MPEG-4 bitstream that is composed of video object planes and video packets. The video is sent as a series of picture frames known as video object planes (VOP). These picture frames are replaced at a fixed rate, such as every 30 milliseconds, to give the illusion of picture movement. Rather than transmit every pixel on each line, the picture is divided into macroblocks and compressed by searching for similar macroblocks in earlier or later frames. The macroblock can then be replaced with a motion vector or pixel changes.
Video object planes VOP 10, 12 are two frames in a sequence of many frames that form a video stream. Pixel data in these planes are compressed using macroblock-compression techniques that are well-known and defined by the MPEG-4 standard. The compressed picture data is divided into several video packets (VP) for each video object plane VOP.
Each video object plane begins with a VOP start code, such as VOP start code 20, which begins VOP #1 (10), and VOP start code 21, which begins VOP #2 (12). First video object plane VOP 10 has VOP header 22 that follows VOP start code 20, and data field 24 which contains the beginning of the picture data for VOP 10. After a predetermined number of macroblocks or amount of data, such as 100 to 1000 bits, a new video packet begins with resync marker 30 and VP header 32. Data field 34 continues with the picture data for several more macroblocks of VOP 10. Other video packets follow, each beginning with a resync marker and VP header, followed by a data field with more macroblocks of picture data for VOP 10. The last video packet VP #N in VOP 10 begins with resync marker 31 and VP header 33, and is followed by the final macroblocks for VOP 10, in data field 35.
The second video object plane VOP 12 begins with VOP start code 21 and VOP header 23, and is followed by data field 25, which has the first macroblocks of picture data for the second picture frame, VOP 12. Other video packets follow for VOP 12 with the other macroblocks.
The VOP headers include a VOP coding type (I for Intra-coded, no prediction, P for prediction from Previous VOP, B for Bi-directional prediction from previous and next VOPs), VOP time, rounding type, quantization scale, f_code, while the VP headers include a macroblock number for the first macroblock in the packet, quantization scale, VOP coding type and time. The headers can include other information as well.
The VOP start codes and VP resync markers contain unique bit patterns that do not occur in the headers or data fields. The start code begins with a string of 23 zero bits. The picture data in the macroblock data fields are encoded so that they never have such a long string or run of zero bits. Likewise, the headers do not have such a long run of zero bits. Thus the start code is unique within the video bitstream, allowing a bitstream decoder to easily detect the start code.
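The uniqueness property described above can be illustrated with a minimal sketch. It models the bitstream as a string of '0' and '1' characters purely for readability, whereas an actual decoder would typically scan packed bytes. The name `find_zero_run` is hypothetical, and the run length of 23 is the value quoted in the text.

```python
START_CODE_ZERO_RUN = 23  # length of the zero run quoted in the text


def find_zero_run(bits, run_length=START_CODE_ZERO_RUN):
    """Return the index of the first run of `run_length` zero bits, or -1.

    `bits` is a string of '0' and '1' characters (an illustrative model,
    not the packed-byte form a real decoder would scan).
    """
    return bits.find("0" * run_length)


# Payload whose longest zero run is 22 bits, so it cannot mimic a start code.
payload = "1011" + "0" * 22 + "1"
print(find_zero_run(payload))          # -1

# The same payload followed by a 23-zero run marking a start code.
stream = payload + "0" * 23 + "1" + "10110110"
print(find_zero_run(stream))           # 27, the length of the payload
```

Because headers and data never contain a run this long, the first match is enough to locate the start code in this model.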
FIG. 3 is a flowchart for macroblock counting and decoding. For each VOP or frame, the MPEG decoder decodes one macroblock after another in order. The macroblocks do not themselves contain a macroblock number or identifier. Instead, a macroblock counter in the decoder is incremented for each macroblock processed. The macroblock counter thus keeps track of which macroblock is being decoded.
Each video packet header also contains a macroblock number for the next macroblock in the new packet. This header macroblock number can be compared to the macroblock counter as a check. The first video packet in a frame has a VOP header rather than a VP header. The macroblock counter is reset for each new VOP, so a macroblock number is not needed for the first packet of each frame.
A new video packet N is detected when a resync marker is found in the bitstream, step 72. The VP header that follows the resync marker is decoded, step 74. This header contains the header macroblock number, which is the number for the first macroblock in the new video packet N. When the video packet is the first video packet in a frame, the header macroblock number is implied to be one, the same value as the reset or initialized macroblock counter.
The decoder compares the header macroblock number that was just read from the VP header to the macroblock counter value, step 76. Most of the time, the values match. Decoding of all the macroblocks in the video packet can proceed, step 80. The macroblock counter is incremented for each macroblock processed.
When an error is detected while decoding the macroblocks in video packet N, step 82, the decoder may conceal the error, step 84. Errors can be concealed by using pixels from a previous frame rather than the pixels, error terms, or motion vectors encoded with the macroblocks in the video packet.
When no decoding errors are detected, and all the macroblocks in packet N have been processed, then decoding can continue with the next video packet, step 72.
Occasionally, the header macroblock number does not match the macroblock counter, step 76. Some kind of error has occurred. When the previous video packet N-1 had a decoding error, step 78, then the error may have caused the macroblock counter to get off count. The header macroblock number is assumed to be correct and the macroblock counter wrong. The header macroblock number in the new video packet N is used to over-write or update the macroblock counter. The header macroblock number is thus loaded into the macroblock counter, step 88. The macroblock counter is then incremented for each macroblock decoded in the current video packet N, step 80.
When the previous packet N-1 did not have a detected decoding error, step 78, then the macroblock counter is probably correct. Perhaps the error is in the new VP header, causing the header macroblock number to be read incorrectly. The header macroblock number is ignored or discarded, while the macroblock counter is used without any update, step 79. The macroblock counter is then incremented for each macroblock decoded in the current video packet N, step 80.
Another error may be detected when decoding the macroblocks in video packet N, step 82. The decoder may conceal the error, step 84, by using pixels from a previous frame rather than the pixels, error terms, or motion vectors encoded with the macroblocks in the current video packet N.
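The mismatch handling in the flowchart of FIG. 3 reduces to a small decision rule, sketched below. The function name `reconcile_counter` and its parameters are hypothetical. The step numbers in the comments refer to the flowchart as described in the text, and the sketch covers only the choice between the two values, not macroblock decoding or error concealment.

```python
def reconcile_counter(mb_counter, header_mb_number, prev_packet_had_error):
    """Return the number to use for the first macroblock of a new packet.

    mb_counter            -- the decoder's running macroblock counter
    header_mb_number      -- the number read from the new VP header
    prev_packet_had_error -- True if a decoding error was detected in
                             the previous video packet
    """
    if header_mb_number == mb_counter:
        return mb_counter          # step 76: values match, nothing to fix
    if prev_packet_had_error:
        return header_mb_number    # step 88: load header number into counter
    return mb_counter              # step 79: ignore the header number


print(reconcile_counter(53, 53, False))  # 53, no mismatch
print(reconcile_counter(48, 53, True))   # 53, header trusted after an error
print(reconcile_counter(81, 33, False))  # 81, counter trusted
```

The rule depends entirely on whether the earlier error was detected, so an undetected error leaves the decoder trusting a counter that may be wrong.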
FIG. 4A shows recovery from a bit error in the header macroblock number. When the bitstream is transmitted over a wireless network, some corruption of the data is possible. In this example, a bit error occurs in the bitstream in VP header 33, causing an incorrect header macroblock number to be read.
The VOP frame begins with start code 20 and VOP header 22. The macroblock counter is reset to 1 (or 0 if the first MB is considered to be MB#0) at the start of each new VOP. The macroblock counter is incremented for each macroblock in data field 24. The counter can be pre- or post-incremented with each macroblock, depending on the exact counter timing used and any pipelining. As the last macroblock (MB#52) in data field 24 is processed, the macroblock counter reads 52. The macroblock counter is then incremented to 53 before the next macroblock (MB#53) in data field 34 is processed.
The next resync marker 30 begins video packet #2. VP header 32 following resync marker 30 is decoded. VP header 32 contains the header macroblock number 53, since the first macroblock in this packet is MB#53. Since the macroblock counter and the header macroblock number match, no error is detected and data processing resumes with data field 34 in the second video packet. The macroblock counter is incremented for each macroblock in data field 34, from MB#53 to MB#80.
Resync marker 31 is detected for the third video packet. VP header 33 is decoded, and the new header macroblock number is read and compared to the macroblock counter value. The macroblock counter is pre-incremented from 80 to 81, ready for MB#81 as the next macroblock.
However, a bit error occurs in VP header 33. The header macroblock number is corrupted so that the wrong value is read. Although the MPEG encoder wrote the correct value 81 to VP header 33, the bit error caused a different number, such as 33 or 75, to be read. This wrong header macroblock number does not match the macroblock counter value of 81.
Since no error was detected in the previous video packet (#2), the macroblock counter is assumed to be correct and the header macroblock number is assumed to be wrong. The header macroblock number is ignored and the macroblock counter is used for identifying the macroblocks in data field 35. Data processing continues with macroblocks MB#81 to MB#99 in data field 35. The macroblock counter is incremented for each macroblock in data field 35.
Correctly guessing that the header macroblock number was wrong reduces the amount of lost data when a bitstream error occurs. Decoding of macroblocks can proceed without any error when the only error was the header macroblock number.
Unfortunately, an incorrect guess for other kinds of bit errors is more difficult to recover from. FIG. 4B shows an undetected bit error in the macroblock data that corrupts the macroblock counter. Although start code 20 and VOP header 22 for VOP #1 are detected and decoded, a bit error in macroblock data in data field 24 occurs.
Some of the early macroblocks MB#1, MB#2 . . . in data field 24 are properly decoded, but an error occurs before the last macroblock MB#52 in data field 24 is processed. This error is not detected, but the start of some of the macroblocks in data field 24 is not recognized. The decoder combines several macroblocks of data together into a single corrupted macroblock.
Since some of the macroblocks in data field 24 are not recognized, the macroblock counter is not incremented enough times. The macroblock counter is only incremented for recognized macroblocks. In this example, the macroblock counter is off by 5, reading 47 for the last macroblock MB#52 in data field 24.
Resync marker 30 for the second video packet (VP) is detected, and VP header 32 is decoded. A correct value of 53 is read from VP header 32 as the header macroblock number.
The macroblock counter is pre-incremented from 47 to 48, but does not match the header macroblock number of 53. An error is detected. However, since the bit error in data field 24 was not detected, the macroblock counter is assumed to be correct and the header macroblock number is assumed to be wrong and is ignored. This is a decoder mistake.
Macroblocks MB#53 to MB#80 in data field 34 are decoded, but are interpreted as macroblocks MB#48 to MB#75. The macroblock counter is incremented for each macroblock, but starts out the video packet with the incorrect value of 48.
Resync marker 31 is detected, and VP #3 header 33 is decoded. The header macroblock number of 81 is read from VP header 33, but the macroblock counter reads 76 once it is pre-incremented. Since no error was detected in the previous video packet #2, the decoder assumes that the macroblock counter is correct and the header macroblock number is wrong. This is an incorrect assumption by the decoder.
Macroblock MB#81 is interpreted by the decoder as MB#76, since the macroblock counter reads 76. Macroblocks MB#82 to MB#99 are interpreted incorrectly as macroblocks #77 to #94. The decoder finally recovers when the macroblock counter is reset by the start code for the next VOP #2.
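The numbers in the FIG. 4B example can be reproduced with a small simulation, offered as a sketch under simplifying assumptions. The `Packet` class and `decode_vop` function are hypothetical names. An undetected data error is modeled only as a count of macroblock starts the decoder fails to recognize, and the counter is taken to hold the number of the next macroblock to be decoded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    header_mb: Optional[int]      # number read from the VP header;
                                  # None for the first packet of a VOP
    true_first: int               # real number of the first macroblock carried
    true_last: int                # real number of the last macroblock carried
    missed: int = 0               # macroblock starts the decoder fails to see
    error_detected: bool = False  # whether the decoder noticed an error


def decode_vop(packets):
    """Return, per packet, the (first, last) numbers the decoder assigns."""
    counter = 1                   # reset by the VOP start code
    prev_error = False
    assigned = []
    for p in packets:
        mismatch = p.header_mb is not None and p.header_mb != counter
        if mismatch and prev_error:
            counter = p.header_mb # trust the header after a detected error
        # Otherwise the counter is kept, even when the values disagree.
        recognized = p.true_last - p.true_first + 1 - p.missed
        assigned.append((counter, counter + recognized - 1))
        counter += recognized
        prev_error = p.error_detected
    return assigned


fig_4b = [
    Packet(None, 1, 52, missed=5),  # undetected error hides 5 macroblocks
    Packet(53, 53, 80),
    Packet(81, 81, 99),
]
print(decode_vop(fig_4b))           # [(1, 47), (48, 75), (76, 94)]
```

If the first packet is instead given `error_detected=True`, the same function loads 53 from the second VP header and returns `[(1, 47), (53, 80), (81, 99)]`, so the misnumbering no longer spreads into the later packets.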
What is desired is a bitstream decoder that more accurately responds to mis-matches in the macroblock number. A robust decoder is desired that can more quickly recover from bitstream errors. An MPEG-4 decoder that can recover from a corrupted bitstream and mis-matching macroblock number is desirable to minimize loss of picture data. An MPEG-4 decoder that can more accurately choose either the header macroblock number or the macroblock counter value when a mismatch occurs is desired.