This invention relates to video compression systems, and more particularly to error detection in macroblock motion vectors.
Delivery of video over wireless networks is receiving much interest as a key application for future wireless and handheld devices. For several years now, personal computers (PCs) and various other computing devices have delivered video to users over the Internet. However, processing of video bitstreams or feeds is quite data-intensive. Limited communication-line bandwidth can reduce the quality of Internet video, which is often delivered in small on-screen windows with jerky movement.
To mitigate the problems of large video streams, various video-compression techniques have been deployed. Compression standards, such as those developed by the Moving Picture Experts Group (MPEG), have been widely adopted. These compression techniques are lossy, since some of the picture information is discarded to increase the compression ratio. However, compression ratios of 99% or more have been achieved with minimal noticeable picture degradation.
Portable hand-held devices such as personal-digital-assistants and cellular telephones are widely seen today. Wireless services allow these devices to access data networks and even view portions of web pages. Currently the limited bandwidth of these wireless networks limits the web viewing experience to mostly text-based portions of web pages. However, future wireless networks are being planned that should have much higher data transmission rates, allowing graphics and even video to be transmitted to portable computing and communication devices.
Although proponents of these next-generation wireless networks believe that bandwidths will be high enough for high-quality video streams, the inventors realize that the actual data rates delivered by wireless networks can be significantly lower than theoretical maximum rates, and can vary with conditions and local interference. Due to its high data requirements, video is likely to be the most sensitive service to any reduced data rates. Interference can cause intermittent dropped data over the wireless networks. Errors in the bitstream are likely to be common.
Next-generation compression standards have been developed for transmitting video over such wireless networks. The MPEG-4 standard provides a more robust compression technique for transmission over wireless networks. Recovery can occur when parts of the MPEG-4 bitstream are corrupted. However, the MPEG standard does not specify exactly how to detect errors. Devices may differ in their ability to detect and correct bitstream errors.
FIG. 1 highlights video compression using a motion vector for a macroblock. When a video stream is compressed prior to transmission, each frame or video object plane (VOP) of the video stream is divided into rectangular regions known as macroblocks. Each macroblock is 16 by 16 pixels in size, so a 160×160 frame has 100 macroblocks.
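The macroblock arithmetic above can be sketched as a small helper. This is an illustrative function, not part of any standard; it assumes frame dimensions that are exact multiples of the macroblock size.

```python
# Illustrative macroblock-grid arithmetic (hypothetical helper, not from
# the MPEG-4 specification). Assumes dimensions divide evenly by 16.
MB_SIZE = 16

def macroblock_count(width: int, height: int) -> int:
    """Return the number of 16x16 macroblocks in a width x height frame."""
    return (width // MB_SIZE) * (height // MB_SIZE)

print(macroblock_count(160, 160))  # a 10 x 10 grid -> 100 macroblocks
```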
While some macroblocks in some frames may be encoded simply by transmitting the 256 pixels in each macroblock, compression occurs when the same image in a macroblock can be found in 2 or more frames. Since video typically has many frames per second, movement of image objects is usually slow enough that similar images or macroblocks can be found in several successive frames, although with some movement or change. Rather than re-transmit all 256 pixels in a macroblock, only the changed pixels in the macroblock can be transmitted, along with a motion vector that indicates the movement of the macroblock from frame to frame. The amount of data in the bitstream is reduced since most of the macroblock's pixels are not re-transmitted for each frame.
In FIG. 1, macroblock 16′ is a 16×16-pixel region of a first video object plane 10. All 256 pixels in macroblock 16′ are transmitted in the bitstream for first video object plane 10. In next video object plane 12, the same image as in macroblock 16′ appears, but in a different position in the frame. The same image in macroblock 16 in video object plane 12 is offset from the original location of macroblock 16′ in first video object plane 10. The amount and direction of the offset is known as motion vector 20.
Rather than transmit all 256 pixels in macroblock 16, motion vector 20 is encoded into the bitstream. Since one vector replaces 256 pixels, a significant amount of data compression occurs. The same image in macroblock 16 may also be found in successive video object planes, and motion vectors can be encoded for these video object planes, further increasing compression.
During compression, a search can be made of all pixels in first VOP 10 within a certain range of the position of macroblock 16. The closest match in first video object plane 10 is selected as macroblock 16′ and the difference in location is calculated as motion vector 20. When the image in macroblock 16 differs somewhat from the original image in original macroblock 16′, the differences can be encoded and transmitted, allowing macroblock 16 to be generated from original macroblock 16′.
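The search described above can be sketched as a full search over a limited range, scoring each candidate position with a sum of absolute differences (SAD). This is a minimal illustration under stated assumptions: frames are lists of pixel rows, SAD is the matching criterion (the text says only "closest match"), and the function names are hypothetical.

```python
# Illustrative full-search block matching (hypothetical sketch, not
# reference code from any MPEG encoder). Frames are lists of rows of
# integer pixel values; the matching criterion assumed here is SAD.

def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a size x size region of the
    reference frame at (rx, ry) and of the current frame at (cx, cy)."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(ref[ry + dy][rx + dx] - cur[cy + dy][cx + dx])
    return total

def motion_search(ref, cur, mb_x, mb_y, size=16, search_range=8):
    """Search the reference frame within +/- search_range pixels of the
    macroblock at (mb_x, mb_y) in the current frame; return the motion
    vector (dx, dy) of the best-matching region."""
    h, w = len(ref), len(ref[0])
    best, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = mb_x + dx, mb_y + dy
            if 0 <= rx and rx + size <= w and 0 <= ry and ry + size <= h:
                cost = sad(ref, cur, rx, ry, mb_x, mb_y, size)
                if best_cost is None or cost < best_cost:
                    best_cost, best = cost, (dx, dy)
    return best
```

A real encoder would rarely use an exhaustive search; faster heuristics (and sub-pixel refinement) are common, but the exhaustive form shows the principle most directly.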
The receiver that receives the encoded bitstream performs decoding rather than encoding. Motion vectors and error terms for each macroblock are extracted from the bitstream and used to move and adjust macroblocks from earlier video object planes in the bitstream. This decoding process is known as motion compensation since the movement of macroblocks is compensated for.
FIG. 2 shows that each macroblock can be divided into 4 smaller blocks. The MPEG-4 standard allows for a finer resolution of motion compensation. A 16×16 macroblock 16 can be further divided into 4 blocks 22, 23, 24, 25. Each block 22, 23, 24, 25 has 8×8, or 64 pixels, which is one-quarter the size of macroblock 16.
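The subdivision above can be sketched as follows. This is an illustrative helper, assuming the macroblock is a list of 16 rows of 16 pixels and that the four blocks are ordered top-left, top-right, bottom-left, bottom-right (the ordering is an assumption, not taken from the figure).

```python
# Illustrative split of a 16x16 macroblock into four 8x8 blocks
# (hypothetical helper; block ordering assumed, not specified here).

def split_macroblock(mb):
    """Split a 16x16 macroblock (list of rows) into four 8x8 blocks:
    top-left, top-right, bottom-left, bottom-right."""
    blocks = []
    for by in (0, 8):
        for bx in (0, 8):
            blocks.append([row[bx:bx + 8] for row in mb[by:by + 8]])
    return blocks
```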
FIG. 3 shows that separate motion vectors can be encoded for each of the 4 blocks in a macroblock. When the image in a macroblock remains intact, a single motion vector may be encoded for the entire macroblock. However, when the image itself changes, smaller size blocks can often better match the parts of the image.
A macroblock 16 contains four smaller images in blocks 22, 23, 24, 25. In current video object plane 12, these images occur within a single macroblock 16. However, in the previous or first video object plane 10, these images were separated and have moved by different amounts, so that the images merge together toward one another and now all fit within a single 16×16-pixel area of second video object plane 12. The images of blocks 22, 23, 24, 25 have become less fragmented in second video object plane 12.
During encoding, four motion vectors 26, 27, 28, 29 are separately generated for each of blocks 22, 23, 24, 25 respectively. This allows each block to move by a different amount, whereas when only one motion vector is used for all 4 blocks in a macroblock, all blocks must move by the same amount. In this example, block 25′ has shifted more to the left than other blocks 22′, 23′, 24′. Motion vector 29 is slightly larger than the other motion vectors 26, 27, 28. Better accuracy can be achieved when block-level motion vectors are used with a macroblock, at the expense of more data (four motion vectors instead of one). Of course, not all macroblocks need to be encoded with four motion vectors, and the encoder can decide when to use block-level motion compensation.
FIG. 4 is a flowchart of block- and macroblock-level motion compensation during decoding. The decoder parses the bitstream for each new macroblock, step 70. The number of motion vectors for the macroblock is read, step 72. When only one motion vector is encoded for the macroblock, the pixels in the macroblock are fetched from memory that contains the original macroblock in the previous video object plane, step 74. Motion compensation is then performed, step 76, by shifting the x,y location of each of the 256 pixels in the original macroblock by the motion vector to determine the new pixel locations in the current video object plane. The next macroblock can then be parsed.
When four motion vectors are found in the macroblock, step 72, then the four 8×8 blocks are fetched from memory that contains the pixels in the previous video object plane, step 78. Motion compensation is then separately performed on each of the 4 blocks, step 76. The x,y location of each of the 64 pixels in the original block is shifted by the motion vector for that block to determine the new pixel locations in the current video object plane. Each of the four blocks is shifted by its own motion vector. The next macroblock can then be parsed.
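The two decoding paths above can be sketched as one function. This is a simplified illustration, not decoder reference code: frames are lists of pixel rows, residual error terms are omitted, a motion vector (dx, dy) is assumed to point from the current block position to the matching region in the previous VOP, and the block ordering within the macroblock is an assumption.

```python
# Illustrative motion compensation for one macroblock (hypothetical
# sketch; omits residual decoding and assumes in-bounds motion vectors).

def compensate(prev, x, y, mv, size):
    """Copy a size x size region of the previous frame, displaced by
    motion vector mv = (dx, dy) from position (x, y)."""
    dx, dy = mv
    return [prev[y + dy + r][x + dx:x + dx + size] for r in range(size)]

def decode_macroblock(prev, x, y, motion_vectors):
    """Reconstruct the 16x16 macroblock at (x, y) from the previous VOP,
    given either one macroblock-level or four block-level motion vectors."""
    if len(motion_vectors) == 1:
        return compensate(prev, x, y, motion_vectors[0], 16)
    out = [[0] * 16 for _ in range(16)]
    # Assumed block order: top-left, top-right, bottom-left, bottom-right.
    offsets = [(0, 0), (8, 0), (0, 8), (8, 8)]
    for (bx, by), mv in zip(offsets, motion_vectors):
        block = compensate(prev, x + bx, y + by, mv, 8)
        for r in range(8):
            out[by + r][bx:bx + 8] = block[r]
    return out
```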
While such block-level motion compensation is useful, errors can still occur in the bitstream, especially when the bitstream is transmitted over a wireless network. What is desired is a method to detect errors in the bitstream. An intelligent error detector that checks the motion vectors is desired.