1. Field of the Invention
This invention relates to video packetization in packet-switched video telephony applications.
2. Background
A video slice typically starts with a resynchronization marker (RM) or slice header that can be used by the decoder to re-establish synchronization when errors are detected. However, RMs can generally be placed only at macroblock (MB) boundaries and cannot be placed arbitrarily within the video frame. Thus, it is difficult for the encoder to adjust the video slice size so that one or more MBs fit exactly within a given packet.
One technique for video slice alignment involves slice-level rate control, which adjusts the quantization step size of MBs in order to adjust the slice length. Unfortunately, this technique adds significant complexity to the encoder design, and is not necessarily exact. Another technique involves encoding a slice until its size exceeds a predefined length, and then adding padding bytes between the end of the encoded video and the end of the slice. However, this approach undermines bandwidth efficiency.
Traditional methods make one video packet as soon as the number of bits that have been generated exceeds a pre-defined size. As a result, many small video packets are produced, which require additional RTP/UDP/IP/PPP overhead for transmission while providing unnecessary error protection over the last few bytes of video data in one frame. Even if a look-ahead approach is used to estimate how many bits are left, traditional methods still cannot completely avoid small packets.
Re-encoding the bitstream does not solve the problem either, because it requires multiple encoding runs to find the optimal solution that fits into the pre-defined size. In addition, re-encoding is not implementation friendly for mobile device applications.
Video packetization has been a challenging problem in packet-switched (PS) video applications, as it involves a trade-off between packetization efficiency and error resiliency. Sending a video packet requires 40 bytes of overhead, including RTP/UDP/IP headers. If the packet is small, the overhead-to-data ratio will be high, and hence the packet is inefficient in terms of bandwidth. For example, if a packet is sent with 40 bytes of video data, then the overhead is 100%. It is therefore highly desirable for a video encoder to generate video packets with a sufficient amount of data to avoid bandwidth-inefficient packetization.
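The overhead arithmetic above can be illustrated with a minimal sketch. The function names are hypothetical, and the split of the 40-byte header into RTP, UDP, and IPv4 components is an assumption added for illustration; the text itself gives only the 40-byte total.

```python
# Assumed breakdown of the 40-byte total cited in the text:
# RTP (12) + UDP (8) + IPv4 without options (20).
HEADER_BYTES = 12 + 8 + 20


def overhead_ratio(payload_bytes: int, header_bytes: int = HEADER_BYTES) -> float:
    """Header bytes divided by video payload bytes (1.0 means 100% overhead)."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    return header_bytes / payload_bytes


def bandwidth_efficiency(payload_bytes: int, header_bytes: int = HEADER_BYTES) -> float:
    """Fraction of transmitted bytes that carry video data."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    return payload_bytes / (payload_bytes + header_bytes)


if __name__ == "__main__":
    for payload in (10, 40, 120):
        print(payload, overhead_ratio(payload), bandwidth_efficiency(payload))
```

With a 40-byte payload the ratio is 1.0, matching the 100% figure in the text, while a 10-byte payload at the end of a frame would carry four times its own size in headers.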
In most video packetization schemes (hereinafter referred to as the “original packetization approach”), a pre-defined video packet size is usually specified, and the video encoder tries to generate video packets that are all close to this size. One way of doing so is to check whether the video data that has been generated exceeds the pre-defined packet size. If it does, the video encoder makes a video slice by inserting a resynchronization marker (RM). However, the original packetization approach cannot guarantee that all the video packets will have the pre-defined size. Very often, some very small packets are generated at the end of a video frame. That is, the original packetization approach cannot provide efficient packetization, and bandwidth is often wasted while unnecessarily protecting the last few bytes. The results of the original packetization approach with a target packet size of 120 bytes are shown in FIG. 1A. As can be readily seen, many small packets (shown in a circle for emphasis) are generated by the original packetization approach.
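A minimal sketch of the threshold check described above may help show why a small trailing packet appears. It assumes each macroblock has already been encoded to a known byte count; the function name and the macroblock sizes are hypothetical and are not taken from FIG. 1A.

```python
from typing import List


def packetize_threshold(mb_sizes: List[int], target: int) -> List[int]:
    """Close a packet at the first MB boundary where its size exceeds target.

    mb_sizes holds the encoded size in bytes of each macroblock in one frame.
    Returns the payload size in bytes of each resulting packet.
    """
    packets: List[int] = []
    current = 0
    for size in mb_sizes:
        current += size
        if current > target:
            # An RM would be inserted here, starting a new slice.
            packets.append(current)
            current = 0
    if current > 0:
        # Whatever remains at the end of the frame becomes its own packet,
        # however small it is.
        packets.append(current)
    return packets


if __name__ == "__main__":
    # Hypothetical per-MB sizes, for illustration only.
    print(packetize_threshold([50, 40, 45, 50, 40, 45, 20], target=120))
```

For these illustrative sizes the frame yields two packets of 135 bytes followed by a 20-byte packet, so the final packet carries twice as many header bytes as video bytes.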
Heuristic approaches can also be used. As before, a pre-defined video packet size is specified, and the encoder checks whether the video data that has been generated exceeds that size. In addition, the encoder checks whether the encoding has reached the last M (e.g., M=5) MBs of the frame. If it has, the video encoder does not generate a video packet even if the amount of data has exceeded the pre-defined video packet size. However, the heuristic approach is not implementation friendly for mobile device applications, and it also cannot achieve an exact packet size. The results of the heuristic packetization approach with a target packet size of 120 bytes are shown in FIG. 1B. As can be readily seen, many small packets (shown in a circle for emphasis) are still generated by the heuristic approach.
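The last-M-MBs rule can be sketched as follows. This is one possible reading of the heuristic, not a definitive implementation; the function name, the `last_m` parameter, and the macroblock sizes are hypothetical.

```python
from typing import List


def packetize_heuristic(mb_sizes: List[int], target: int, last_m: int = 5) -> List[int]:
    """Threshold packetization that never closes a packet within the last M MBs.

    mb_sizes holds the encoded size in bytes of each macroblock in one frame.
    Returns the payload size in bytes of each resulting packet.
    """
    packets: List[int] = []
    current = 0
    tail_start = len(mb_sizes) - last_m  # first index belonging to the last M MBs
    for index, size in enumerate(mb_sizes):
        current += size
        in_tail = index >= tail_start
        if current > target and not in_tail:
            packets.append(current)
            current = 0
    if current > 0:
        # The tail is flushed as one packet, which may overshoot the target
        # or still be small, depending on where the last cut fell.
        packets.append(current)
    return packets


if __name__ == "__main__":
    sizes = [50, 40, 45, 50, 40, 45, 20]  # hypothetical per-MB sizes
    print(packetize_heuristic(sizes, target=120, last_m=2))
    print(packetize_heuristic(sizes, target=120, last_m=1))
```

With `last_m=2` the trailing 20 bytes are absorbed into a 155-byte packet that overshoots the 120-byte target, and with `last_m=1` the 20-byte packet reappears, which illustrates both limitations noted above.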