1. Technical Field
The present invention generally relates to video transmission and, more particularly, to methods for controlling a video frame stream at the transmitting end of a video phone.
2. Discussion of Related Art
With the popularization of 3G networks, more consumers favor 3G mobile phones with video functions. In addition to the existing audio experience, these 3G mobile phones can also provide a video experience for a user through real-time video images, which greatly enhances the user's enjoyment.
However, existing video phones suffer from video delay during a call. The causes of the video delay include network delay and local delay in processing and transmitting video frames, among others. The advancement of 3G networks will significantly reduce the network delay. A major cause of local delay is that the captured video frame data needs to be compressed and multiplexed according to the H.223 multiplexing protocol of 3G-324M before the data is transmitted via a wireless network.
During a video phone call, the application program at each terminal device acts as both a transmitting end and a receiving end: it transmits the local video and audio data to the other end and at the same time receives and plays video and audio data from the other end. The sampling frequency for video frames at the transmitting end is set by the system. At present, the upper limit of the sampling frequency for a video phone with 64 kb/s bandwidth is generally set at around 15 frames per second. The video frame data captured by a camera is first compressed, and then the compressed AV data is multiplexed with the H.223 multiplexing protocol of 3G-324M into binary streaming data.
FIG. 5 illustrates an existing AV data processing flow performed by an application program at the transmitting end of a video phone.
When a video phone call is connected, a handshake at the 3G-324M layer is performed first. After the handshake, if the negotiation between the two ends is successful, the transmitting end of the system enters the process illustrated in FIG. 5. The process includes the steps of: capturing audio data by a microphone and compressing the captured audio data; capturing video data by a camera and compressing the captured video data; multiplexing the audio and video data streams based on the H.223 protocol of 3G-324M 300; and transmitting the multiplexed data stream via a wireless network.
In the process described above, the audio data and video data are captured by their respective hardware devices, compressed in their respective formats according to the handshake negotiation mechanism, and then multiplexed into a single binary data stream with the sub-protocol H.223 of the 3G-324M. Finally, the multiplexed data stream is transmitted to the opposite end via a wireless network.
The process at the receiving end is the reverse. The receiving end demultiplexes the received binary data stream with the H.223 protocol, separating it into video data and audio data. The video data and audio data are then decompressed separately and passed to the corresponding hardware for playback.
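The multiplex and demultiplex round trip described above can be illustrated with a minimal sketch. It does not implement H.223: the one-byte channel tag, the two-byte length prefix, and the names `AUDIO`, `VIDEO`, `mux`, and `demux` are all hypothetical, chosen only to show how compressed audio and video chunks can be interleaved into a single binary stream and separated again at the receiving end.

```python
import struct

# Hypothetical channel tags for illustration only; these are not H.223 values.
AUDIO, VIDEO = 0, 1


def mux(chunks):
    """Pack (channel, payload) pairs into one binary stream."""
    out = bytearray()
    for channel, payload in chunks:
        # Toy header: 1-byte channel tag followed by a 2-byte big-endian length.
        out += struct.pack(">BH", channel, len(payload))
        out += payload
    return bytes(out)


def demux(stream):
    """Split the binary stream back into per-channel payload lists."""
    channels = {AUDIO: [], VIDEO: []}
    pos = 0
    while pos < len(stream):
        channel, length = struct.unpack_from(">BH", stream, pos)
        pos += 3  # size of the toy header
        channels[channel].append(stream[pos:pos + length])
        pos += length
    return channels


stream = mux([(AUDIO, b"a0"), (VIDEO, b"frame-1"), (AUDIO, b"a1")])
received = demux(stream)
print(received[AUDIO], received[VIDEO])
```

In a real terminal the payloads would be the compressed audio and video data negotiated during the handshake, and the framing would follow the H.223 protocol rather than this toy header.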
Because the existing video phone operates in a 64 kb/s bandwidth of the circuit domain, the video frame sampling frequency of the camera, which is set with an upper limit, fluctuates with changes in the system load. Therefore, the number of video frames captured per unit time may vary. When the system load is high, the data captured from the camera often cannot be compressed and multiplexed with H.223 in time. Consequently, a large amount of multiplexed data stays in the transmitting queue and causes a delay of video frames at the receiving end. As a result, the video frames and audio data fall out of synchronization, which degrades the performance of the video phone.
The causes of the video delay are briefly analyzed below.
Processing each video frame includes two steps: capturing video data and transmitting video data. Capturing video data further includes capturing video frames and compressing the video frames. In the step of capturing video data, the camera captures video frames at a predetermined interval; for example, at 15 frames per second, the time interval between successive frames is about 0.067 sec. The captured video frames are then compressed to obtain the compressed video frame data. The compressed video frame data is transmitted via the wireless network.
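The interval figure above can be checked with a short calculation. The sketch below also estimates how many bits a 64 kb/s link can carry during one frame interval, under the assumptions that 1 kb/s means 1000 bits per second and that the whole link is available to video; in practice audio and multiplexing overhead share the same bandwidth, so the real budget per frame is smaller. The function name `frame_budget` is hypothetical.

```python
def frame_budget(fps, link_bps):
    """Return (seconds between frames, bits the link carries in that time)."""
    interval_s = 1.0 / fps
    bits_per_interval = link_bps * interval_s
    return interval_s, bits_per_interval


interval_s, bits = frame_budget(fps=15, link_bps=64_000)
print(f"interval: {interval_s:.3f} s")  # interval: 0.067 s
print(f"link capacity per interval: {bits:.0f} bits ({bits / 8:.0f} bytes)")
```

Under these assumptions, a compressed frame larger than roughly 4267 bits (about 533 bytes) cannot be sent within a single frame interval, which is the condition under which transmission starts to fall behind capture.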
Therefore, after data compression of one video frame is completed, the compressed video frame data is transmitted immediately. That is, under normal circumstances, when data compression of a first video frame is completed, the compressed first video frame data is transmitted; and when data compression of a second video frame is completed, transmitting of the first video frame data has already been completed, at which point the second video frame data can be transmitted, and so on.
With reference to FIG. 4, in each predetermined interval, the camera in turn captures a video frame F1, F2, F3, . . . , F8; compressing times for the corresponding video frames are C1, C2, C3, . . . , C8, respectively; and transmitting times for the corresponding video frames are S1, S2, S3, . . . , S8, respectively.
The transmitting time S1 for the first video frame F1 starts at a time point when the compressing time C1 for the first video frame F1 ends. The transmitting time S2 for the second video frame F2 starts at a time point when the compressing time C2 for the second video frame F2 ends. As shown in FIG. 4, under conditions of successive transmitting, the transmitting time S1 for the first video frame F1 ends prior to the time point when the compressing time C2 for the second video frame F2 ends, so that the transmitting time S2 for the second video frame F2 can start; and the transmitting time S2 for the second video frame F2 ends prior to the time point when the compressing time C3 for the third video frame F3 ends, so that the transmitting time S3 for the third video frame F3 can start. The first to sixth video frames F1-F6 have the same situation, as shown in FIG. 4.
However, at the time point when the compressing time C7 for the seventh video frame F7 ends, the transmitting time S6 for the sixth video frame F6 has not ended. Thus, the transmitting time S7 for the seventh video frame F7 cannot start until the transmitting time S6 for the sixth video frame F6 ends, so that the transmission of the seventh video frame is delayed. Similarly, at the time point when the compressing time C8 for the eighth video frame F8 ends, the transmitting time S7 for the seventh video frame F7 has not ended. Thus, the transmitting time S8 cannot start until the transmitting time S7 for the seventh video frame F7 ends, so that the transmission of the eighth video frame is delayed.
It can be seen from the example described above that, if transmission of the previous video frame data has not completed when compression of the current video frame data completes, the current video frame data cannot be transmitted until transmission of the previous video frame data is completed, which results in a delay of video transmission. Furthermore, if a succession of frames is delayed, these delays accumulate and cause the video transmission to lag seriously behind the audio transmission, which degrades the usability of the video phone.
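The accumulation of delay can be reproduced with a minimal simulation of a single serial transmitter. The timing values below are hypothetical and are not taken from FIG. 4; they are chosen so that frames F1 to F6 are sent without waiting while F7 and F8 are delayed, mirroring the situation described above. Times are in milliseconds, and the function name `schedule` is an assumption made for illustration.

```python
def schedule(compress_done, send_time):
    """Return (start, end, delay) per frame for one serial transmitter.

    compress_done[i] is when frame i finishes compressing;
    send_time[i] is how long frame i takes to transmit.
    """
    result = []
    link_free_at = 0
    for done, duration in zip(compress_done, send_time):
        # A frame is sent once it is compressed AND the previous frame is out.
        start = max(done, link_free_at)
        end = start + duration
        result.append((start, end, start - done))
        link_free_at = end
    return result


# Hypothetical timings: compression finishes every 67 ms (about 15 frames/s);
# frames 6 and 7 take longer to transmit than one frame interval.
compress_done = [67, 134, 201, 268, 335, 402, 469, 536]
send_time = [50, 50, 50, 50, 50, 90, 90, 60]

rows = schedule(compress_done, send_time)
print([delay for _, _, delay in rows])  # [0, 0, 0, 0, 0, 0, 23, 46]
```

With these values the seventh frame waits 23 ms and the eighth waits 46 ms: each late frame inherits the backlog of the one before it, so the delay grows as long as transmission takes longer than the frame interval.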