The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
With the rapid development of broadband networks, video communications enjoy an increasingly broad range of applications. Video conferencing and video telephony are becoming fundamental services over the Next Generation Network (NGN), and telecom operators in various countries pay close attention to this market opportunity. It can be anticipated that, in the forthcoming years, video communications services will become an important service growth point for telecom operators. A key point in developing such services is improving the end-to-end user experience, the so-called Quality of Experience. Besides the Quality of Service (QoS) parameters of the network, including packet loss, delay, jitter, R factor, etc., an important factor that may influence the final user experience is the Gamma nonlinearity caused by various elements, which distorts the luminance component of the video signal. However, present methods and technologies for improving the end-to-end user experience mainly focus on assuring the network QoS and on the pre-processing/post-processing relevant to video compression encoding. The luminance distortion caused by the Gamma characteristics has neither received sufficient attention nor been given a systematic solution. The seriousness of this issue is drawing the attention of some large international telecom operators. France Telecom has recently put forward a suggestion to ITU-T that the influence of the Gamma characteristics upon the user experience should be considered in video communications, and that the issue should be resolved.
In video communications, the optical signal of the scene (human beings, background, documents, etc.) that needs to be transmitted enters a video recorder/camera in a video communications terminal (referred to as a terminal hereinafter). The optical signal is transformed into a digital still image or video signal via A/D conversion, compressed and encoded, and then transferred to a far-end terminal, where the digital still image or video signal is reconstructed via decompression and displayed on a display device. Finally, the resulting optical signal is perceived by human eyes. In this process, the luminance signal of the still image or video passes through a plurality of elements. The luminance signal here is a generalized luminance signal, i.e., the signals in each phase, including the original optical signal, the electrical signal, and the digitized still image or video luminance/grey-scale signal, all contain information of the original luminance signal. Therefore, in a broad sense, it is the luminance signal that passes through a plurality of elements.
FIG. 1 is a schematic diagram showing a model of the Gamma characteristics of an element. The term Gamma characteristics refers to the fact that the input-output relation of an element for the luminance signal is not linear, i.e., there exists nonlinearity. The effect of the distortion caused by the Gamma nonlinearity of an element is shown in FIG. 2. The luminance of the grey-scale blocks in the upper line increases linearly, i.e., from 0.1 to 1.0, whereas the luminance in the lower line is obtained after distortion by a Gamma nonlinear element and increases according to a power function.
In practice, the Gamma nonlinearity may have different causes. For example, in an ideal situation the Gamma characteristics of a Cathode Ray Tube (CRT) display device satisfy the relation given by Equation 1:
$$L_{out} = L_{in}^{2.2} \tag{1}$$
whereas the ideal Gamma characteristics of the corresponding video recorder/camera satisfy the relation given by Equation 2:
$$L_{out} = L_{in}^{0.45} \tag{2}$$
The Gamma issue originates from the CRT display device, whose Gamma value is 2.2. In order to compensate for this nonlinearity, an artificial Gamma value of 0.45 is introduced in the video recorder. If there are only two Gamma elements in the system, i.e., the CRT display device and the video recorder, a complete Gamma correction may be realized. It should be noted that the input and output luminance signals are both normalized in their respective coordinate systems, i.e., 0 ≤ Lout ≤ 1 and 0 ≤ Lin ≤ 1. For display devices of other types, such as a Liquid Crystal Display (LCD), either the form of the Gamma function or the parameters of the Gamma function may be different.
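As a rough illustration of the two ideal relations in Equations 1 and 2, the following Python sketch cascades a camera with exponent 0.45 and a CRT with exponent 2.2 and measures how far the result departs from the linear relation. The function names are hypothetical and chosen only for this example, and the sketch assumes pure power functions on normalized luminance with no other Gamma elements in the path.

```python
def camera_gamma(l_in: float, exponent: float = 0.45) -> float:
    """Ideal camera characteristic (Equation 2) on normalized luminance."""
    return l_in ** exponent


def crt_gamma(l_in: float, exponent: float = 2.2) -> float:
    """Ideal CRT characteristic (Equation 1) on normalized luminance."""
    return l_in ** exponent


def end_to_end(l_scene: float) -> float:
    """Cascade camera then CRT, the only two Gamma elements assumed here."""
    return crt_gamma(camera_gamma(l_scene))


# Largest deviation from the ideal linear relation L_out = L_in
# over eleven evenly spaced sample points in [0, 1].
samples = [i / 10 for i in range(11)]
max_error = max(abs(end_to_end(x) - x) for x in samples)
print(f"max deviation from linear: {max_error:.4f}")
```

Because the product of the two exponents is 0.45 × 2.2 = 0.99, the cascade in this sketch is very close to linear, though not exactly linear.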
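The model of the Gamma characteristics of a plurality of cascaded elements is shown in FIG. 3. The total Gamma characteristics equal the composition of the Gamma functions of the individual elements, as given in Equation 3:
$$\begin{aligned} G_{CT}(\cdot) &= G^{(1)}(\cdot) \circ G^{(2)}(\cdot) \circ G^{(3)}(\cdot) \circ \cdots \circ G^{(n-1)}(\cdot) \circ G^{(n)}(\cdot) \\ l_{out} &= G_{CT}(l_{in}) = G^{(n)}(G^{(n-1)}(G^{(n-2)}(\cdots G^{(2)}(G^{(1)}(l_{in})) \cdots))) \end{aligned} \tag{3}$$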
where “∘” represents composition operation of functions. CT represents Cascaded Total, i.e., the total Gamma of the cascaded elements.
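The composition in Equation 3 can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the helper name `cascade` and the three example elements are hypothetical, and each element is assumed to map normalized luminance in [0, 1] to [0, 1].

```python
from functools import reduce
from typing import Callable, Sequence

GammaFn = Callable[[float], float]


def cascade(elements: Sequence[GammaFn]) -> GammaFn:
    """Return G_CT for a chain of elements.

    elements[0] plays the role of G(1) and is applied first;
    elements[-1] plays the role of G(n) and is applied last.
    """
    def total(l_in: float) -> float:
        return reduce(lambda value, g: g(value), elements, l_in)
    return total


# Hypothetical three-element chain made of pure power functions.
chain = [
    lambda x: x ** 0.5,
    lambda x: x ** 2.0,
    lambda x: x ** 1.5,
]
g_ct = cascade(chain)
print(g_ct(0.25))
```

For pure power functions the total exponent is the product of the individual exponents (here 0.5 × 2.0 × 1.5 = 1.5), but the same helper applies to elements of any functional form.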
In an ideal situation, from the optical signal entering the camera to the optical signal being output and displayed on the display finally, there exists a linear relation between input and output luminance signals, i.e., Lout=Lin, thus the scene perceived is the same as the original one, and the user may have the best experience.
In order to obtain the linear relation, a Gamma correction should be performed for the element with nonlinear Gamma characteristics. As shown in FIG. 4, the Gamma characteristics are specified for an element, and a further correction element may be cascaded thereafter, so that the total Gamma characteristics after cascading become a real linear relation, and the object of compensating the nonlinearity of the specified element may be achieved. The model of the correction element is the inverse model of the equivalent model of Gamma characteristics. If the equivalent model can be represented with a function, the function of the inverse model is the inverse function thereof. Obviously, Gg(·) and Gc(·) are mutually inverse functions. Normally, for a function, the inverse function thereof does not always exist (or even if the inverse function exists, the inverse function may not be obtained in a closed-form).
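When the inverse function is not available in closed form, a correction element can still be approximated numerically. The following Python sketch uses bisection; it assumes that the Gamma function of the specified element is continuous and strictly increasing on [0, 1], which is an assumption of this example rather than something guaranteed for every element. The name `make_correction` is hypothetical.

```python
from typing import Callable

GammaFn = Callable[[float], float]


def make_correction(g: GammaFn, iterations: int = 60) -> GammaFn:
    """Build a numerical inverse of g by bisection on [0, 1].

    Assumes g is continuous and strictly increasing on [0, 1],
    so that g_c(g(x)) is approximately x.
    """
    def g_c(l: float) -> float:
        lo, hi = 0.0, 1.0
        for _ in range(iterations):
            mid = (lo + hi) / 2
            if g(mid) < l:
                lo = mid   # the sought input lies above mid
            else:
                hi = mid   # the sought input lies at or below mid
        return (lo + hi) / 2
    return g_c


# Example element: the ideal CRT characteristic of Equation 1.
g_g = lambda x: x ** 2.2
g_c = make_correction(g_g)
print(g_c(g_g(0.5)))
```

Cascading `g_c` after `g_g` then yields an approximately linear total characteristic, which is the arrangement described for FIG. 4.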
As shown in FIG. 5, in practical applications, the correction element usually needs to be inserted between two specified elements. At this time, the situation is more complicated with respect to Gc(·), because there exists no simple inverse function relation between Gc(·) and Ga(·) or between Gc(·) and Gp(·).
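The relation that the inserted correction element must satisfy can be written out explicitly. The following is a sketch under two assumptions that are not stated in the text: that Ga(·) is the element preceding the correction element and Gp(·) the element following it, and that both are invertible on [0, 1].

```latex
% Linearity condition for the three-element cascade G_a -> G_c -> G_p:
G_p\bigl(G_c\bigl(G_a(L_{in})\bigr)\bigr) = L_{in}
% Solving for the correction element:
G_c(x) = G_p^{-1}\bigl(G_a^{-1}(x)\bigr)
% Special case of pure power functions G_a(x) = x^{\gamma_a}, G_p(x) = x^{\gamma_p}:
G_c(x) = x^{1/(\gamma_a \gamma_p)}
```

Under these assumptions, Gc(·) is the inverse of neither Ga(·) nor Gp(·) alone; it depends on both neighbouring elements.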
In video communication, a terminal contains a plurality of cascaded elements, each of which has Gamma characteristics. At present, there is no general method for implementing Gamma correction over the whole path from the optical signal entering a video recorder/camera to the display device displaying the still image or video. Therefore, no general solution is available for the video quality degradation caused by the Gamma issue. Meanwhile, the Gamma characteristic parameters of different terminals are unknown to each other, so the issue of how to implement Gamma correction after a video is transferred from terminal A to terminal B is still not resolved. In a multiparty video communication, the situation is even more complicated because a multipoint control unit (MCU) is involved: the MCU mixes the videos from a plurality of terminals and then sends the mixed video to each terminal. The Gamma characteristics of the various sub-pictures in a multi-picture image may differ, so Gamma correction is even more difficult to implement.
The main functions of the MCU include multipoint control (MC) and multipoint processing (MP). The MC includes communications process control, meeting control, etc. The MP includes media processing, video bit-stream forwarding or multi-picture video/image synthesizing, audio mixing, etc. The present disclosure mainly concerns the MP function and, in a stricter sense, the video processing function of the MP. With respect to video, the MCU may operate in the following modes, of which the first two are also referred to as a video forwarding mode.
1. Meeting Place Free View Mode
In this mode, each terminal that participates in a meeting can select freely to view the video of the meeting places of any other terminal. The MCU is responsible for forwarding the video of the terminal to be viewed to the receiving terminals. The number of terminals that can select to view other meeting places is determined by the maximum number of the freely viewable meeting places that can be supported by the MCU, where the maximum number depends on the capability of the device or the configuration of the operation control system. For example, terminal A may select to view the video of the meeting place of terminal B, and terminal B may select to view the video of the meeting place of terminal C, and so on.
2. Meeting Place Designated Viewing (i.e., Meeting Place Broadcast) Mode
A chairperson terminal (if available) in the meeting or a meeting organizer designates via the operation control system that the video of a terminal meeting place should be viewed by all the terminals in the meeting. The MCU is responsible for broadcasting the terminal video to be viewed, i.e., the video of the designated meeting place. For example, when terminal X is selected, all the other terminals will view the video of the meeting place of terminal X.
3. Multi-Picture Mode
The MCU combines videos of a plurality of terminal meeting places into a single multi-picture video. The layout of the multi-picture is specified by the meeting chairperson terminal (if available) or meeting organizer via the operation control system, where the layout of the multi-picture includes the number of the meeting places, the layout of the images of these meeting places, and the relative sizes of the images. If the selected terminal meeting places are X1, X2, X3, . . . , XC, a possible layout is as shown in FIG. 6.
There are mainly two methods for synthesizing a multi-picture video by the MCU.
One method is a decoding-and-re-encoding synthesizing mode. The MCU first decompresses and decodes video bit-streams from various terminals, and restores the uncompressed digital video format. After that, the MCU assembles the images into a multi-picture video according to a specific multi-picture layout, and compresses and encodes the multi-picture video to obtain a new multi-picture video bit-stream.
The other method is a direct synthesizing mode. The MCU synthesizes the video bit-streams from various terminals into a new bit-stream according to a syntax that complies with a standard. For example, such synthesis is allowed in the H.261, H.263 and H.264 protocols. In general, a problem with direct synthesis is that the terminal needs to decrease the resolution of the video image first. For example, if the normal image is in the Common Intermediate Format (CIF), the resolution needs to be decreased to ¼ of the original resolution, i.e., quarter CIF (QCIF), so as to obtain a multi-picture with 4 sub-pictures (a layout of 2*2). In this situation, the image from the terminal can be used only for synthesizing the multi-picture video and cannot be viewed by other terminals at the normal resolution. This limitation may not cause a serious problem in a specific application, and the advantages of the direct synthesizing mode are apparent: it reduces the problems caused by decoding and re-encoding, such as the requirement for high processing capability and the deterioration of image quality, so the cost of the MCU may be reduced and the communication capacity and communication quality may be increased. Therefore, the direct synthesizing mode is widely employed.
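For the 2*2 layout mentioned above, the image-assembly step of the decoding-and-re-encoding mode can be sketched in the pixel domain as follows. This is only an illustration: it assumes the standard luminance sizes of 352×288 for CIF and 176×144 for QCIF, represents each decoded frame as a list of rows, and leaves out decoding and re-encoding entirely. The function name `compose_2x2` is hypothetical.

```python
QCIF_W, QCIF_H = 176, 144   # quarter CIF luminance size
CIF_W, CIF_H = 352, 288     # CIF luminance size


def compose_2x2(sub_pictures):
    """Tile four decoded QCIF luminance frames into one CIF frame.

    sub_pictures: four frames in row-major layout order
    (top-left, top-right, bottom-left, bottom-right),
    each a list of QCIF_H rows holding QCIF_W samples.
    """
    if len(sub_pictures) != 4:
        raise ValueError("a 2*2 layout needs exactly four sub-pictures")
    frame = [[0] * CIF_W for _ in range(CIF_H)]
    for index, picture in enumerate(sub_pictures):
        top = (index // 2) * QCIF_H
        left = (index % 2) * QCIF_W
        for y in range(QCIF_H):
            # Copy one row of the sub-picture into its quadrant.
            frame[top + y][left:left + QCIF_W] = picture[y]
    return frame


# Four flat grey test frames with different luminance levels.
subs = [[[level] * QCIF_W for _ in range(QCIF_H)] for level in (10, 20, 30, 40)]
multi_picture = compose_2x2(subs)
print(len(multi_picture), len(multi_picture[0]))
```

Each quadrant of the resulting frame carries the luminance of a different terminal, so the sub-pictures of one multi-picture frame may have passed through different Gamma elements.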
Because the MCU needs to perform processing and computation, the number of terminals that can be controlled by one MCU is limited. In order to construct a larger communication network and to support more terminals, the manner shown in FIG. 7 may be employed, in which a plurality of MCUs are cascaded. For example, an MCU in the uppermost layer controls MCU 2.1 to MCU 2.m (m MCUs in total) in the second layer, and the MCUs in the second layer respectively control several MCUs (n MCUs in total) in the third layer. An MCU in a layer may control a number of terminals directly, or may control a number of terminals indirectly via an MCU in a lower layer that it controls.
As shown in FIG. 8, the interior of an MCU may be decomposed as follows according to the function: a multipoint controller and a plurality of multipoint processors. Such decomposed model is very prevalent at present. In this way, implementation of the product may be more flexible, and more telecommunication devices of MCU type may be provided. The object of supporting the networking of a larger multimedia communication network may also be achieved by stacking a plurality of multipoint processors to enhance the processing capability.
The existing technologies are all based on the following hypotheses to perform Gamma correction on the terminal:
1. The display device and video recorder/camera of the terminal are designed and produced according to the standard requirements for the Gamma characteristics, i.e., the Gamma parameter of the display is 2.2, whereas the Gamma parameter of the video recorder/camera is 0.45.
2. There are no other Gamma elements between the video recorder/camera and the display device.
3. The data of the video bit-stream sent by the terminal is Gamma corrected, and such correction is implemented on the basis that the terminal can cooperate with the display device of a remote terminal.
Based on the above hypotheses, each terminal implements Gamma correction locally, using the correction method shown in FIG. 4. The disadvantage of the existing correction method is apparent: the three required hypotheses are increasingly unlikely to hold at present. In the existing technologies, a high-end video recorder is generally able to provide the Gamma correction function, but many low-end cameras cannot. If the video recorder provides the Gamma correction function, then, taken as a whole, the Gamma characteristics of the video recorder as seen by an external device are given by Equation 2. In practice, however, telecommunication operators are currently promoting public-oriented video communications. It is therefore necessary to provide a very cheap terminal to attract the public, and the use of cheap cameras is inevitable. Such cheap cameras may have nonlinear Gamma characteristics that are not in the form given by Equation 2, or are not even in the form of a power function. The Gamma characteristics of a plurality of cheap cameras based on a charge coupled device (CCD) have been determined from practical test results. The closest power function is $L_{out} = L_{in}^{0.22}$, but many data points deviate from this curve, so the Gamma characteristics can hardly be said to follow a power function. Furthermore, it is quite possible that other Gamma elements exist in a terminal system. Therefore, even if a camera has the standard Gamma characteristics given by Equation 2, a complete Gamma correction may not be achieved.
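The text does not say how the closest power function was obtained from the test results. One common approach, assumed here purely for illustration, is a least-squares fit of the exponent in the logarithmic domain. The function name `fit_gamma_exponent` is hypothetical, and the sample points below are synthetic values generated from an exact power function, not measured camera data.

```python
import math


def fit_gamma_exponent(samples):
    """Least-squares fit of ln(L_out) = gamma * ln(L_in) through the origin.

    samples: iterable of (l_in, l_out) pairs on normalized luminance.
    Pairs with a zero value are skipped because ln(0) is undefined.
    """
    numerator = 0.0
    denominator = 0.0
    for l_in, l_out in samples:
        if l_in > 0.0 and l_out > 0.0:
            x, y = math.log(l_in), math.log(l_out)
            numerator += x * y
            denominator += x * x
    if denominator == 0.0:
        raise ValueError("need at least one sample with 0 < l_in < 1")
    return numerator / denominator


# Synthetic points lying exactly on a power curve with exponent 0.22.
synthetic = [(i / 10, (i / 10) ** 0.22) for i in range(1, 10)]
print(round(fit_gamma_exponent(synthetic), 6))
```

With real measurements, the residuals between the measured points and the fitted curve would indicate how well a power function describes the camera at all.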
If a Gamma correction is to be performed, the Gamma characteristic parameters of the Gamma element need to be used. A high-end device may have ideal Gamma characteristics, so the power function as defined by Equation 1 or 2 may be employed, where the power function may take the form of a pure power function or of a power function with an offset. However, the Gamma characteristics of most middle-end or low-end devices can only be represented with a look-up table (LUT), as shown in FIG. 1.
Because the range of definition and range of values of the Gamma function are both the unit interval [0, 1], a discretization manner may be employed to represent such function relation. As shown in Table 1, the table has a form of two columns and N rows, where the left column includes N discrete values of Lin, and the right column includes the corresponding N discrete values of Lout. Therefore, when the corresponding Lout is to be calculated according to the value of Lin, such calculation may be accomplished by looking up the table. If the value of Lin is not included in the left column, an interpolation method may be employed to calculate the corresponding value of Lout.
TABLE 1
Representation method of Look-Up Table of Gamma parameters

Discrete value of Lin (input)    Corresponding value of Lout (output)
Lin(0)                           Lout(0)
Lin(1)                           Lout(1)
Lin(2)                           Lout(2)
. . .                            . . .
Lin(N − 1)                       Lout(N − 1)
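The look-up procedure with interpolation can be sketched in Python as follows. Linear interpolation is assumed here as one possible interpolation method, since the text does not name a specific one. The function name `lut_lookup` and the five-entry table are hypothetical, and the left column is assumed to be sorted in ascending order.

```python
from bisect import bisect_right


def lut_lookup(l_in_values, l_out_values, l_in):
    """Return L_out for l_in using a two-column Gamma look-up table.

    l_in_values:  the N discrete values of L_in, sorted ascending.
    l_out_values: the corresponding N values of L_out.
    Values between table entries are linearly interpolated;
    values outside the table are clamped to the end entries.
    """
    if l_in <= l_in_values[0]:
        return l_out_values[0]
    if l_in >= l_in_values[-1]:
        return l_out_values[-1]
    hi = bisect_right(l_in_values, l_in)   # first entry greater than l_in
    lo = hi - 1
    x0, x1 = l_in_values[lo], l_in_values[hi]
    y0, y1 = l_out_values[lo], l_out_values[hi]
    t = (l_in - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)


# Hypothetical table with N = 5 entries sampled from L_out = L_in ** 0.45.
table_in = [0.0, 0.25, 0.5, 0.75, 1.0]
table_out = [x ** 0.45 for x in table_in]
print(lut_lookup(table_in, table_out, 0.375))
```

A larger N reduces the interpolation error at the cost of more parameters to store and transfer.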
The two modes for representing the Gamma characteristic parameters each have advantages and disadvantages. The representation with a function is concise, and only a few parameters need to be transferred. However, the calculation is complicated; in particular, calculating a non-integer power of a floating-point number may cost a lot of time. When the representation with a look-up table is employed, the calculation is simple, and this representation mode fits any functional form and therefore has good universality. However, the number of parameters that need to be transferred is relatively large.
The ideal situation can hardly be obtained. Therefore, when the Gamma characteristics are represented in the form of a pure power function or a power function with an offset, the representation may be imprecise in some situations. For example, if the camera is a cheap camera, its Gamma characteristics may not be in the form of a power function, in which case the representation with a function is invalid. At present, telecommunication operators are promoting public-oriented video communications. It is therefore necessary to provide a very cheap terminal to attract the public, and the use of cheap cameras is inevitable.
In the first prior art, the representation with a function is not precise enough, and the situation is oversimplified and idealized. Where the requirements for the precision of the Gamma correction are not high, the representation with a function may be employed for the purpose of simplification. However, for some application scenarios where the requirements for quality are high, the piecewise function models given by Equation 4 and Equation 5 may be employed:
$$L_{out} = \begin{cases} 0.45\,L_{in} & \text{if } 0 \le L_{in} \le 0.081 \\ 1.099\,L_{in}^{0.45} - 0.099 & \text{if } 0.081 < L_{in} \le 1 \end{cases} \tag{4}$$
$$L_{out} = \begin{cases} \dfrac{1}{0.45}\,L_{in} & \text{if } 0 \le L_{in} \le 0.081 \\ \dfrac{1}{1.099}\,(L_{in} + 0.099)^{2.2} & \text{if } 0.081 < L_{in} \le 1 \end{cases} \tag{5}$$
where Equation 4 is a representation of the Gamma characteristics of a video recorder, and Equation 5 is a representation of the Gamma characteristics of a CRT display. Although the piecewise function representation is relatively precise, its application range is narrow. Such a representation suits only a high-end device (a so-called broadcast-class device) and cannot be applied well to many middle-end and low-end devices, especially cameras.