1. Field of the Invention
The present invention relates generally to a distributed coded video decoding apparatus and method capable of successively improving side information on the basis of the reliability of reconstructed data, and, more particularly, to a distributed coded video decoding apparatus and method capable of successively improving side information on the basis of the reliability of reconstructed data, which measure the reliability of the reconstructed data, determine whether the side information can be improved based on the reconstruction results, and update the side information depending on the determination, thereby successively improving rate-distortion performance.
2. Description of the Related Art
Since the digital video data used in video conferencing, Video On Demand (VOD) receivers, digital broadcast receivers and Cable Television (CATV) is generally very large, it is commonly compressed using an efficient compression method rather than being used in unchanged form.
Technologies for compressing video include compression standards such as MPEG and H.26x. These technologies have been used in many applications such as video players, VOD, video telephony and Digital Multimedia Broadcasting (DMB), and are currently being used for the transmission of video in wireless mobile environments owing to the development of 2.5G/3G wireless communications.
Digital video data is compressed chiefly using three methods, that is, a method of reducing temporal redundancy, a method of reducing spatial redundancy and a method of reducing the statistical redundancy of occurring codes. A representative method of reducing temporal redundancy is the technology for motion estimation and compensation.
Although current coding technologies achieve high coding efficiency by eliminating such temporal redundancy, motion estimation and compensation is also the portion of a video coder that generates the largest computational load. Reducing the complexity of the coder has therefore become an important technological issue in limited-resource environments such as sensor networks.
Distributed Source Coding (DSC) technology based on the Slepian-Wolf theorem is attracting attention as a method for solving the coder complexity problem. The Slepian-Wolf theorem mathematically proves that, if correlated sources are coded independently and decoded jointly, a coding gain equal to that obtained by coding the sources together using predictive coding can be achieved.
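For reference, the theorem is commonly stated as a rate region for two correlated sources. The symbols X and Y (the sources) and R_X and R_Y (the rates of the two independent coders) are introduced here for illustration only and do not appear elsewhere in this description:

```latex
\begin{aligned}
R_X &\ge H(X \mid Y),\\
R_Y &\ge H(Y \mid X),\\
R_X + R_Y &\ge H(X, Y).
\end{aligned}
```

The bound on the sum rate, H(X, Y), is the same rate that would be required if the two sources were coded together, which is the sense in which no coding gain is lost.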
Distributed Video Coding (DVC) technology was established by extending distributed source coding technology, which was applied to lossless compression, to the case of lossy compression. It is based on Wyner-Ziv theory, which extends Slepian-Wolf theory, the theoretical basis of distributed source coding technology, to the case of lossy coding. From the point of view of video coding technology, these two technologies imply that all of the prior art motion estimation and compensation procedures performed to reduce inter-picture redundancy can be moved to the decoder side without any particular loss of coding gain.
Of distributed video coding technologies, the Wyner-Ziv coding technology based on the paper “Wyner-Ziv coding for video: applications to compression and error resilience” published by A. Aaron, S. Rane, R. Zhang and B. Girod in Proc. IEEE Data Compression Conference, 2003 is well known. In this distributed video coding technology, a decoder generates the side information for a current picture using the similarity between neighboring pictures. This side information is regarded as the current picture to be reconstructed with the noise of a virtual channel added to it, and the current picture is reproduced by eliminating the noise from the side information using parity bits transmitted from a coder.
FIG. 1 is a diagram showing the construction of a coder 110 and a corresponding decoder 130 based on prior art Wyner-Ziv coding technology.
As shown in FIG. 1, the coder 110 based on the prior art Wyner-Ziv coding technology includes a key picture encoding unit 114, a quantization unit 111, a block segmentation unit 112, and a channel code encoding unit 113. The decoder 130 corresponding to the coder 110 includes a key picture decoding unit 133, a channel code decoding unit 131, a side information generation unit 134, and a video reconstruction unit 132.
The coder 110 based on Wyner-Ziv coding technology classifies pictures to be coded into two types. One type consists of pictures to be coded using a distributed video coding method (hereinafter referred to as ‘WZ pictures’), and the other type consists of pictures to be coded using a prior art coding method rather than the distributed video coding method (hereinafter referred to as ‘key pictures’).
Key pictures are generally coded by the key picture encoding unit 114 using a method, such as an H.264/AVC intra-picture coding method, selected by a user, and are then transmitted to the decoder 130. The key picture decoding unit 133 of the decoder 130 corresponding to the coder 110 based on the prior art Wyner-Ziv coding technology reconstructs the key pictures that are coded using the predetermined method and then transmitted, and the side information generation unit 134 generates side information corresponding to WZ pictures using the key pictures that are reconstructed by the key picture decoding unit 133.
In general, the side information generation unit 134 generates side information corresponding to a WZ picture to be reconstructed using interpolation that assumes linear motion between the key pictures located on either side of the WZ picture. Although extrapolation may be used in some cases, interpolation is used in most cases because it is superior to extrapolation from the point of view of performance.
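The interpolation described above can be illustrated by the following minimal sketch, which is not the method of any cited codec. It assumes symmetric block matching: each block of the missing picture is taken to lie halfway along a straight motion path between the two key pictures. The function name `interpolate_side_info` and the parameters `block` and `search` are hypothetical, and pictures are represented as lists of rows of integer pixel values.

```python
def interpolate_side_info(prev_key, next_key, block=4, search=2):
    """Estimate the picture midway between two key pictures (lists of rows)."""
    h, w = len(prev_key), len(prev_key[0])
    side = [[0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            bh, bw = min(block, h - by), min(block, w - bx)
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Linear motion: content at (y, x) in the missing picture
                    # sits at (y - dy, x - dx) before and (y + dy, x + dx) after.
                    ys, xs = (by - dy, by + dy), (bx - dx, bx + dx)
                    if (min(ys) < 0 or max(ys) + bh > h
                            or min(xs) < 0 or max(xs) + bw > w):
                        continue  # a displaced block would leave the picture
                    sad = sum(
                        abs(prev_key[ys[0] + i][xs[0] + j]
                            - next_key[ys[1] + i][xs[1] + j])
                        for i in range(bh) for j in range(bw)
                    )
                    cost = (sad, abs(dy) + abs(dx))  # prefer small motion on ties
                    if best is None or cost < best[0]:
                        best = (cost, dy, dx)
            _, dy, dx = best  # (0, 0) is always valid, so best is never None
            for i in range(bh):
                for j in range(bw):
                    a = prev_key[by - dy + i][bx - dx + j]
                    b = next_key[by + dy + i][bx + dx + j]
                    side[by + i][bx + j] = (a + b + 1) // 2  # rounded average
    return side
```

When the two key pictures are identical, the sketch returns the same picture. Where the content shifts uniformly by four pixels between the key pictures, interior blocks are reproduced at the halfway position. Blocks at the picture border fall back to smaller displacements because the displaced blocks must stay inside both key pictures.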
Meanwhile, in order to code a WZ picture, the quantization unit 111 of the coder 110 quantizes the WZ picture, and the block segmentation unit 112 divides the quantized WZ picture into predetermined encoding units. Furthermore, the channel code encoding unit 113 generates parity bits for the respective encoding units using channel codes.
The generated parity bits are stored in a parity buffer (not shown), and are then transmitted sequentially at the request of the decoder 130, which is made via a feedback channel. The channel code decoding unit 131 of FIG. 1 estimates quantization symbols with reference to the side information and the parity bits transmitted by the coder 110. The video reconstruction unit 132 of FIG. 1 receives the quantization symbols estimated by the channel code decoding unit 131, inverse-quantizes the quantization symbols, and plays back the reconstructed WZ picture.
In the above process, ambiguity occurring during inverse quantization is resolved with reference to the side information supplied by the side information generation unit 134.
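One common way of resolving this ambiguity may be sketched as follows, under the assumption of a uniform scalar quantizer: the decoded quantization symbol fixes an interval of pixel values, and the side information selects the value within that interval. The function names and the `step` parameter are hypothetical and are used for illustration only.

```python
def quantize(pixel, step):
    """Uniform scalar quantizer: map a pixel value to its bin index."""
    return pixel // step


def reconstruct(index, side_pixel, step):
    """Inverse-quantize one symbol with the help of the side information.

    The symbol only says that the true pixel lies in [lo, hi]. If the
    side-information pixel lies inside that bin it is used directly;
    otherwise it is clipped to the nearest bin boundary.
    """
    lo = index * step
    hi = lo + step - 1
    return min(max(side_pixel, lo), hi)
```

For example, with `step = 16` a pixel value of 100 falls in bin 6, which covers the values 96 to 111. A side-information value of 104 is kept as it is, whereas values of 90 and 130 are clipped to 96 and 111, respectively, so the reconstruction error never exceeds the width of one bin provided that the symbol was decoded correctly.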
Basically, the decoding method of Wyner-Ziv coding technology is to correct the noise of the side information using the channel code. However, since the encoder does not have channel information, it cannot know how many parity bits are required to correct the noise, with the result that the decoder is configured to successively request parity bits from the encoder via the feedback channel. For a detailed description thereof, refer to the paper “Wyner-Ziv coding for video: applications to compression and error resilience,” published by A. Aaron, S. Rane, R. Zhang, and B. Girod in Proc. IEEE Data Compression Conference, 2003.
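The request loop described above may be illustrated by the following toy sketch, which makes several assumptions that are not taken from the cited paper. The channel code is the parity-check matrix of a (7,4) Hamming code rather than a turbo or LDPC code, the side information is assumed to differ from the source in at most one bit, and the decoder judges success by comparing a CRC-32 value that the encoder is assumed to send alongside the parity bits. The names `ToyEncoder`, `best_guess` and `decode_with_feedback` are hypothetical.

```python
import zlib
from itertools import combinations

# Parity-check matrix of the (7,4) Hamming code: column j (1-based) is j in binary.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def parity(row, bits):
    return sum(r & b for r, b in zip(row, bits)) & 1


def checksum(bits):
    return zlib.crc32(bytes(bits))


class ToyEncoder:
    """Holds the parity buffer and releases one parity bit per request."""

    def __init__(self, bits):
        self.buffer = [parity(row, bits) for row in H]
        self.crc = checksum(bits)  # assumed success check, sent up front

    def send_next(self):
        return self.buffer.pop(0) if self.buffer else None


def best_guess(side_bits, received):
    """Fewest-flip (0 or 1) correction of the side information that
    satisfies every parity bit received so far."""
    for k in (0, 1):
        for positions in combinations(range(len(side_bits)), k):
            cand = list(side_bits)
            for p in positions:
                cand[p] ^= 1
            if all(parity(H[i], cand) == s for i, s in enumerate(received)):
                return cand
    return None


def decode_with_feedback(side_bits, encoder):
    """Request parity bits one at a time until the guess passes the check."""
    received = []
    while True:
        guess = best_guess(side_bits, received)
        if guess is not None and checksum(guess) == encoder.crc:
            return guess, len(received)
        bit = encoder.send_next()  # request made over the feedback channel
        if bit is None:
            return None, len(received)  # buffer exhausted, decoding failed
        received.append(bit)
```

In this sketch the number of parity bits consumed depends on where the side information is wrong. For the source bits `[1, 0, 1, 1, 0, 0, 1]`, an error in the first bit is corrected after one request, whereas an error in the sixth bit requires all three parity bits. This illustrates why the encoder cannot fix the amount of parity in advance.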
In the meantime, the feedback channel-based decoding method of the Wyner-Ziv coding technology has the advantage of enabling the update of side information using the results of each decoding process. For a detailed description thereof, refer to the paper “Motion compensated refinement for low complexity pixel based distributed video coding,” published by J. Ascenso, C. Brites, and F. Pereira in Proc. of IEEE International Conference on Advanced Video and Signal Based Surveillance, 2005 and the paper “Embedded side information refinement for pixel domain Wyner-Ziv video coding towards UMTS 3G application,” published by Z. Xue, K. K. Loo, and J. Cosmas in Proc. of IEEE Intern. Conf. on Multimedia and Expo, 2007.
However, the method of updating side information using reconstruction results is applicable only when the reliability of the reconstructed data is sufficiently high. Otherwise, this method has a problem in that the quality of the side information deteriorates continuously.
In the Wyner-Ziv decoding process, most of the energy of a WZ picture is reconstructed through the decoding of the channel code. Therefore, in order to obtain reconstructed data with sufficiently high reliability, the reliability of the decoding results of the channel code must be sufficiently high. However, if a large number of channel code decoding errors have occurred because of a large amount of noise in the side information or a lack of received parity bits, the reliability of the data decoded by the channel code decoding unit (hereinafter referred to as the ‘channel code decoded data reliability’) and the reliability of the video reconstructed by the video reconstruction unit (hereinafter referred to as the ‘reconstructed video reliability’) are considerably low. Accordingly, in the case where it is hard to generate accurate side information because the key pictures have large numbers of quantization errors or the motion between pictures is complex or fast, the problem of the continuous deterioration of the quality of the side information becomes more serious.
As a result, there is a strong need for a technology that measures the reliability of reconstructed data, determines whether side information can be improved based on the reconstruction results, and successively improves the side information depending on the determination, thereby improving rate-distortion performance.