1. Field of the Invention
The present invention relates to the encoding/decoding technology of video data, and more particularly to a technology for reducing video signals in the pre-stage process of encoding.
2. Description of the Related Art
Since the amount of video data is overwhelmingly large compared with that of audio data or character data, compression/encoding of the data is indispensable for its processing and transfer.
For this encoding, various methods have been proposed.
For example, Patent document 1 (Japan Patent Application Laid-open No. 2007-53788) discloses a compressing/encoding method capable of calculating the degree of encoding difficulty of the pattern of each piece of image data and freely modifying the size of data after compression by appropriately compressing/encoding it with a compression ratio according to the calculated degree of difficulty.
Patent document 2 (Japan Patent Application Laid-open No. H10-164581) also discloses a technology for encoding image data whose image patterns are not uniform while making its image quality uniform. Patent document 2 calculates, for each macroblock (MB), the degree of encoding difficulty and a weight coefficient indicating the conspicuousness of image degradation, calculates the degree of complexity of the MB on the basis of the encoding difficulty and the weight coefficient, and calculates the encoding quantization scale of each MB using the degree of complexity.
Furthermore, Patent document 3 (Japan Patent Application Laid-open No. H11-196437) discloses a motion detection method for detecting the motion of video signals between frames and between fields by using a mixed signal obtained by adding its luminance and chrominance at a certain ratio, in order to improve the compression efficiency of image data.
Video data to be compressed is either a 4:2:2 format signal, in which the luminance (Y) has twice as many pixels as each chrominance (Cb, Cr, Pb, Pr or U, V), or a 4:4:4 format signal, in which the luminance has the same number of pixels as each chrominance. However, in most video encoding, video data is encoded after its format is converted to a 4:2:0 format signal or a 4:2:2 format signal, in which the number of chrominance pixels is reduced by sub-sampling. This is done either because the total number of pixels processed at the pixel signal level should be reduced in order to improve the compression ratio, or because, from the viewpoint of human visual characteristics, degradation is more easily perceived in the luminance than in the chrominance.
Generally, an encoding method that reduces the number of processed chrominance pixels contributes to improving compression efficiency while maintaining quality as much as possible.
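As a rough illustration of why reducing the chrominance lowers the amount of data to be processed, the following sketch counts the samples per frame for each format. The function name samples_per_frame and the 1920×1080 frame size are assumptions made for this example and are not taken from the cited documents.

```python
# Hypothetical helper: counts the Y, Cb and Cr samples in one frame.
# Each entry gives the (horizontal, vertical) divisor applied to a chrominance plane.
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),  # chrominance has the same resolution as luminance
    "4:2:2": (2, 1),  # chrominance halved horizontally
    "4:2:0": (2, 2),  # chrominance halved horizontally and vertically
}


def samples_per_frame(width, height, fmt):
    """Return the total number of Y + Cb + Cr samples in one frame."""
    h_div, v_div = CHROMA_DIVISORS[fmt]
    luma = width * height
    chroma = (width // h_div) * (height // v_div)  # one chrominance plane
    return luma + 2 * chroma


if __name__ == "__main__":
    for fmt in CHROMA_DIVISORS:
        print(fmt, samples_per_frame(1920, 1080, fmt))
```

Under these assumptions, a 4:2:0 frame carries half the samples of a 4:4:4 frame and three quarters of those of a 4:2:2 frame before any further compression is applied.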
FIG. 1 shows a simplified concept of the video format.
A video is a group of a plurality of still images and consists of a plurality of frames. The frame memory of an encoding device stores a plurality of these frames.
In the case of a 4:2:2 format signal, when one of the still image frames is extracted, it is composed at the ratio of Y:Cb:Cr=4:2:2. Specific examples of this 4:2:2 format signal include ITU-R Rec. 709, ITU-R Rec. 656 and the like.
One frame is composed of a plurality of MBs, and each MB is composed of four blocks of Y, one block of Cb and one block of Cr. One block is composed of 8×8 pixels.
Each still image of this 4:2:2 format signal is converted to a video format signal with the ratio Y:Cb:Cr=4:1:1, called a 4:2:0 format signal, and is then encoded. In this image encoding process, each still image is divided into sub-blocks called macroblocks (MBs) and is encoded MB by MB, using the video signal converted to the 4:2:0 format.
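The division of a 4:2:0 frame into MBs can be sketched as follows. The sketch assumes that the four Y blocks of an MB are arranged 2×2, so that one MB covers 16×16 luminance pixels and 8×8 pixels of each chrominance, and that the frame dimensions are multiples of 16. The function names crop and split_into_macroblocks are hypothetical.

```python
BLOCK = 8        # one block is 8x8 pixels
MB = 2 * BLOCK   # assumed: four Y blocks arranged 2x2, i.e. 16x16 luminance pixels


def crop(plane, top, left, size):
    """Cut a size x size square out of a plane stored as a list of rows."""
    return [row[left:left + size] for row in plane[top:top + size]]


def split_into_macroblocks(y, cb, cr):
    """Yield one dict per MB of a 4:2:0 frame: four Y blocks, one Cb, one Cr."""
    height, width = len(y), len(y[0])
    if height % MB or width % MB:
        raise ValueError("this sketch assumes dimensions that are multiples of 16")
    for top in range(0, height, MB):
        for left in range(0, width, MB):
            yield {
                # Y blocks in the order: top-left, top-right, bottom-left, bottom-right
                "Y": [crop(y, top + dy, left + dx, BLOCK)
                      for dy in (0, BLOCK) for dx in (0, BLOCK)],
                # 4:2:0 chrominance planes are half size in both directions
                "Cb": crop(cb, top // 2, left // 2, BLOCK),
                "Cr": crop(cr, top // 2, left // 2, BLOCK),
            }
```

For a 32×48 luminance plane with 16×24 chrominance planes, this yields six MBs, each holding four 8×8 Y blocks and one 8×8 block of each chrominance.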
FIGS. 2 and 3 show the configurations of the conventional encoding/decoding devices, respectively.
The encoding device 1a shown in FIG. 2 comprises a chrominance reduction/conversion unit 11, a frame memory 12, a motion vector probe unit 13, a motion prediction unit 14, an orthogonal transform (T(DCT)) unit 15, a quantization (Q) unit 16, a variable length encoding (VLC) unit 17, an inverse quantization (IQ) unit 18, an inverse orthogonal transform (IDCT) unit 19, an adder 20 and a subtractor 21.
The chrominance reduction/conversion unit 11 reduces the chrominance of an inputted video signal, for example, from a 4:2:2 format signal to a 4:2:0 format signal. The frame memory 12 mainly stores frame data used for motion prediction, namely past and future image data. The motion vector probe unit 13 reads an original image macroblock 22 and a reference block 23 from the frame memory 12 and, on the basis of them, calculates a motion vector, which is the amount of movement of the reference block 23 from the original image macroblock 22. The motion vector selected is the one that minimizes the prediction residual signal under a certain criterion (sum of absolute values or sum of squares). The motion prediction unit 14 performs forward prediction, backward prediction and bidirectional prediction on the basis of the reference frame in the frame memory 12 and the motion vector calculated by the motion vector probe unit 13 and generates a prediction frame.
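The selection of a motion vector by minimizing the residual can be illustrated with a full-search block matching sketch that uses the sum of absolute differences as the criterion. The search range of ±4 pixels, the sign convention of the vector (position of the best reference block relative to the original block) and all function names are assumptions of this example; the text does not specify how the motion vector probe unit 13 performs its search.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def find_motion_vector(original, reference, top, left, size=16, search=4):
    """Full search over +/- search pixels.

    Returns ((dy, dx), cost) for the reference block that minimizes the SAD
    against the original block whose top-left corner is (top, left).
    """
    height, width = len(reference), len(reference[0])
    target = crop(original, top, left, size)
    best_vector, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + size > height or c + size > width:
                continue  # candidate block would fall outside the frame
            cost = sad(target, crop(reference, r, c, size))
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost
```

Replacing sad with a sum of squared differences would give the other criterion mentioned above without changing the search itself.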
The subtractor 21 subtracts the prediction frame calculated by the motion prediction unit 14 from the original image macroblock 22 and outputs the difference to the orthogonal transform unit 15. The orthogonal transform unit 15 applies a discrete cosine transform (DCT) to the motion-compensated pixels for each 8×8 block. The quantization unit 16 quantizes the DCT transform coefficients, taking visual characteristics into consideration. The variable length encoding unit 17 converts the quantization values to a variable length code, such as a Huffman code or the like, and outputs the code.
The inverse quantization unit 18 reversely converts the quantization values to DCT transform coefficients. The inverse orthogonal transform unit 19 reversely converts the DCT transform coefficients calculated by the inverse quantization unit 18 to 8×8 blocks of pixel data. The adder 20 adds the motion-compensated prediction frame outputted from the motion prediction unit 14 to the differential pixel data outputted from the inverse orthogonal transform unit 19 and writes the resulting pixel data, which contains the distortion introduced by compression, into the frame memory 12 as a new reference frame.
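The orthogonal transform, quantization and inverse quantization steps described above can be sketched for a single 8×8 block as follows. The sketch uses an orthonormal DCT-II and a single uniform quantization step of 16. Both are simplifying assumptions: a quantizer that takes visual characteristics into consideration would typically use a different step for each frequency, and the function names are hypothetical.

```python
import math

N = 8  # block size used in the text


def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of an N x N block."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def quantize(coeffs, step=16):
    """Uniform quantization: map each coefficient to an integer level."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def dequantize(levels, step=16):
    """Inverse quantization: map each level back to a representative value."""
    return [[level * step for level in row] for row in levels]
```

The values returned by dequantize differ from the original coefficients by at most half a step, which is the distortion that the encoder deliberately carries into its reference frame so that it predicts from the same data as the decoder.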
In the encoding device 1a having such a configuration, when a video signal is inputted, the chrominance reduction/conversion unit 11 applies format conversion for reducing the chrominance to the video signal, and the converted signal is stored in the frame memory 12. Then, an image data compression process, such as MPEG or the like, is applied to the video signal and the resulting code is outputted.
In the video encoding device 1a, the enormous amount of information of an original signal is compressed by eliminating redundancy in the time and spatial directions. More specifically, for the time direction, a motion compensation method that removes redundancy by taking the difference from previous and subsequent frames using a motion vector is used. For the spatial direction, an orthogonal transform that transforms the horizontal/vertical plane of a screen into frequency components is used together with the representative values of the orthogonal transform coefficients obtained by quantization or the like. Data compression is also performed using variable length encoding as an arithmetic information compression method.
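As an illustration of variable length encoding, the following sketch builds a Huffman code, the example of a variable length code named earlier in this description, so that frequent quantization values receive short codes. It is a generic textbook construction with hypothetical function names, not the code table of any particular standard or of the encoding device 1a.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from a sequence of symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: partial_code}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)   # the two least frequent subtrees
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, code):
    """Concatenate the code words of all symbols into one bit string."""
    return "".join(code[s] for s in symbols)
```

For a sequence dominated by zeros, as is typical after quantization, the zero symbol receives the shortest code word, so the encoded length falls below that of a fixed-length representation.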
In the conventional video encoding device 1a, compression efficiency has been improved by performing the resolution conversion of a chrominance using the chrominance reduction/conversion unit 11, for example, converting a 4:2:2 format input signal to a 4:2:0 format signal.
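The conversion from a 4:2:2 format signal to a 4:2:0 format signal amounts to halving the vertical resolution of each chrominance plane. A minimal sketch follows, assuming that each pair of adjacent chrominance lines is replaced by its rounded average. The text does not specify the filter used by the chrominance reduction/conversion unit 11, so the averaging and the function name are assumptions.

```python
def reduce_chroma_422_to_420(chroma_plane):
    """Halve the vertical resolution of one chrominance plane (Cb or Cr).

    Each output line is the rounded average of two adjacent input lines.
    Averaging is an assumption of this sketch, not a filter taken from the text.
    """
    if len(chroma_plane) % 2:
        raise ValueError("this sketch assumes an even number of lines")
    reduced = []
    for top, bottom in zip(chroma_plane[0::2], chroma_plane[1::2]):
        reduced.append([(a + b + 1) // 2 for a, b in zip(top, bottom)])
    return reduced
```

The luminance plane is left untouched by this conversion; only Cb and Cr are passed through the function.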
Next, a conventional decoding device 2a is described.
FIG. 3 shows a configuration example of the conventional decoding device 2a. 
The decoding device 2a shown in FIG. 3 comprises a variable length decoding (VLD) unit 31, an inverse quantization (IQ) unit 32, an inverse orthogonal transform (IDCT) unit 33, a motion compensation unit 34, an adder 35, a frame memory 36 and a chrominance extension/conversion unit 37.
The variable length decoding unit 31 converts the variable length code, such as a Huffman code or the like, to a quantization value. The inverse quantization unit 32 reversely converts the quantization value to a DCT transform coefficient. The inverse orthogonal transform unit 33 reversely converts the DCT transform coefficients calculated by the inverse quantization unit 32 to 8×8 blocks of pixel data. In the case of an I picture, the pixel data calculated here is the actual pixel data itself, whereas in the case of a P or B picture, it is a differential value between two pieces of pixel data. The motion compensation unit 34 calculates a block compensated for by the motion vector used in the encoding device 1a. The adder 35 adds the differential value outputted from the inverse orthogonal transform unit 33 and the motion-compensated block outputted from the motion compensation unit 34 to calculate a P or B picture. The frame memory 36 stores the pixel data of the calculated frame. The chrominance extension/conversion unit 37 converts, for example, a 4:2:0 format signal whose chrominance was reduced by the encoding device 1a to a 4:2:2 format signal by compensating for the chrominance as a post-process.
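The difference between the handling of an I picture and that of a P or B picture at the adder 35 can be sketched as follows. The clipping to an 8-bit range of 0 to 255 and the function names are assumptions of this example.

```python
def clip(value, low=0, high=255):
    """Keep a sample inside the assumed 8-bit range."""
    return max(low, min(high, value))


def reconstruct_block(decoded, prediction=None):
    """Return the pixel block for one decoded block.

    decoded:    output of the inverse orthogonal transform
    prediction: motion-compensated block, or None for an I picture
    """
    if prediction is None:
        # I picture: the decoded values are the pixel data themselves.
        return [[clip(v) for v in row] for row in decoded]
    # P or B picture: the decoded values are differences from the prediction.
    return [[clip(d + p) for d, p in zip(d_row, p_row)]
            for d_row, p_row in zip(decoded, prediction)]
```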
In the conventional decoding device 2a, since the reverse of the encoding process of the encoding device 1a is performed, many of its components are the same as those included in the encoding device 1a. In the decoding device 2a, a differential value is calculated by the reverse of the encoding process of the encoding device 1a, and in the motion compensation unit 34 it is decoded using the motion vector determined by the encoding device 1a. In the chrominance extension/conversion unit 37, an extension process is applied to the decoded signal, whose chrominance was reduced/converted at the pre-stage of the encoding process, and the result is transmitted as a video signal.
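The extension process from a 4:2:0 format signal back to a 4:2:2 format signal can be sketched as doubling the vertical resolution of each chrominance plane. The sketch below simply repeats each chrominance line, which is an assumption made for brevity; the text does not specify the interpolation used by the chrominance extension/conversion unit 37, and the function name is hypothetical.

```python
def extend_chroma_420_to_422(chroma_plane):
    """Double the vertical resolution of one chrominance plane (Cb or Cr)
    by repeating each line. Line repetition is an assumption of this sketch."""
    extended = []
    for line in chroma_plane:
        extended.append(list(line))
        extended.append(list(line))
    return extended
```

Whatever interpolation is used, the vertical chrominance detail discarded before encoding is not recovered by this step, which is why chrominance degradation can remain visible after decoding.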
However, in a video signal to be encoded/decoded, degradation of the luminance is not always the more easily recognized, and depending on the scene, the chrominance is sometimes more characteristic than the luminance.
In such a scene, the degradation of the chrominance becomes conspicuous when the scene is encoded. Nevertheless, in video encoding, when a 4:2:2 format signal is inputted, it is often converted to a 4:2:0 format signal.
Recently, in home appliances such as HDTV-compatible TVs, and in movies and the like, chrominance has come to be expressed more richly and color depth can be expressed in more detail. From this point of view too, it can be said that processing that emphasizes the chrominance component has become widespread.