The present invention relates to a code quantity control apparatus, a code quantity control method and so on, which are used in reception of picture information transmitted by way of a transmission means such as a satellite broadcasting system, a cable TV system or a network such as the Internet or used in the processing of picture information on a storage medium, such as an optical or magnetic disk, wherein the information has been compressed by orthogonal transformation, such as discrete cosine transformation, and by motion compensation, as is the case with information conforming to an MPEG (Moving Picture Experts Group) system. The present invention also relates to a picture information transformation method.
In recent years, picture information has come to be handled as digital data. In order to allow the picture information to be transmitted and stored with a high degree of efficiency, there has been provided an apparatus conforming to a system such as an MPEG system for compressing the picture information by orthogonal transformation, such as discrete cosine transformation, and by motion compensation, both of which take advantage of the redundancy inherent in the picture information. Such an apparatus is becoming popular both for information distribution by a broadcasting station or the like and for reception of information at an ordinary home.
In particular, the MPEG2 system (ISO/IEC 13818-2) is defined as a general-purpose picture encoding system. The MPEG2 system is a standard covering both interlaced (jump-scanned) pictures and progressive (sequentially scanned) pictures, as well as both standard-resolution pictures and high-definition pictures. The MPEG2 system is expected to be used in a broad range of applications, including professional and consumer applications. In accordance with the MPEG2 compression system, a code quantity of 4 to 8 Mbps is allocated to a jump-scanned picture with a standard resolution of 720×480 pixels, and a code quantity of 18 to 22 Mbps is allocated to a jump-scanned picture with a high resolution of 1,920×1,088 pixels. Thus, it is possible to realize a high compression ratio and a high picture quality.
However, the amount of picture information in high resolution pictures is very large. Even when the picture information is compressed by adopting an encoding compression system such as the MPEG system, a code quantity of about 18 to 22 Mbps or even greater is required in order to obtain a sufficiently high picture quality for a jump-scanned picture with a frame frequency of 30 Hz and a resolution of 1,920×1,080 pixels. When transmitting picture information by way of network media, such as a satellite broadcasting system or a cable TV system, or when processing picture information on a storage medium, such as an optical disk or a magnetic disk, it is necessary to further reduce the quantity of code while reducing deterioration of the picture quality to a minimum. The quantity of code is reduced in accordance with the bandwidth of the transmission line of the network media or in accordance with the storage capacity of the storage medium. Such transmission and such processing are not limited to pictures with a high resolution. That is to say, even in the case of transmission of a picture with a standard resolution by way of such network media and processing of the picture on such a storage medium, it is necessary to further reduce the quantity of code while reducing deterioration of the picture quality to a minimum. An example of a picture with a standard resolution is a jump-scanned picture with a frame frequency of 30 Hz and a resolution of 720×480 pixels.
As means for solving the problems described above, a hierarchical encoding technique (scalability technique) and a transcoding technique have been provided. In the MPEG2 standard, an SNR (Signal-to-Noise Ratio) scalability technique is standardized to allow high-SNR picture compressed information and low-SNR picture compressed information to be encoded hierarchically. In order to carry out a hierarchical encoding process, however, it is necessary to know the bandwidth or the constraint conditions of a storage capacity at the time of encoding. For an actual system, the bandwidth or the constraint conditions of a storage capacity are not known at the time of encoding in most cases. Thus, the SNR scalability technique is not an appropriate means for an actual system implementing an encoding process with a high degree of freedom, such as a picture information transformation method.
A picture information transformation apparatus, which is referred to as a transcoder, basically includes a decoding unit and an encoding unit, which are connected to each other in series. The decoding unit is used for carrying out a decoding process or a partial decoding process on input picture compressed information, and the encoding unit is used for re-encoding data output by the decoding unit. The configuration of a transcoder can be classified into two conceivable categories. In the first category, data is supplied from the decoding unit to the encoding unit in the pixel domain. In the second category, on the other hand, data is supplied from the decoding unit to the encoding unit in the frequency domain. In the first category, the amount of processing is large, but deterioration of a decoded picture of the compressed information can be suppressed to a minimum. The transcoder of the first category is mainly used in applications such as a broadcasting apparatus. In the second category, the picture quality deteriorates to a certain degree in comparison with the first category, but the transcoder can be implemented with only a small amount of processing. For this reason, the transcoder of the second category is mainly used in applications such as a consumer apparatus.
The following description explains, by referring to drawings, the configuration of a picture information transformation apparatus which transfers data from the decoding unit to the encoding unit in the frequency domain.
As shown in FIG. 11, the picture information transformation apparatus 100 includes a code buffer 101, a compressed information analysis unit 102, an information buffer 103, a variable length decoding unit 104, an inverse quantization unit 105, an adder 106, a band limiting unit 107, a quantization unit 108, a code quantity control unit 109, a code buffer 110 and a variable length encoding unit 111. It should be noted that the picture information transformation apparatus 100 may also include a motion compensation error correction unit 112. In this case, deterioration of the picture quality can be avoided, but the circuit scale inevitably becomes large.
The principle of operation of the picture information transformation apparatus 100 will now be explained.
Input picture compressed information having a large code quantity or a high bit rate is stored in the code buffer 101. The picture compressed information has been encoded so as to satisfy constraint conditions of a VBV (Video Buffering Verifier) prescribed by the MPEG2 standard. Neither overflow nor underflow occurs in the code buffer 101.
The picture compressed information stored in the code buffer 101 is then supplied to the compressed information analysis unit 102, which extracts information from the picture compressed information in accordance with a syntax prescribed by the MPEG2 standard. The subsequent re-encoding process is carried out in accordance with the extracted information. In particular, information required in operations carried out by the code quantity control unit 109, which will be described later, such as a quantization value (q_scale) for each macroblock and picture_coding_type, is stored in the information buffer 103.
First, with regard to the direct current component of an intra macroblock, the variable length decoding unit 104 carries out a variable length decoding process on data encoded as a difference from an adjacent block. With regard to the other coefficients, the variable length decoding unit 104 carries out a variable length decoding process on data that has undergone a run and level encoding process. As a result, quantized discrete cosine transformation coefficients are obtained as 1-dimensional data. Then, the variable length decoding unit 104 rearranges the quantized discrete cosine transformation coefficients obtained as a result of the decoding process into 2-dimensional data on the basis of information on the technique of scanning the picture. Typical scanning techniques include a zigzag scanning technique and an alternate scanning technique. The information on the scanning technique has been extracted by the compressed information analysis unit 102 from the input picture compressed information.
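The rearrangement of the 1-dimensional coefficients into 2-dimensional data can be illustrated by the following minimal sketch for the zigzag scanning technique only. The function names `zigzag_order` and `to_block` are hypothetical and are not taken from the apparatus described here; the alternate scanning technique uses a different table defined by the MPEG2 standard, which is not reproduced in this sketch.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        cells = [(r, s - r) for r in rows]  # listed from top-right to bottom-left
        # Even diagonals are traversed from bottom-left to top-right,
        # odd diagonals from top-right to bottom-left.
        order.extend(reversed(cells) if s % 2 == 0 else cells)
    return order


def to_block(coeffs, order, n=8):
    """Place 1-dimensional coefficients into an n x n block following `order`."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(coeffs, order):
        block[r][c] = value
    return block
```

Passing the 64 decoded coefficients of one block together with `zigzag_order()` to `to_block` yields the 8×8 array that an inverse quantization step would operate on.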
In the inverse quantization unit 105, the quantized discrete cosine transformation coefficients, which have been rearranged into 2-dimensional data as described above, are subjected to an inverse quantization process based on information on a quantization width (quantization scale) and information on a quantization matrix. These pieces of information have also been extracted by the compressed information analysis unit 102 from the input picture compressed information.
Discrete cosine transformation coefficients output by the inverse quantization unit 105 are supplied to the band limiting unit 107 for reducing the number of horizontal direction high band components for each block. 8×8 discrete cosine transformation coefficients output by the band limiting unit 107 are quantized by the quantization unit 108 at a quantization width (quantization scale) determined by the code quantity control unit 109 by adoption of a technique to be described later.
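As an illustration of the band limiting step, the following minimal sketch zeroes the high band components in the horizontal direction of one 8×8 block. It assumes that the block is stored as rows of vertical frequency and columns of horizontal frequency, and that a hard cutoff is used; the name `limit_horizontal_band` and the parameter `keep_columns` are hypothetical, and the actual band limiting unit 107 may shape the band differently.

```python
def limit_horizontal_band(block, keep_columns):
    """Return a copy of `block` in which every coefficient whose horizontal
    frequency index is keep_columns or higher has been set to zero."""
    return [
        [value if col < keep_columns else 0 for col, value in enumerate(row)]
        for row in block
    ]
```

Because the zeroed coefficients produce no run and level symbols of their own, a smaller `keep_columns` reduces the code quantity at the cost of horizontal detail.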
The principle of operation of the code quantity control unit 109 is explained as follows.
In accordance with a method adopted in MPEG2 Test Model 5 (ISO/IEC JTC1/SC29/WG11 N0400), code quantity control is carried out in three stages. First, the number of bits allocated to each picture in a GOP (Group of Pictures) is determined. This determination is referred to hereafter as stage 1. It is based on the number of bits to be allocated to the unencoded pictures in the GOP, which include the picture serving as the object of bit allocation. Next, a quantization scale is found by feedback control executed in macroblock units on the basis of the sizes of 3 types of virtual buffers set independently for each picture type, in order to make the actual code quantity match the number of bits allocated to each picture at stage 1. The operation to find a quantization scale is referred to hereafter as stage 2. Finally, the quantization scale found at stage 2 is changed in accordance with the activity of each macroblock so as to result in finer quantization for flat portions of the picture, where deterioration is easily noticed, and coarser quantization for complicated portions of the picture, where deterioration is hardly noticed. The operation to change the quantization scale is referred to hereafter as stage 3. MPEG2 picture information encoding apparatuses put to practical use also execute code quantity control in accordance with an algorithm conforming to the method prescribed by Test Model 5.
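Stage 3 can be sketched as follows. The normalized activity formula and the clipping range of 1 to 31 are those commonly cited for Test Model 5 and are not stated in this description, so they should be read as assumptions; the rounding mode and the function names `normalized_activity` and `modulated_quantizer` are likewise hypothetical.

```python
def normalized_activity(act, avg_act):
    """Normalized activity of one macroblock; the value lies between 0.5 and 2.

    act     -- activity of the current macroblock (assumed to be >= 1)
    avg_act -- average activity of the previously encoded picture
    """
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)


def modulated_quantizer(q_scale, act, avg_act):
    """Scale the stage 2 quantization scale by the normalized activity.

    Flat macroblocks (low activity) receive a finer quantization scale and
    complicated macroblocks (high activity) receive a coarser one.
    """
    mquant = round(q_scale * normalized_activity(act, avg_act))
    return max(1, min(31, mquant))  # clip to the assumed legal range
```

The sketch takes the activity as an input. In an encoding apparatus the activity is computed from pixel values of the original picture, which are not available to a transcoder.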
If this method is adopted as it is in a picture information transformation apparatus 100 like the one shown in FIG. 11, however, two problems will arise. The first one is a problem related to stage 1. Specifically, in the case of an MPEG2 picture information encoding apparatus, the GOP structure is given in advance so that the operation in stage 1 can be executed. In the case of the picture information transformation apparatus 100, on the other hand, the GOP structure is not known until a syntax analysis is carried out on the entire information of one GOP of the input picture compressed information. In addition, the length of a GOP is not necessarily fixed. In the case of an MPEG2 picture information encoding apparatus for practical use, a scene change may be detected and the length of a GOP may be controlled in an adaptive manner in the picture compressed information.
The second problem is related to stage 3. Specifically, in the case of an MPEG2 picture information encoding apparatus, an activity is computed from luminance signal pixel values of the original picture. In the case of the picture information transformation apparatus 100, however, compressed information of an MPEG2 picture is input. Thus, since it is impossible to know luminance signal pixel values of the original picture, an activity cannot be calculated.
As a method to solve the first problem, a pseudo GOP is defined, and code quantity control is executed on the basis of the defined pseudo GOP. A pseudo GOP includes I, P and B pictures. An I picture is a picture encoded in an encoding process based on information in 1 frame. A P picture is a picture encoded by forward directional prediction based on a plurality of previously encoded frames. A B picture is a picture encoded by bi-directional prediction based on previously encoded frames as well as frames to be encoded at later times. The length of a pseudo GOP varies depending on how frames of the picture compressed information are detected as I pictures.
Assume that the structure of the pseudo GOP determined as described above is {B1, B2, P1, B3, B4, I1, B5, B6, ..., PL, BM−1, BM}. In this case, the size L_pgop of the pseudo GOP is expressed by the following equation:

$$L_{pgop} = 1 + L + M \tag{1}$$
For the pseudo GOP, target code quantities Ti, Tp and Tb of the I, P and B pictures are expressed by Eqs. (2), (3) and (4) respectively.
$$T_i = \frac{K_p K_b X(I)}{K_p K_b X(I) + K_b \sum_{i \in \Omega} X(P_i) + K_p \sum_{i \in \Omega} X(B_i)} \times R \tag{2}$$

$$T_p = \frac{K_b X(P)}{K_p K_b X(I) + K_b \sum_{i \in \Omega} X(P_i) + K_p \sum_{i \in \Omega} X(B_i)} \times R \tag{3}$$

$$T_b = \frac{K_p X(B)}{K_p K_b X(I) + K_b \sum_{i \in \Omega} X(P_i) + K_p \sum_{i \in \Omega} X(B_i)} \times R \tag{4}$$

where notations Θ and Ω denote an already encoded frame in the pseudo GOP and the frame in the pseudo GOP to be encoded respectively. Let notations F and B denote a frame rate and the code quantity of output picture compressed information respectively. In this case, Eqs. (5) and (6) are obtained as follows.

$$R_0 = \frac{B}{F} \times L_{pgop} \tag{5}$$

$$R = R_0 - \sum_{x \in \Theta} \mathrm{generated\_bit}(x) \tag{6}$$
Notation X(·) denotes a global complexity measure parameter representing the complexity of a frame. Let notations Q and S denote respectively the average quantization scale and the total code quantity of the frame, which are found in advance during a pre-parsing process carried out by the compressed information analysis unit 102 shown in FIG. 11. In this case, this global complexity measure parameter can be expressed by Eq. (7) as follows.

$$X = S \cdot Q \tag{7}$$
As prescribed by MPEG2 Test Model 5, notations Kp and Kb denote a ratio of the quantization scale of the P picture to the quantization scale of the I picture and a ratio of the quantization scale of the B picture to the quantization scale of the I picture respectively. With the ratios having values indicated by Eq. (8), the picture quality as a whole is assumed to always be optimized.

$$K_p = 1.0; \quad K_b = 1.4 \tag{8}$$
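The bit allocation of Eqs. (2) to (8) can be sketched as follows. This is a minimal illustration under the assumption that the complexities of the frames remaining in the pseudo GOP are already available from the pre-parsing process; the function and parameter names are hypothetical and do not appear in the apparatus described here.

```python
def complexity(total_bits, avg_q_scale):
    """Eq. (7): X = S * Q."""
    return total_bits * avg_q_scale


def remaining_bits(bit_rate, frame_rate, l_pgop, generated_bits):
    """Eqs. (5) and (6): budget of the pseudo GOP minus the bits already spent.

    generated_bits -- code quantities of the already encoded frames (set Theta)
    """
    r0 = bit_rate / frame_rate * l_pgop
    return r0 - sum(generated_bits)


def target_bits(pic_type, x_cur, x_i, x_p_rest, x_b_rest, r, kp=1.0, kb=1.4):
    """Eqs. (2) to (4): target code quantity of the current picture.

    pic_type -- "I", "P" or "B"
    x_cur    -- complexity of the current picture
    x_i      -- complexity X(I) of the I picture
    x_p_rest -- complexities of the P pictures in the set Omega
    x_b_rest -- complexities of the B pictures in the set Omega
    r        -- remaining bits R from Eq. (6)
    kp, kb   -- ratios of Eq. (8) by default
    """
    denominator = kp * kb * x_i + kb * sum(x_p_rest) + kp * sum(x_b_rest)
    weight = {"I": kp * kb, "P": kb, "B": kp}[pic_type]
    return weight * x_cur / denominator * r
```

For a pseudo GOP that still contains exactly one I, one P and one B picture, the three targets returned by `target_bits` add up to `r`, which is a convenient sanity check on the weights.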
Instead of using the values given in Eq. (8), as a conceivable alternative, the ratios Kp and Kb can also be computed dynamically from the complexity of each frame of input MPEG2 picture compressed information, as is described in the reference titled “Mathematical Analysis of MPEG Compression Capability and Its Application to Rate Control”, Jiro Katto and Mutsumi Ohta, (IE95-10, DSP95-10, April, 1995). To put it concretely, the values of the ratios Kp and Kb are given by Eq. (9) in place of those given by Eq. (8).
$$K_p(X(I), X(P_i)) = \left( \frac{X(I)}{X(P_i)} \right)^{\frac{1}{1+m}}; \quad K_b(X(I), X(B_i)) = \left( \frac{X(I)}{X(B_i)} \right)^{\frac{1}{1+m}} \tag{9}$$
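Eq. (9) can be sketched as a single helper that serves for both ratios. The name `dynamic_ratio` and the parameter `exponent`, which stands for the expression 1/(1+m), are hypothetical and are introduced here only for illustration.

```python
def dynamic_ratio(x_i, x_other, exponent):
    """Eq. (9): ratio Kp or Kb computed from frame complexities.

    x_i      -- complexity X(I) of the I picture
    x_other  -- complexity X(P_i) or X(B_i) of the P or B picture
    exponent -- value of 1 / (1 + m)
    """
    return (x_i / x_other) ** exponent
```

When the I picture is more complex than the P or B picture, the ratio exceeds 1, so that the P or B picture is quantized more coarsely than the I picture.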
In accordance with the above reference, the expression 1/(1+m) is set at a value in the range 0.6 to 1.0 to give a good picture quality. In this case, Eqs. (2) to (4) can be rewritten into the following equations.
$$T_i = \frac{X(I)}{X(I) + \sum_{i \in \Omega} \left( \frac{1}{K_p(X(I), X(P_i))} \cdot X(P_i) \right) + \sum_{i \in \Omega} \left( \frac{1}{K_b(X(I), X(B_i))} \cdot X(B_i) \right)} \times R \tag{10}$$

$$T_p = \frac{\frac{1}{K_p(X(I), X(P_i))} \cdot X(P_i)}{X(I) + \sum_{i \in \Omega} \left( \frac{1}{K_p(X(I), X(P_i))} \cdot X(P_i) \right) + \sum_{i \in \Omega} \left( \frac{1}{K_b(\cdots)} \cdots \right)}$$
                                                                       X                              ⁡                                                              (                                I                                )                                                                                      ,                                                          X                              ⁡                                                              (                                                                  B                                  i                                                                )                                                                                                              )                                                                                      ·                                          X                      ⁡                                              (                                                  B                          i                                                )                                                                              )                                                              ×          R                                    (        11        )                                          T          b                =                                                            1                                                      K                    p                                    ⁡                                      (                                                                  X                        ⁡                                                  (                          I                          )                                                                    ,                                              X                        ⁡                                                  (                                         
             B                            i                                                    )                                                                                      )                                                              ·                              X                ⁡                                  (                                      B                    i                                    )                                                                                    X                ⁡                                  (                  I                  )                                            +                                                ∑                                      i                    ⁢                                                                                  ⁢                    εΩ                                                  ⁢                                                                  ⁢                                  (                                                            1                                                                        K                          p                                                ⁡                                                  (                                                                                    X                              ⁡                                                              (                                I                                )                                                                                      ,                                                          X                              ⁡                                                              (                                                                  P                                  i                                                                )                                                                          
                                    )                                                                                      ·                                          X                      ⁡                                              (                                                  P                          i                                                )                                                                              )                                            +                                                ∑                                      i                    ⁢                                                                                  ⁢                    εΩ                                                  ⁢                                                                  ⁢                                  (                                                            1                                                                        K                          b                                                ⁡                                                  (                                                                                    X                              ⁡                                                              (                                I                                )                                                                                      ,                                                          X                              ⁡                                                              (                                                                  B                                  i                                                                )                                                                                                              )                                                                                      ·                                  
        X                      ⁡                                              (                                                  B                          i                                                )                                                                              )                                                              ×          R                                    (        12        )            
The following description explains a method for solving the second problem that it is impossible to compute activities because the luminance signal pixel values of the original picture are unknown.
The quantization scale Q of each macroblock in input picture compressed information is computed by using the luminance signal pixel values of the original picture in the encoding process. Since it is impossible to know the luminance signal pixel values of the original picture, the code quantity B and the quantization scale Q of each macroblock in the frame are extracted and stored in the information buffer 103 in the pre-parsing process carried out by the compressed information analysis unit 102 employed in the conventional picture information transformation apparatus shown in FIG. 11. At the same time, average values E(Q) of Q and E(B) of B or an average value E(QB) of their products are found in advance and stored in the information buffer 103.
The code quantity control unit 109 computes a normalized activity N_act in accordance with one of the following equations based on the values of the code quantity B and the quantization scale Q, which are stored in the information buffer 103.
$$\mathrm{N\_act} = \frac{2Q + E(Q)}{Q + 2E(Q)} \tag{13}$$

$$\mathrm{N\_act} = \frac{2QB + E(Q)\,E(B)}{QB + 2E(Q)\,E(B)} \tag{14}$$

$$\mathrm{N\_act} = \frac{2QB + E(QB)}{QB + 2E(QB)} \tag{15}$$
Eqs. (14) and (15) each represent equivalent processing. If the picture quality is evaluated in terms of the SNR, Eq. (13) provides a better picture quality. However, Eq. (14) or (15) gives a better subjective picture quality.
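As an illustration of Eqs. (13) to (15), the following Python sketch computes the normalized activity of a macroblock from values of the kind held in the information buffer 103. The function names and the sample numbers are hypothetical, and quantization scales are assumed to be positive so that the denominators never vanish.

```python
def n_act_eq13(q, mean_q):
    """Eq. (13): normalized activity from the quantization scale Q alone."""
    return (2.0 * q + mean_q) / (q + 2.0 * mean_q)


def n_act_eq14(q, b, mean_q, mean_b):
    """Eq. (14): compares the product Q*B with E(Q)*E(B)."""
    ref = mean_q * mean_b
    return (2.0 * q * b + ref) / (q * b + 2.0 * ref)


def n_act_eq15(q, b, mean_qb):
    """Eq. (15): compares the product Q*B with E(QB)."""
    return (2.0 * q * b + mean_qb) / (q * b + 2.0 * mean_qb)


# Hypothetical per-macroblock values obtained by pre-parsing one frame.
q_scales = [4, 8, 12]
code_bits = [100, 200, 300]

mean_q = sum(q_scales) / len(q_scales)      # E(Q)
mean_b = sum(code_bits) / len(code_bits)    # E(B)
mean_qb = sum(q * b for q, b in zip(q_scales, code_bits)) / len(q_scales)  # E(QB)

# Normalized activity of every macroblock according to Eq. (13).
activities = [n_act_eq13(q, mean_q) for q in q_scales]
```

For non-negative inputs each formula yields a value between 0.5 and 2, and yields exactly 1 for a macroblock whose value equals the frame average.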
By the way, assume that the quantization value (quantization scale) of a macroblock in input picture compressed information is Q1, and that the quantization value computed for the same macroblock in output picture compressed information by the code quantity control unit 109 in accordance with the above system is Q2. Even though the picture information transformation apparatus 100 shown in FIG. 12 is intended to reduce the code quantity, the relation Q1>Q2 may hold true, indicating that a macroblock that was once coarsely quantized is re-quantized more finely. However, the distortion caused by the coarse quantization process is not reduced by the finer re-quantization process. In addition, since more bits are allocated to this macroblock, the number of bits allocated to the other macroblocks must be reduced, causing the picture quality to deteriorate further. Thus, for Q1>Q2, control is executed to set Q2 equal to Q1.
By using Eq. (13), (14) or (15) given above, an activity can be computed.
As for stage 2 of the code quantity control executed by the code quantity control unit 109 employed in the picture information transformation apparatus 100 shown in FIG. 11, the same system as a system prescribed by MPEG2 Test Model 5 is adopted. The following description explains stage 2 prescribed in MPEG2 Test Model 5.
First of all, prior to a process of encoding a jth macroblock, the occupation sizes of the virtual buffer 212 for the I, P and B pictures are computed in accordance with Eqs. (16) to (18) respectively.
$$d_j^i = d_0^i + B_{j-1} - \frac{T_i \times (j-1)}{\mathrm{MB\_cnt}} \tag{16}$$

$$d_j^p = d_0^p + B_{j-1} - \frac{T_p \times (j-1)}{\mathrm{MB\_cnt}} \tag{17}$$

$$d_j^b = d_0^b + B_{j-1} - \frac{T_b \times (j-1)}{\mathrm{MB\_cnt}} \tag{18}$$

where notations T_i, T_p and T_b denote target code quantities for each frame of the I, P and B pictures respectively, notations d_0^i, d_0^p and d_0^b denote initial occupation sizes of the virtual buffer 212 for the I, P and B pictures respectively and notation MB_cnt denotes the number of macroblocks included in 1 frame. The target code quantities T_i, T_p and T_b for each frame of the I, P and B pictures are computed in accordance with Eqs. (2), (3) and (4) or Eqs. (10), (11) and (12) respectively. A virtual buffer occupation size d_{MB_cnt}^i at the end of the process to encode the frame of an I picture is used as an initial occupation size d_0^i of the virtual buffer for a next I picture. By the same token, a virtual buffer occupation size d_{MB_cnt}^p at the end of the process to encode the frame of a P picture is used as an initial occupation size d_0^p of the virtual buffer for a next P picture. In the same way, a virtual buffer occupation size d_{MB_cnt}^b at the end of the process to encode the frame of a B picture is used as an initial occupation size d_0^b of the virtual buffer for a next B picture.
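The update of Eqs. (16) to (18) can be sketched in Python as follows. The sketch assumes, as in MPEG2 Test Model 5, that B_{j-1} is the total number of bits generated for macroblocks 1 to j-1 of the current frame. The function name and the sample numbers are hypothetical.

```python
from itertools import accumulate


def occupancy_trace(d0, mb_bits, target_bits):
    """Virtual buffer occupation size d_j before encoding macroblock j.

    d0          -- initial occupation size for this picture type
    mb_bits     -- bits actually generated by each macroblock, in coding order
    target_bits -- target code quantity T of the frame
    Returns d_j for j = 1 .. MB_cnt, following Eqs. (16) to (18).
    """
    mb_cnt = len(mb_bits)
    # cumulative[j - 1] is B_{j-1}: bits generated by macroblocks 1 .. j-1
    # (assumption taken from Test Model 5).
    cumulative = [0] + list(accumulate(mb_bits))
    return [
        d0 + cumulative[j - 1] - target_bits * (j - 1) / mb_cnt
        for j in range(1, mb_cnt + 1)
    ]


# A frame that spends its bits uniformly keeps the occupation size constant.
uniform = occupancy_trace(1000, [100, 100, 100, 100], 400)

# A frame that spends everything on the first macroblock makes it swing.
bursty = occupancy_trace(0, [200, 0, 0, 0], 400)
```

The same function serves the I, P and B pictures; only the initial occupation size and the target code quantity differ.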
Then, a reference quantization scale Qj for a jth macroblock is computed in accordance with Eq. (19).
$$Q_j = \frac{d_j \times 31}{r} \tag{19}$$

where notation r denotes a so-called reaction parameter for controlling a response speed of a feedback loop and is expressed by Eq. (20).

$$r = 2 \times \frac{\mathrm{bit\_rate}}{\mathrm{picture\_rate}} \tag{20}$$
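A minimal Python sketch of Eqs. (19) and (20) is given below. The function names and the bit rate, picture rate and occupation size used in the example are hypothetical.

```python
def reaction_parameter(bit_rate, picture_rate):
    """Eq. (20): reaction parameter r of the feedback loop."""
    return 2.0 * bit_rate / picture_rate


def reference_qscale(d_j, r):
    """Eq. (19): reference quantization scale for macroblock j."""
    return d_j * 31.0 / r


# Hypothetical stream: 6 Mbps at 30 pictures per second.
r = reaction_parameter(6_000_000, 30)

# Hypothetical occupation size of the virtual buffer before macroblock j.
q_j = reference_qscale(100_000, r)
```

A larger r makes the reference quantization scale respond more slowly to a given change in the occupation size.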
It should be noted that, at the beginning of the sequence, the initial occupation sizes d0i, d0p and d0b of the virtual buffer 212 for the I, P and B pictures respectively have values expressed by Eq. (21).
$$d_0^i = \frac{10 \times r}{31}; \quad d_0^p = K_p \cdot d_0^i; \quad d_0^b = K_b \cdot d_0^i \tag{21}$$
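The initial values can be illustrated with the short Python sketch below. It assumes the Test Model 5 form of Eq. (21), in which the initial occupation sizes for the P and B pictures are K_p and K_b times the initial occupation size for the I picture. The function name is hypothetical, and the values K_p = 1.0 and K_b = 1.4 are the Test Model 5 constants, used here only as sample inputs.

```python
def initial_occupancies(r, k_p, k_b):
    """Eq. (21): initial virtual buffer occupation sizes for I, P and B."""
    d0_i = 10.0 * r / 31.0
    d0_p = k_p * d0_i
    d0_b = k_b * d0_i
    return d0_i, d0_p, d0_b


# Hypothetical stream: 4 Mbps at 30 pictures per second, so r follows Eq. (20).
r = 2.0 * 4_000_000 / 30.0

d0_i, d0_p, d0_b = initial_occupancies(r, k_p=1.0, k_b=1.4)

# Substituting d0_i into Eq. (19) gives the reference quantization scale
# of the first macroblock of the first I picture.
q_first = d0_i * 31.0 / r
```

With these initial values the first reference quantization scale comes out as 10 regardless of the bit rate and the picture rate.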
Detailed configurations of the information buffer 103 and the code quantity control unit 109, which are employed in the picture information transformation apparatus 100, are shown in FIG. 12.
As shown in the figure, the information buffer 103 includes a code quantity buffer (frame buffer) 201, an average quantization scale computation unit 202, a quantization scale buffer 203, a code quantity buffer (macroblock buffer) 204, a picture type buffer 205, a complexity buffer 206, an average activity computation unit 207 and an activity buffer 208.
On the other hand, the code quantity control unit 109 includes a ring buffer 209, a GOP structure determination unit 210, a target code quantity computation unit 211, a virtual buffer 212 and an adaptive quantization unit 213.
The picture type buffer 205 and the code quantity buffer (frame buffer) 201 are used for storing respectively the picture type of a frame included in the MPEG2 picture compressed information input to the picture information transformation apparatus 100 and the quantity of code allocated to the frame. On the other hand, the quantization scale buffer 203 and the code quantity buffer (macroblock buffer) 204 are used for storing respectively the quantization scale of each macroblock of a frame and the quantity of code allocated to each macroblock of the frame.
The average quantization scale computation unit 202 computes an average quantization scale of the quantization scales of macroblocks included in a frame. As described above, the quantization scales are stored in the quantization scale buffer 203. The complexity buffer 206 is used for storing the complexity of a frame. The complexity of a frame is calculated from the average quantization scale of the frame computed by the average quantization scale computation unit 202 and the quantity of code allocated to the frame in accordance with Eq. (7). As described earlier, the code quantity for the frame is stored in the code quantity buffer (frame buffer) 201.
The activity buffer 208 is used for storing an activity of each macroblock in a frame. An activity of each macroblock in a frame is computed from the quantization scale of the macroblock stored in the quantization scale buffer 203 and the quantity of code allocated to the macroblock. As described earlier, the code quantity for the macroblock is stored in the code quantity buffer (macroblock buffer) 204. The average activity computation unit 207 computes an average activity of a frame from the activities of macroblocks included in the frame. As described earlier, the activities are stored in the activity buffer 208.
Information on the picture type of each frame in a GOP is transferred from the picture type buffer 205 to the ring buffer 209 employed in the code quantity control unit 109. The GOP structure determination unit 210 determines the structure of the GOP in output MPEG2 picture compressed information from information on the picture types of the GOP frames, which is stored in the ring buffer 209.
The target code quantity computation unit 211 computes a target code quantity of each frame of the output MPEG2 picture compressed information in accordance with Eqs. (2) to (4) or Eqs. (10) to (12) from the GOP structure of the output MPEG2 picture compressed information and the complexity of the frame in the input MPEG2 picture compressed information. As described earlier, the GOP structure of the output MPEG2 picture compressed information is determined by the GOP structure determination unit 210, and the complexity of the frame in the input MPEG2 picture compressed information is stored in the complexity buffer 206. The occupation sizes of the virtual buffer 212 are updated in accordance with Eqs. (16) to (18) on the basis of the computed target code quantities.
The adaptive quantization unit 213 computes a quantization scale of a macroblock by using a reference quantization scale Qj of the macroblock and a normalized activity N_act computed in accordance with Eq. (13), (14) or (15). The reference quantization scale Qj of the macroblock is computed in accordance with Eq. (19) by using the occupation size of the virtual buffer 212. In the computation of the normalized activity N_act, the average activity for the frame and activities of macroblocks in the frame are used. The average activity for the frame is held by the average activity computation unit 207 and activities of macroblocks in the frame are stored in the activity buffer 208.
Feedback information obtained from the encoding process of the output MPEG2 picture compressed information is supplied to the target code quantity computation unit 211 and the virtual buffer 212.
FIG. 13 shows a flowchart representing processing carried out by the code quantity control unit 109. As shown in the figure, the flow begins with a step S100 at which a pseudo GOP is determined by pre-parsing as described above. Then, at the next step S101, a target code quantity of each frame is computed by using Eqs. (2) to (4). Subsequently, at the next step S102, code quantity control using the virtual buffer 212 is executed. The execution of the control of the code quantities corresponds to stage 2 of MPEG2 Test Model 5. Then, the flow of the processing goes on to a step S103 to carry out an adaptive quantization process based on an activity computed in a DCT domain by using Eq. (13), (14) or (15). Subsequently, at the next step S104, Q1 is compared with Q2. Q1 is a quantization value (quantization scale) in the input picture compressed information while Q2 is a quantization value in the output picture compressed information. If Q1 is found greater than Q2, Q1 is output. Otherwise, Q2 is output. By controlling the code quantity in this way, a good picture quality can be obtained.
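Steps S102 to S104 of the flowchart can be sketched for a single macroblock in Python as follows. The sketch assumes, as in Test Model 5, that the adaptive quantization multiplies the reference quantization scale by the normalized activity and clips the result to the range 1 to 31; the text above does not spell out these two details. The function name and the sample values are hypothetical, and rounding to an integer quantization scale is omitted.

```python
def output_qscale(d_j, r, n_act, q1):
    """Quantization scale output for one macroblock (steps S102 to S104).

    d_j   -- occupation size of the virtual buffer before the macroblock
    r     -- reaction parameter of Eq. (20)
    n_act -- normalized activity from Eq. (13), (14) or (15)
    q1    -- quantization scale of the macroblock in the input stream
    """
    # S102: reference quantization scale from the virtual buffer, Eq. (19).
    q_ref = d_j * 31.0 / r
    # S103: adaptive quantization (assumption: Test Model 5 product rule).
    q2 = q_ref * n_act
    # Assumption: clip to the range 1..31 as Test Model 5 does.
    q2 = min(max(q2, 1.0), 31.0)
    # S104: never re-quantize more finely than the input stream did.
    return max(q1, q2)


# Hypothetical values: r for 6 Mbps at 30 pictures per second.
r = 2.0 * 6_000_000 / 30.0
coarser = output_qscale(100_000, r, n_act=1.2, q1=8)   # Q2 exceeds Q1
held = output_qscale(100_000, r, n_act=1.2, q1=12)     # Q1 exceeds Q2
```

In the first call the computed Q2 is larger than Q1 and is used as it is; in the second call Q1 is larger, so Q1 is output instead.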
By the way, in order to make the adaptive quantization for each macroblock effective, it is desirable to keep the reference quantization scale Qj for a frame at fairly uniform values throughout the screen. Eqs. (16) to (18) are each used to compute the occupation size of the virtual buffer 212 by assuming that code (or bits) is allocated uniformly to the macroblocks included in each frame. Since the picture actually varies from frame to frame, however, the reference quantization scale Qj also varies over the screen, causing block distortion.
In addition, Eq. (21) is equivalent to an equation setting the reference quantization scale of macroblocks included in the first I picture at 10. Depending on the picture and the code quantity, however, the value of 10 may not necessarily be appropriate, causing the picture quality to deteriorate in some cases.