1. Field of the Invention
The present invention relates to an image coding apparatus capable of controlling rates such that a coding amount distribution to a specific component within a frame is restricted to be equal to or larger than a designated lower limit value, or restricted to be equal to or smaller than a designated upper limit value.
2. Description of the Related Art
At present, the still image coding algorithm “JPEG” is widely used, including on the Internet. At the same time, there are various demands for further improvements in performance and for additional functions in next-generation coding systems. Under these circumstances, the JPEG2000 project was started in 1997 by a joint organization of ISO and ITU. In December 2000, the major technical content of Part 1 was finalized. Part 1 defines the basic system of the JPEG2000 algorithm. The basic system of the JPEG2000 coding algorithm will now be summarized in accordance with the recommendation (ISO/IEC 15444-1:2000).
First, an input image signal is subjected to two-dimensional wavelet transformation by a wavelet transforming unit so as to be band-split into a plurality of sub-bands. The two-dimensional wavelet transformation is realized by combining one-dimensional wavelet transformations. In other words, it may be realized by combining a process operation in which one-dimensional wavelet transformation in the vertical direction is sequentially performed for every column with another process operation in which one-dimensional wavelet transformation in the horizontal direction is sequentially performed for every line. Each one-dimensional wavelet transformation is constituted by a low-pass filter having a predetermined characteristic, a high-pass filter having a predetermined characteristic, and a down sampler.
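The separable structure described above may be illustrated by the following minimal sketch. It assumes a Haar filter pair purely for brevity, which is not one of the filters defined in Part 1, and the function names `haar_1d` and `haar_2d` are hypothetical. Only the order of the vertical pass, the horizontal pass, and the 2:1 down sampling follows the description.

```python
def haar_1d(signal):
    """One level of 1-D analysis: low-pass and high-pass outputs, each
    down-sampled by 2. Assumes an even-length input (illustrative only)."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def haar_2d(image):
    """One level of 2-D analysis: vertical pass on every column, then
    horizontal pass on every line. Returns the four sub-bands."""
    # Vertical pass: filter each column.
    v_low_cols, v_high_cols = [], []
    for column in transpose(image):
        low, high = haar_1d(column)
        v_low_cols.append(low)
        v_high_cols.append(high)
    v_low = transpose(v_low_cols)    # vertically low-passed rows
    v_high = transpose(v_high_cols)  # vertically high-passed rows

    # Horizontal pass: filter each line of both intermediate images.
    def horizontal(rows):
        lows, highs = [], []
        for row in rows:
            low, high = haar_1d(row)
            lows.append(low)
            highs.append(high)
        return lows, highs

    # First character = horizontal band, second character = vertical band.
    ll, hl = horizontal(v_low)
    lh, hh = horizontal(v_high)
    return {"LL": ll, "HL": hl, "LH": lh, "HH": hh}
```

With this sketch, an image that varies only along the horizontal direction produces non-zero coefficients in HL and zeros in LH and HH.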
The two-dimensional wavelet transforming coefficients produced in the above-explained manner are expressed as LL, HL, LH, and HH, where a low band component is represented as “L”, a high band component is represented as “H”, the first character expresses transformation along the horizontal direction, and the second character expresses transformation along the vertical (sub-scanning) direction. These band-split components are called sub-bands. The wavelet transforming operation is carried out recursively on the LL component, that is, the component that is low band along both the horizontal direction and the vertical direction. The number of times the wavelet transforming operation has been executed recursively is referred to as the “resolution level”, and the resolution level is written in front of the two-dimensional wavelet transforming coefficients LL, HL, LH, and HH. That is to say, when the wavelet transformation is performed twice, the resolution level of the minimum resolution component becomes 2, whereas the resolution levels of the maximum resolution components HL, LH, and HH become 1.
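The naming convention described above may be illustrated by the following short sketch, which lists the sub-band labels obtained when the LL component is decomposed recursively. The function name `subband_labels` and the ordering of the returned list are assumptions made for illustration.

```python
def subband_labels(levels):
    """Labels of all sub-bands after `levels` recursive decompositions of LL.

    The resolution level is written in front of the band name; the single
    remaining LL band carries the highest level number.
    """
    labels = [f"{levels}LL"]
    for level in range(levels, 0, -1):
        labels += [f"{level}HL", f"{level}LH", f"{level}HH"]
    return labels
```

For two decompositions, the sketch yields 2LL, 2HL, 2LH, 2HH, 1HL, 1LH, and 1HH, which agrees with the example in which the minimum resolution component has level 2 and the maximum resolution components have level 1.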
Next, the wavelet transforming coefficients in the respective sub-bands are quantized based on a quantizing step size which has been set for each of the sub-bands.
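A per-sub-band quantization of this kind might be sketched as follows. The rounding rule used here (sign multiplied by the floor of the magnitude divided by the step size) is an assumption chosen for illustration, and the function name `quantize_subbands` and the dictionary layout are hypothetical.

```python
import math


def quantize_subbands(subbands, step_sizes):
    """Quantize every coefficient with the step size of its own sub-band.

    subbands:   {"1HL": [[coefficient, ...], ...], ...}
    step_sizes: {"1HL": step, ...}  (one positive step size per sub-band)
    """
    quantized = {}
    for name, rows in subbands.items():
        step = step_sizes[name]
        quantized[name] = [
            # Assumed rule: keep the sign, truncate the magnitude toward zero.
            [int(math.copysign(math.floor(abs(c) / step), c)) for c in row]
            for row in rows
        ]
    return quantized
```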
Next, the quantized wavelet transforming coefficients of each of the sub-bands are split into regions having a fixed size, which are called “code blocks”. Thereafter, the code blocks made of multi-value data are converted into binary bit plane representations, and each of the bit planes is split into three sorts of coding passes (namely, Significance Propagation Pass, Magnitude Refinement Pass, and Cleanup Pass).
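The conversion of a code block into bit planes may be illustrated by the following sketch. It covers only the bit plane representation and does not perform the split into the three coding passes. The function name `bit_planes` and the separate sign map are assumptions made for illustration.

```python
def bit_planes(code_block):
    """Split the magnitudes of a code block into binary bit planes.

    code_block: rows of quantized integer coefficients.
    Returns (planes, signs); planes are ordered from the most significant
    bit plane to the least significant one, and signs holds 1 for negative
    coefficients and 0 otherwise.
    """
    magnitudes = [[abs(v) for v in row] for row in code_block]
    largest = max(max(row) for row in magnitudes)
    plane_count = max(largest.bit_length(), 1)
    planes = [
        [[(m >> bit) & 1 for m in row] for row in magnitudes]
        for bit in range(plane_count - 1, -1, -1)
    ]
    signs = [[1 if v < 0 else 0 for v in row] for row in code_block]
    return planes, signs
```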
With respect to binary signals which are outputted from the three coding passes, a context modeling operation is carried out and an entropy coding operation is performed for each of these coding passes.
Also, a code amount and distortion information are calculated for each of the coding passes in each of the code blocks in conjunction with the entropy coding process operation.
Finally, a rate control operation is executed by which the code amount is adjusted to be equal to or smaller than a target code amount, while deterioration (distortion) in image quality is minimized by employing the method of Lagrange multipliers.
Since a method for realizing the rate control operation is not standardized, an arbitrary method may be employed in accordance with the application. A brief explanation is given below of the mechanism of the rate control which is described in J.14.3 of the recommendation (ISO/IEC 15444-1:2000) as reference information.
In this rate control method, the cut down point (truncation point) in each code block “i” is denoted “ni”, the code amount up to each cut down point is denoted “R(i, ni)”, and the distortion up to each cut down point is denoted “D(i, ni)”. Using the method of Lagrange multipliers, a variable “λ” is adjusted until the total code amount “R” within the entire screen falls within the range of a target code amount “Rmax”, where the total code amount “R” is produced by the cut down points that maximize the below-mentioned formula (1):

$$\sum_{i} \bigl( R(i, n_i) - \lambda D(i, n_i) \bigr) \qquad (1)$$
Here, the distortion expresses the degree to which the mean square error of the reproduced image is decreased when the code data up to a certain coding pass are transmitted, as compared with the mean square error of the reproduced image when the code data up to that coding pass are not transmitted. Strictly speaking, the distortion is therefore the amount by which the distortion is decreased. As a consequence, the distortion “D” is 0 before the coding operation is performed, and when the code data are coded up to the final bit plane, the distortion “D” becomes equal to the mean square error.
Finding the cut down point at which the above-explained formula (1) becomes maximum is equivalent to the following: when the code amount “R” and the distortion “D” of each of the code blocks are represented as an RD curve in a graph, the cut down point at which the slope of the tangential line becomes “λ⁻¹” is found. For example, when in two code blocks “c1” and “c2” the cut down points at which the slope of the tangential line becomes “λ⁻¹” are “nc1” and “nc2”, and the code amounts up to these cut down points are “R(c1, nc1)” and “R(c2, nc2)”, these code amounts are summed over all of the code blocks, and the resulting total “R” is compared with “Rmax”. Viewed for each code block, the cut down point “ni” that maximizes (R(i, ni)−λD(i, ni)) must be found. This may be done, for example, as follows:
Set ni = 0
For k = 1, 2, 3, ...
  Set ΔR(i, k) = R(i, k) − R(i, ni)
  and ΔD(i, k) = D(i, k) − D(i, ni)
  If (ΔD(i, k)/ΔR(i, k)) > λ⁻¹ then set ni = k
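The per-block search above might be sketched in Python as follows. The function name `select_cut_point` and the list layout (index 0 standing for the state in which nothing has been coded) are assumptions made for illustration. The slope test is written in multiplied form, which is equivalent to the division for positive ΔR and positive “λ” and avoids dividing by zero.

```python
def select_cut_point(rates, distortions, lam):
    """Return the cut down point n_i of one code block for a given lambda.

    rates[k], distortions[k]: cumulative R(i, k) and D(i, k), where D is the
    amount by which the distortion is decreased; index 0 must be (0, 0).
    """
    n = 0
    for k in range(1, len(rates)):
        delta_r = rates[k] - rates[n]
        delta_d = distortions[k] - distortions[n]
        # Same test as (delta_d / delta_r) > 1 / lam for delta_r > 0, lam > 0.
        if lam * delta_d > delta_r:
            n = k
    return n
```

With cumulative code amounts 0, 10, 20, 30 and cumulative distortion decreases 0, 100, 150, 160, the sketch returns cut down point 2 for λ = 0.25, cut down point 0 for λ = 0.05, and cut down point 3 for λ = 2.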
However, with this algorithm, the cut down point “ni” must be recomputed by carrying out the above-explained process operation for each of a large number of values of the variable “λ”. To avoid this, the slope “S(i, k)=ΔD(i, k)/ΔR(i, k)” is corrected in advance so that it decreases monotonically with respect to “k”. Concretely speaking, the below-mentioned process operation is carried out:
(1) Set Ni = {n} (i.e., the set of all truncation points)
(2) Set p = 0
(3) For k = 1, 2, 3, ..., kmax
  If k belongs to Ni
    Set ΔR(i, k) = R(i, k) − R(i, p),
    and ΔD(i, k) = D(i, k) − D(i, p)
    Set S(i, k) = ΔD(i, k)/ΔR(i, k)
    If p≠0 and S(i, k)>S(i, p),
      then remove p from Ni, and go to step (2)
    Otherwise, set p = k
After this process operation, the cut down point that is optimum for a given variable “λ” is given by the maximum “k” in Ni that satisfies S(i, k)>λ⁻¹.
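The slope correction and the subsequent selection rule might be sketched as follows. The function names `monotone_slopes` and `cut_point_from_slopes` are hypothetical, and the sketch assumes that the cumulative code amounts are strictly increasing so that ΔR is never zero.

```python
def monotone_slopes(rates, distortions):
    """Remove truncation points until the slopes S(i, k) no longer increase.

    rates[k], distortions[k]: cumulative R(i, k) and D(i, k); index 0 = (0, 0).
    Returns (kept, slopes): the surviving points N_i in ascending order and
    the slope of each surviving point measured from the previous survivor.
    """
    candidates = set(range(1, len(rates)))        # step (1)
    while True:
        p = 0                                     # step (2)
        slopes = {}
        restarted = False
        for k in range(1, len(rates)):            # step (3)
            if k not in candidates:
                continue
            delta_r = rates[k] - rates[p]
            delta_d = distortions[k] - distortions[p]
            slopes[k] = delta_d / delta_r
            if p != 0 and slopes[k] > slopes[p]:
                candidates.discard(p)             # remove p, go to step (2)
                restarted = True
                break
            p = k
        if not restarted:
            return sorted(candidates), slopes


def cut_point_from_slopes(kept, slopes, lam):
    """Largest surviving k whose slope exceeds 1 / lam (0 if none does)."""
    best = 0
    for k in kept:
        if slopes[k] > 1.0 / lam:
            best = k
    return best
```

Because the surviving slopes do not increase with “k”, the selection for a new value of “λ” only compares stored slopes with λ⁻¹ and does not need to revisit the code amounts and distortions.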
After the above-explained information has been derived for all of the code blocks, coding data which meets the target code amount “Rmax” is formed. Concretely speaking, the variable “λ” is found which gives the maximum total code amount “Rsum” satisfying “Rsum ≤ Rmax”. Here, the total code amount for a certain variable “λ” becomes known only after the cut down points in the respective code blocks have been individually determined and the code data up to these cut down points have been totaled. As a consequence, in order to find the variable “λ” giving the maximum “Rsum” that satisfies Rsum ≤ Rmax, total code amounts are normally calculated for plural candidate values of the variable “λ”, and the variable “λ” giving a total code amount close to the desired value is then obtained by a convergence calculation. When the variable “λ” is obtained, the code data up to the cut down point corresponding to this variable “λ” is collected from all of the code blocks, the number of coding passes in each of the code blocks is added thereto as additional information, and the final code data is constructed. In this manner, code data which minimizes the distortion can be produced for the target code amount “Rmax”.
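The convergence calculation is not specified in detail. One possible sketch, assuming a bisection search on “λ”, is given below. The function names, the search interval, and the iteration count are assumptions made for illustration. Each code block is represented by its surviving truncation points as pairs of cumulative code amount and slope, with slopes that decrease along the list, so that the total code amount does not decrease as “λ” grows.

```python
import math


def total_rate(blocks, lam):
    """Sum, over all code blocks, of the code amount up to the cut down point."""
    total = 0
    for points in blocks:
        rate = 0
        for cumulative_rate, slope in points:
            if slope > 1.0 / lam:
                rate = cumulative_rate
        total += rate
    return total


def find_lambda(blocks, r_max, lo=1e-6, hi=1e6, iterations=60):
    """Bisection (on a geometric scale) for a lambda with Rsum <= r_max.

    Returns (lam, r_sum). The interval [lo, hi] is an assumed search range.
    """
    if total_rate(blocks, lo) > r_max:
        raise ValueError("target code amount cannot be met in the search range")
    if total_rate(blocks, hi) <= r_max:
        return hi, total_rate(blocks, hi)
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if total_rate(blocks, mid) <= r_max:
            lo = mid   # still within the target: try a larger lambda
        else:
            hi = mid   # target exceeded: try a smaller lambda
    return lo, total_rate(blocks, lo)
```

For two code blocks with points (10, 5.0), (20, 2.0), (30, 0.5) and (8, 4.0), (16, 1.0) and a target code amount of 40, the achievable totals are 0, 10, 18, 28, 36, and 46, and the sketch settles on a total of 36.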
The above-described international standard specification of JPEG2000 can be obtained from standardizing organizations such as ISO and ITU-T. The latest information on JPEG2000 may also be obtained at the Internet address http://www.jpeg.org.
As an image coding apparatus of this sort, there is an image coding apparatus which can uniformly distribute code amounts to the respective blocks even in a case where the total code amount exceeds a constant amount (refer to, for example, JP 09-252477 A).
Another image coding apparatus is capable of appropriately distributing a code amount to each frame and of resolving local shortages of the distributed code amount, by taking into account an additional code amount distribution which does not depend upon the quantizing width (refer to, for instance, JP 10-108179 A).
Also, another image coding apparatus determines a code amount distribution by utilizing a frequency characteristic (refer to, for example, JP 2003-230162 A).
Another image coding apparatus calculates target information amounts of respective coding methods and distributes the calculated target information amounts so that information amounts produced by coding an I picture, a P picture, and a B picture may become optimum in view of image qualities in response to a visual characteristic of an image (refer to, for instance, JP 2005-303362 A).
Furthermore, another image coding apparatus is capable of performing a high coding process operation by effectively executing a subtraction process operation for calculating a slope of a code amount distribution (refer to, for example, JP 2005-109917 A).
In the above-described rate control method, the codes are allocated in such a manner that the coding distortion becomes minimum over the entire screen under a certain total code amount. Depending upon the image, a major portion of the codes is therefore concentrated on a specific component, and there is a possibility that a certain sort of decoder malfunctions. Conversely, depending upon the image, the code amount of a specific component becomes very small, and there is a possibility that the subjective image quality is partially deteriorated, even though the result is optimum in the sense that the coding distortion is minimized over the entire screen.
Also, in a case where the code amount of a specific component is to be restricted in the above-explained method, if the code amount of the specific component falls outside the range from the predetermined minimum value to the predetermined maximum value after the code amount distribution has been determined by executing the convergence calculation, then the code amount distribution must be determined again by executing the convergence calculation for the components other than the specific component. In a case where there are plural such specific components, there is a problem in that the number of times the code amount distribution is re-calculated increases further, and the calculation amount increases accordingly.