The present invention relates to a fuzzy-controlled coding method and apparatus therefor, and more particularly, to a coding method and apparatus therefor for controlling a quantization step size which is determined by a state of a buffer for storing coded image data and a visual characteristic of the human eye, using a fuzzy control method.
Known methods are currently widely used in audio and/or video transceiving systems for digitally encoding an image video signal and/or audio signal into a digital signal, transmitting or recording the result, and decoding the received signal or the reproduced signal. In general, such methods include a transform coding method, a differential pulse code modulation method, a vector quantization method, and a variable length coding method. The kinds of methods commonly used operate to reduce the total data amount by removing redundant data from a digital image signal.
To perform the above-described coding methods, a screen is divided into predetermined-sized blocks. Then, transform operations are performed on each block or a difference signal between each block is generated, and the image data is converted into a transform coefficient in the frequency domain. Data transform methods for respective blocks include a discrete cosine transform (DCT) which is the most widely used, a Walsh-Hadamard transform (WHT) operation, a discrete Fourier transform (DFT) and a discrete sine transform (DST) operation.
When signal strength is concentrated into the low frequency domain as a result of the above-described transform methods, the transform coefficients are changed into representative values through a quantization process. Then, the representative values are variable-length-coded, considering the statistical characteristics of the representative values, to thereby compress the data.
However, the human eye is more sensitive to lower frequencies than to higher frequencies. Therefore, the amount of transferred data can be further decreased by making the compression rate of image data in the high frequency domain relatively larger than that in the low frequency domain.
FIG. 1 is a schematic block diagram showing a conventional coding apparatus for a motion picture image.
Referring to FIG. 1, block image data of N×N size (reference block data) is applied to an input terminal 10. (Generally, an N1×N2 sized block is applied. However, it is assumed here that N1=N2.) Predetermined feedback data is applied to the "-" input terminal of a first adder A1 to be subtracted from the block image data which is applied to the "+" input terminal of first adder A1. The result of the subtraction (the difference data) is applied to a discrete cosine transformer 11. In discrete cosine transformer 11, the difference data is converted into transform coefficients, namely, frequency domain data. The signal strength of the transform coefficients will be concentrated into the lower frequencies.
The output of DCT 11 is applied to a quantizer 12, where the transform coefficients are changed into representative values of regular level through a predetermined quantization process. That is, a quantization step size (QS) parameter which is output from a quantization controller 30 is provided to quantizer 12 to control the quantization of the transform coefficients depending on the data storing state of a buffer 14.
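The transform and quantization stages described above can be illustrated with a minimal sketch. It assumes an orthonormal two-dimensional DCT-II and a uniform scalar quantizer that divides each coefficient by the step size QS and rounds to the nearest integer; the function names `dct_2d` and `quantize` are hypothetical, and the actual quantizer 12 may use a different rounding rule.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square N x N block (list of lists)."""
    n = len(block)

    def alpha(k):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def quantize(coeffs, qs):
    """Uniform scalar quantization (assumed rule): nearest multiple of qs."""
    return [[int(round(c / qs)) for c in row] for row in coeffs]


# A flat 4 x 4 block: all signal strength lands in the DC coefficient.
flat = [[10] * 4 for _ in range(4)]
levels = quantize(dct_2d(flat), qs=8)
```

A larger QS maps more coefficients to zero, which lowers the bit rate at the cost of a coarser reconstruction.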
In quantization controller 30, the quantization step size (QS) is controlled so that an overflow or underflow of data will not occur in buffer 14, to thereby control the bit rate. The quantized data from quantizer 12 is applied to a variable length coder 13, where the quantized data is compressed by performing a variable length coding operation on the quantized transform coefficients, considering the statistical characteristics of the quantized transform coefficients.
Typically, a variable length coder such as coder 13 uses Huffman coding or arithmetic coding. In the Huffman coding method, considering the probability distribution of a symbol of the input signal, a symbol having a high probability, i.e., a symbol whose generation rate is high, is converted into a short code, while a symbol having a low probability is converted into a long code. However, when there are many kinds of symbols and most of the symbols have extremely low probabilities, long code words are given to each low probability symbol by a Huffman coding algorithm. As a result, both the encoding and decoding become very complicated.
To overcome this problem, as shown in FIG. 2, the low probability symbols are grouped into one so as to be treated as a simple fixed length code like an escape sequence. Thus, the average code length increases a little more than the original Huffman code. As a result, efficiency is lowered but the complexity decreases greatly.
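The grouping of low probability symbols can be sketched as follows. This is an illustrative assumption rather than the code table of FIG. 2: symbols whose probability falls below a chosen threshold are merged into a single hypothetical `"ESC"` symbol, a Huffman code is built over the remaining symbols, and each escaped symbol costs the escape code length plus a fixed number of bits. Only code lengths are computed.

```python
import heapq
from itertools import count


def huffman_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over probs."""
    if len(probs) == 1:
        return {next(iter(probs)): 1}
    tie = count()  # unique tie-breaker so the symbol lists are never compared
    heap = [(p, next(tie), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to the merged symbols
        heapq.heappush(heap, (p1 + p2, next(tie), s1 + s2))
    return lengths


def lengths_with_escape(probs, threshold, fixed_bits):
    """Group symbols with probability below threshold behind one escape code."""
    common = {s: p for s, p in probs.items() if p >= threshold}
    rare = [s for s in probs if s not in common]
    if rare:
        common["ESC"] = sum(probs[s] for s in rare)  # hypothetical symbol name
    lengths = huffman_lengths(common)
    esc_len = lengths.pop("ESC", 0)
    for s in rare:
        lengths[s] = esc_len + fixed_bits  # escape code + fixed-length field
    return lengths


probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.0625, "e": 0.0625}
lengths = lengths_with_escape(probs, threshold=0.1, fixed_bits=2)
```

In this example the rare symbols `d` and `e` each cost 5 bits instead of the 4 bits a plain Huffman code would assign, which illustrates the small loss of efficiency traded for a much smaller code table.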
In an image signal coding process, most of the quantized coefficients are 0 after the transform coefficients are quantized. Run length coding is therefore performed by scanning according to a zigzag scan, starting from the low frequency component and proceeding up to the highest frequency component. The run length code can be represented as (RUN, LEVEL). Here, RUN is the number of 0's between the non-zero coefficients, and the range of the non-zero LEVEL depends on the number of values that can come out of the quantization operation. For example, when the output of the quantization ranges over the integers -255 to +255, LEVEL takes values of 1 to 255, and a negative level is expressed using a sign bit.
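The zigzag scan and the (RUN, LEVEL) representation can be sketched as below. The sketch assumes a square block of quantized coefficients, treats the DC coefficient like any other coefficient, and returns each symbol as a hypothetical triple (RUN, LEVEL, sign bit); trailing zeros are simply dropped, as a practical system would signal them with an end-of-block code.

```python
def zigzag_order(n):
    """(row, col) positions of an n x n block in zigzag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Anti-diagonals are visited in turn; odd ones run downward
        # (row ascending), even ones run upward (column ascending).
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def run_length_encode(block):
    """Return a list of (RUN, LEVEL, sign) symbols for the non-zero values."""
    symbols, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1  # count zeros preceding the next non-zero coefficient
        else:
            symbols.append((run, abs(value), 1 if value < 0 else 0))
            run = 0
    return symbols


quantized = [
    [5, -2, 0, 0],
    [0,  0, 0, 0],
    [3,  0, 0, 0],
    [0,  0, 0, 0],
]
symbols = run_length_encode(quantized)
# symbols == [(0, 5, 0), (0, 2, 1), (1, 3, 0)]
```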
When (RUN, LEVEL) is regarded as a single symbol, if RUN or LEVEL is high, an escape sequence consists of 6 bits (for example) of escape code, 6 bits for expressing RUN (0-63), 8 bits for expressing LEVEL (1-255) and 1 bit for expressing sign, thus 21 bits (for example) in total, in an escape area having a statistically low generation rate of the symbol. Other methods than the escape sequence can be also used depending upon the employed system.
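The 21-bit layout in this example can be sketched as a bit-packing routine. The 6-bit escape pattern `000001` and the field order (escape code, RUN, LEVEL, sign) are assumptions made for illustration; the text above fixes only the field widths.

```python
ESCAPE_CODE = 0b000001  # hypothetical 6-bit escape pattern


def pack_escape(run, level, negative):
    """Pack (RUN, LEVEL, sign) into a 21-bit escape sequence string."""
    if not 0 <= run <= 63:
        raise ValueError("RUN must fit in 6 bits (0-63)")
    if not 1 <= level <= 255:
        raise ValueError("LEVEL must be in 1-255")
    word = ESCAPE_CODE
    word = (word << 6) | run            # 6 bits for RUN
    word = (word << 8) | level          # 8 bits for LEVEL
    word = (word << 1) | int(negative)  # 1 sign bit
    return format(word, "021b")         # 6 + 6 + 8 + 1 = 21 bits


bits = pack_escape(run=3, level=5, negative=False)
# bits == "000001" + "000011" + "00000101" + "0"
```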
In buffer 14, the compressed data output irregularly from variable length coder 13 is input and stored. A data storing state, i.e., a buffer fullness (BF) signal, is output from buffer 14 in order to control the amount of data input and to prevent an overflow or an underflow in buffer 14. Data from the buffer is output at a regular speed to the transfer channel line.
Meanwhile, to compress the image data further, the image motion occurring between successive image frames is detected, in addition to using the relation of the spatial domain of an image within a single frame. The amount of data to be transferred or to be recorded is reduced according to the detected degree of motion. Generally, since there are a lot of similar points between successive images, the degree of motion is estimated by employing reference block or macro block units, to thereby calculate a motion vector (MV). When the image data is compensated using the motion vector, the transfer data can be further compressed since the difference signal derived between the successive images is extremely small.
During processing of an image frame which is performed by coding using only the relation of the spatial domain of an image in a frame, i.e., when an intraframe is processed, data of an intraframe is stored into a frame memory 17 of a local decoder 1 so as to be employed for detecting the relative motion of the next frame.
Local decoder 1 is explained in more detail in the following paragraphs.
In an inverse quantizer 15, data of the intraframe output from quantizer 12 is inversely quantized. Then, the result is inversely converted in an inverse discrete cosine transformer 16 and is converted into image data in the spatial domain.
In a motion predictor 18, data of the current frame which is processed as an interframe, i.e., a frame which uses the relation of adjacent frames, and data of the previous frame stored in frame memory 17, are input so as to detect the motion between the two frames.
A block matching method performed by a full search is used for detecting the motion in motion predictor 18. In the block matching method, a block which is most similar to a reference block is searched for within a limited search area, centered on the block of the previous frame located at the same position as the given N×N block (reference block) of the current frame. At this time, various criteria may be chosen for determining the degree of similarity. However, an estimated block is generally searched for by determining a mean absolute error (MAE).
Many candidate blocks within a search window are compared with the reference block, and the block having the most similarity is selected as the estimated block. A motion vector is the distance in coordinates that the estimated block moves relative to the reference block.
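A full-search block matching procedure using the MAE criterion can be sketched as follows. Frames are assumed to be lists of rows of pixel values, the search area is assumed to be a square of ±`search_range` pixels, and the function names are hypothetical. The motion vector is returned as (dy, dx), the displacement from the reference block position to the estimated block in the previous frame.

```python
def mean_abs_error(cur, prev, cy, cx, py, px, n):
    """MAE between the n x n block of cur at (cy, cx) and of prev at (py, px)."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += abs(cur[cy + i][cx + j] - prev[py + i][px + j])
    return total / (n * n)


def full_search(cur, prev, top, left, n, search_range):
    """Try every candidate in the search window; return (motion vector, MAE)."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            py, px = top + dy, left + dx
            # Skip candidates that fall outside the previous frame.
            if py < 0 or px < 0 or py + n > height or px + n > width:
                continue
            err = mean_abs_error(cur, prev, top, left, py, px, n)
            if best is None or err < best[0]:
                best = (err, (dy, dx))
    return best[1], best[0]


# A bright 2 x 2 patch sits at (3, 4) in the previous frame and at (2, 2) now.
prev = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for i in range(2):
    for j in range(2):
        prev[3 + i][4 + j] = 100
        cur[2 + i][2 + j] = 100
mv, err = full_search(cur, prev, top=2, left=2, n=2, search_range=3)
# mv == (1, 2), err == 0.0
```

The full search is exhaustive: its cost grows with the square of the search range, which is why the search area is kept limited.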
In a motion compensator 19, the block (an estimated block) which is most similar to the current reference block is extracted from frame memory 17 according to the detected motion vector, and is output as a feedback signal applied to the "-" input of first adder (A1). In first adder (A1), estimated error data is derived, i.e., the difference between the block data of the current frame and the feedback signal (the estimated block data displaced by the motion vector within the search window). The estimated error data is discrete cosine transformed in discrete cosine transformer 11, coded, and transferred to the receiver.
Here, considering the structure of a frame, a result of motion compensator 19 is re-stored into frame memory 17 via a second adder (A2) so that interframe processing can be repeated.
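The roles of the two adders can be sketched as follows, under the assumption that first adder (A1) forms the difference between the current block and the estimated block, and that second adder (A2) adds the locally decoded difference back to the estimated block before the result is stored in frame memory 17. The function names are hypothetical, and quantization loss is ignored in the example.

```python
def predict_block(prev_frame, top, left, mv, n):
    """Extract the estimated n x n block displaced by mv = (dy, dx)."""
    dy, dx = mv
    return [row[left + dx:left + dx + n]
            for row in prev_frame[top + dy:top + dy + n]]


def residual(cur_block, pred_block):
    """First adder (A1): current block minus the estimated block."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(cur_block, pred_block)]


def reconstruct(pred_block, decoded_residual):
    """Second adder (A2): estimated block plus the decoded difference."""
    return [[p + d for p, d in zip(pred_row, res_row)]
            for pred_row, res_row in zip(pred_block, decoded_residual)]


prev_frame = [[r * 4 + c for c in range(4)] for r in range(4)]
cur_block = [[6, 6], [9, 12]]
pred = predict_block(prev_frame, top=0, left=0, mv=(1, 1), n=2)
diff = residual(cur_block, pred)   # small values when the estimate is good
stored = reconstruct(pred, diff)   # equals cur_block when nothing is lost
```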
A frame unit or a block unit refresh is needed because errors accumulate in the output of motion compensator 19.
The motion vector (MV), produced by the motion predictor 18, is also variable-length-coded in a second variable length coder 20 and is transferred to the receiver together with the image signal which is coded into an additional information form so as to be usable by a decoding system.
In the conventional encoder described above, the quantization step size of the quantizer is determined by the amount of data stored in the buffer.
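A buffer-driven rule of this kind can be illustrated with a minimal sketch. The linear mapping from buffer fullness to step size, the range of 1 to 31, and the function name are assumptions chosen for illustration only; they are not the rule used by quantization controller 30.

```python
def step_size_from_buffer(bits_in_buffer, buffer_capacity, qs_min=1, qs_max=31):
    """Map buffer fullness to a quantization step size (assumed linear rule)."""
    fullness = bits_in_buffer / buffer_capacity
    fullness = min(max(fullness, 0.0), 1.0)  # clamp to the range [0, 1]
    # A fuller buffer yields a coarser step, so fewer bits are generated.
    return qs_min + round(fullness * (qs_max - qs_min))


qs_empty = step_size_from_buffer(0, 1000)     # finest step, buffer is empty
qs_half = step_size_from_buffer(500, 1000)    # intermediate step
qs_full = step_size_from_buffer(1000, 1000)   # coarsest step, buffer is full
```

A rule of this form reacts only to the buffer state and takes no account of picture content or of the visual characteristic of the human eye.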
In addition to the above described method, U.S. Pat. No. 5,038,209 discloses a video encoder for determining a quantization step size using the complexity and simplicity of a picture and the data storing state of a buffer.
A data encoding method using quantization is a non-restoring coding method where the restored data does not precisely accord with the data before coding. Therefore, the quantization step size of a quantizer is an important factor affecting the quality of the restored image.
However, in the conventional methods, an algorithm for determining the quantization step size of the quantizer using the various factors which greatly affect the image quality is complicated and is very difficult to implement. Accordingly, a system using data encoding having a high compression rate like HD-TV may have an unstable image quality in the restored image.
U.S. Pat. No. 5,077,798 discloses a voice coding method and system for providing reproduced high quality voice despite using a high data compression rate. The system of the latter patent uses a fuzzy control technique and is based on vector quantization. The system expresses the distance between the code vector nearest an input voice and the adjacent vector as a membership function, and the aural signal is vector-quantized. However, a method for determining the quantization step size by a fuzzy control method in a scalar quantization, as a method of the present invention which will be described hereinafter, is not heretofore disclosed.