This invention relates to a shifter stage for a variable-length digital code decoder.
A certain number of data storage and data transmission devices use data coding which produces variable-length digital codes. These codes are then stored or transmitted one after another without any special separator. On decoding, each code is recognized by a logical unit and one of the read operations performs a shift, on the input data, corresponding to the number of bits contained in the decoded code.
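As an illustration of decoding codes that are stored without separators, the following is a minimal sketch. The code table is hypothetical (it is not taken from any standard), and the bit stream is modelled as a string of '0' and '1' characters for readability.

```python
# Hypothetical prefix-free code table, for illustration only.
CODE_TABLE = {"0": "A", "10": "B", "110": "C", "111": "D"}
MAX_LEN = max(len(code) for code in CODE_TABLE)  # longest code, in bits


def decode_stream(bits):
    """Decode a string of '0'/'1' characters that contains no separators."""
    symbols = []
    pos = 0  # number of bits already consumed
    while pos < len(bits):
        window = bits[pos:pos + MAX_LEN]  # word as long as the longest code
        for length in range(1, len(window) + 1):
            symbol = CODE_TABLE.get(window[:length])
            if symbol is not None:
                symbols.append(symbol)
                pos += length  # shift by the length of the decoded code
                break
        else:
            raise ValueError(f"no valid code at bit {pos}")
    return symbols
```

Each recognized code advances the read position by exactly its own length, which is the shift referred to above.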
The role of shifter stages, such as the one that is the object of this invention, is to assist in decoding the variable-length digital codes.
Techniques for transmitting and storing digitized pictures make it possible to significantly improve the quality of the final pictures obtained, as compared to analog transmission. The applications of these techniques can also therefore be multiplied.
However, direct transmission and storage of moving digitized pictures requires an extremely high bit rate which in practice calls for these pictures to be compressed and coded. The digitized pictures are therefore coded prior to transmission so as to reduce the amount of data that they represent, and decoded after transmission.
The coding and decoding techniques are of course crucial to the final picture quality obtained, and it became apparent that some standardization would be required to ensure compatibility between the different equipment using these techniques.
Accordingly, a group of experts (known as the Moving Picture Experts Group or "MPEG") drew up ISO Standard 11172. This standard, often referred to as MPEG, defines coding and decoding conditions of moving pictures, possibly associated with a sound signal, which can be used for storing and recalling pictures from memory and transmitting them.
This MPEG standard can be used to store pictures on compact discs, interactive compact discs, magnetic tapes, and to transmit pictures over local area networks and telephone lines as well as to transmit TV pictures through the air. For a full, detailed description of the entire technique, the reader is invited to read the MPEG standards which are referenced below.
Compressing data according to the MPEG standard may follow several different procedures. Consecutive pictures are collected into groups of images, and a sequence is thus subdivided into groups of images. Each image is divided into sections and each section is broken down into macro-blocks which constitute the base element used to apply movement compensation and to change, where necessary, the quantization scale.
The macro-blocks are formed from a 16×16 matrix of picture elements (pixels). Each macro-block is divided into six blocks, the first four blocks carrying a brightness signal, and the other two blocks a chrominance signal, respectively blue and red. Each of these six blocks is defined as an 8×8 matrix of picture elements (pixels). Given the analogies existing between the information contained in the different images in a given sequence and in order to reduce the quantity of information stored or transmitted, different types of image are defined within each sequence.
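The division of a macro-block into six blocks can be sketched as follows. This assumes that each chrominance signal has already been subsampled to a single 8×8 matrix per macro-block; the function name and argument layout are hypothetical.

```python
def split_macroblock(luma, cb, cr):
    """Return the six 8x8 blocks of one macro-block.

    luma: 16x16 brightness samples, as a list of 16 rows.
    cb, cr: 8x8 blue and red chrominance samples (assumed already subsampled).
    """
    def sub_block(rows, r0, c0):
        return [row[c0:c0 + 8] for row in rows[r0:r0 + 8]]

    # Four brightness blocks: top-left, top-right, bottom-left, bottom-right.
    blocks = [sub_block(luma, r, c) for r in (0, 8) for c in (0, 8)]
    blocks.append(cb)  # fifth block: blue chrominance
    blocks.append(cr)  # sixth block: red chrominance
    return blocks
```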
I pictures (Intra frames) are pictures which are coded as a still image and therefore without reference to another image.
P images (Predicted) are deduced starting from the I or P image previously reconstructed.
B images (Bi-directional frames) are deduced from two reconstructed images, one I and one P or two P, one just before and the other just after.
It should be stressed that the images in a sequence are transmitted in the order of decoding and not generally in the order in which they are presented at the time of acquisition or restitution.
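The difference between presentation order and decoding order can be sketched as follows. This is a simplified model that assumes every B image depends on the nearest I or P image on each side; the function name is hypothetical.

```python
def transmission_order(display_types):
    """Map a display-order list of image types ('I', 'P', 'B') to the
    indices of the images in the order in which they would be decoded."""
    order = []
    pending_b = []  # B images waiting for the reference that follows them
    for index, kind in enumerate(display_types):
        if kind == "B":
            pending_b.append(index)
        else:
            # An I or P image is sent first, then the B images that need it.
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing B images, kept here for simplicity
    return order
```

For a display order I B B P B B P, this sketch yields the index order 0, 3, 1, 2, 6, 4, 5: each reference image precedes the B images deduced from it.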
The Discrete Cosine Transformation (DCT) is applied on the block level. This DCT transformation transforms the spatial blocks, defined as indicated above as an 8×8 matrix of pixels, into frequency-domain blocks, likewise formed as an 8×8 matrix, of spatial frequencies.
It has been found that in the 8×8 matrix of the frequency-domain block, the continuous background coefficient (DC) placed in the upper left hand corner of the matrix is much more important in terms of the visual impression obtained than the other components corresponding to different frequencies.
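The transformation of one 8×8 block can be sketched with the textbook two-dimensional DCT-II formula. This is a direct, unoptimized evaluation for illustration only; the normalization used here (a factor 1/4 and 1/√2 for index zero) is the conventional one and is an assumption as far as this text is concerned.

```python
import math


def dct_8x8(block):
    """Direct 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):          # vertical frequency index
        for v in range(8):      # horizontal frequency index
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out
```

For a uniform block, all the energy ends up in the upper left hand coefficient, which is the continuous background (DC) term.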
More precisely, the higher the frequency, the less sensitive the eye is to it. This is why the frequency levels are quantized, and all the more coarsely as the frequencies are higher. This quantization is ensured by an algorithm that is not imposed by the standard, and which could be a quantization and variable length coding (VLC) operation.
The matrix in the frequency domain obtained by the DCT transformation is next processed by a matrix called "quantization matrix" which is used to divide each of the terms of the frequency-domain matrix by a value that is linked to its position, and which takes account of the fact that the weight of the different frequencies presented by these coefficients is variable.
After each value has been rounded to the closest integer value, this operation results in a large number of coefficients equal to zero.
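The division by the quantization matrix followed by rounding can be sketched as follows. The matrix values are hypothetical (a step that simply grows with frequency), and the tie-breaking rule for rounding is an assumption, since the text only says "closest integer".

```python
import math

# Hypothetical quantization matrix: the step grows with spatial frequency.
QUANT = [[8 + 4 * (u + v) for v in range(8)] for u in range(8)]


def round_half_away(x):
    """Round to the nearest integer, ties away from zero (assumed rule)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(coeffs, quant=QUANT):
    """Divide each coefficient by the step linked to its position, then round."""
    return [[round_half_away(coeffs[u][v] / quant[u][v]) for v in range(8)]
            for u in range(8)]
```

Small high-frequency coefficients fall below half a quantization step and therefore round to zero.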
It should be stressed that for the intra macro-blocks, the quantization value of the DC coefficient is constant, for example 8. The non-zero frequency coefficients are then coded according to zigzag type scanning with reference to a Huffman table, which gives a variable-length coded value to each of the coefficients of the matrix and reduces the volume. Preferably, the coefficients representing the continuous backgrounds are transmitted after quantization and, in addition, the quantization matrix is optimized, in such a way that the volume of data is under a predetermined level which corresponds to the maximum storage or transmission possibilities, without any serious reduction in the quality of information transmitted.
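The zigzag scanning of a quantized block can be sketched as follows. Grouping each non-zero coefficient with the count of zeros that precede it (a run/level pair) is how such scans are commonly prepared for the variable-length table; that pairing is an added detail not spelled out in the text above, and the function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zigzag order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk the anti-diagonals; alternate direction on each one.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def run_level_pairs(block):
    """List (zero_run, level) pairs for the non-zero AC coefficients."""
    pairs = []
    run = 0
    for r, c in zigzag_order(len(block))[1:]:  # position 0 is the DC term
        level = block[r][c]
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs
```

Each pair would then be looked up in the variable-length code table; trailing zeros produce no pair at all, which is where most of the reduction in volume comes from.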
Type I frames are coded without use of the movement vector. Conversely, P and B frames use movement vectors, at least for certain macro-blocks which make up these frames, allowing coding efficiency to be increased and indicating from which part of the reference image(s) a particular macro-block of the considered frame must be deduced.
The search for the movement vector is the object of optimization at the time of coding, and the movement vector is itself coded by using the DPCM technique, which best exploits the existing correlation between the movement vectors of the different macro-blocks of a given image. The movement vectors are finally subjected to variable-length coding (VLC).
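The DPCM principle applied to movement vectors can be sketched as follows: each vector is replaced by its difference from the previous one, so that correlated vectors give small values. The initial predictor of (0, 0) and the absence of any reset rule are simplifying assumptions.

```python
def dpcm_encode(vectors):
    """Replace each (dx, dy) vector by its difference from the previous one."""
    previous = (0, 0)  # assumed initial predictor
    differences = []
    for vector in vectors:
        differences.append((vector[0] - previous[0], vector[1] - previous[1]))
        previous = vector
    return differences


def dpcm_decode(differences):
    """Rebuild the vectors by accumulating the transmitted differences."""
    previous = (0, 0)
    vectors = []
    for diff in differences:
        previous = (previous[0] + diff[0], previous[1] + diff[1])
        vectors.append(previous)
    return vectors
```

The small differences are then the values submitted to variable-length coding.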
All the data concerning a coded sequence form the bit stream that is either recorded or transmitted. Such a bit stream begins with a sequence header containing a certain amount of information and parameters whose values are maintained throughout the sequence.
Likewise, the sequence is broken down into groups of frames, each of these groups is preceded by a group header and the data representing each frame are themselves preceded by a frame header.
Coding of moving pictures according to the MPEG standard includes such variable-length coding and therefore requires the use of shifter stages, these being the object of this invention.
Shifter stages for decoding variable-length digital codes at one code per clock cycle have up until now followed the design shown in FIG. 1. An input memory block 1 acquires input data made up of M bits from an upstream memory (not shown). Barrel shift register 2, whose shift is commanded by adder 3, supplies logical unit 4 with a word of length w, previously defined as equal to the maximum length of a variable-length code to be decoded. Because of the shift value m+1 defined by adder 3, this word comes immediately after those digital data presented by input memory block 1 that have already been decoded.
Logical unit 4 decodes the first identifiable code from the word it receives and sends it to memory unit 5. The units 4 and 5 form a finite state machine which can also be implemented using a Programmable Logic Array (PLA). Adder 3 cooperates with a memory block 6 in such a way that it receives from memory block 5 the length m of the code decoded in the preceding cycle, and from cumulative memory block 6 the previous cumulative value of all the lengths of decoded codes. Adder 3 then commands barrel shift register 2 such that, as indicated above, it performs a shift corresponding to the cumulative length of all codes already decoded by logical unit 4 since the last acquisition of input data by input memory block 1.
When adder 3 overflows (msb=1), it commands a new read operation by input memory block 1.
Clock signal 8 coordinates all these operations. Thus in this prior art device, the addition by adder 3 of the lengths of codes previously decoded, the shifting by barrel shift register 2 of the corresponding values, and the logic decoding processing by logical unit 4 must all be performed successively during the same clock cycle.
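The behaviour of this prior art stage can be sketched with a simplified software model. The code table, the word size M = 8, the string representation of bits, and the assumption that the input register exposes two consecutive words are all hypothetical choices made for illustration; they are not taken from FIG. 1.

```python
# Hypothetical prefix-free code table, for illustration only.
CODE_TABLE = {"0": "A", "10": "B", "110": "C", "111": "D"}
MAX_LEN = max(len(code) for code in CODE_TABLE)  # word length w


def decode_window(window):
    """Model of the logical unit: return (symbol, length) of the first code."""
    for length in range(1, len(window) + 1):
        if window[:length] in CODE_TABLE:
            return CODE_TABLE[window[:length]], length
    raise ValueError("no valid code in window")


def decode_single_shifter(bits, m_bits=8):
    """One loop iteration models one clock cycle of the single-shifter stage."""
    padded = bits + "0" * (2 * m_bits)  # padding so the register can be filled
    words = [padded[i:i + m_bits] for i in range(0, len(padded), m_bits)]
    word_index = 0  # which input word is currently held
    acc = 0         # cumulative shift since the last read of input data
    length = 0      # length of the code decoded in the preceding cycle
    symbols = []
    while word_index * m_bits + acc + length < len(bits):
        acc += length                 # 1. addition of the previous length
        if acc >= m_bits:             #    overflow commands a new read
            acc -= m_bits
            word_index += 1
        register = words[word_index] + words[word_index + 1]
        window = register[acc:acc + MAX_LEN]     # 2. barrel shift
        symbol, length = decode_window(window)   # 3. logic decoding
        symbols.append(symbol)
    return symbols
```

The three numbered steps depend on one another and run back to back within each iteration, which mirrors the fact that addition, shifting and decoding must all complete within the same clock cycle.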
The total volume of data which must be processed is imposed by the standard that must be respected by the device in which the shifter stage that we have just described is included.
The speed of this processing depends on the production of the circuit and may be increased by increasing the number of gates on the produced circuit. This, however, would mean increasing the surface area of circuits and accepting higher power consumption.
The object of this invention is therefore to produce a shifter stage for decoding variable-length digital codes which, because of its structure, is faster in operation but avoids the use of a large number of gates.
A further object of the invention is to propose a shifter stage for decoding variable-length digital codes that is both efficient and reliable, requires a relatively small surface area of silicon, and consumes little electrical power.
To achieve this, the invention relates to a shifter stage for decoding variable-length digital codes decoding one code per clock cycle, and which reads input data arriving from a memory, supplies a decoding logical unit on each cycle with a word having the size of the longest variable-length code to be decoded, receives from the logical unit the number of bits of the code decoded on the preceding clock cycle, and performs a shift in the read data equal to the cumulative total of the lengths of codes already decoded since the last read of input data.
According to the invention, it comprises a first barrel shift register which reads the input data and performs a shift in the data read equal to the cumulative total of the lengths of codes decoded between the preceding cycle and the start of the last read, and a second barrel shift register which receives data arriving from the first register and performs a shift equal to the length of the code decoded at the time of the preceding cycle.
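The principle of the two successive shifts can be sketched with a simplified software model. As before, the code table, the word size M = 8, the string representation of bits and the two-word input register are hypothetical; the memory block placed between or after the registers in the embodiments is not modelled.

```python
# Hypothetical prefix-free code table, for illustration only.
CODE_TABLE = {"0": "A", "10": "B", "110": "C", "111": "D"}
MAX_LEN = max(len(code) for code in CODE_TABLE)  # word length w


def decode_window(window):
    """Model of the logical unit: return (symbol, length) of the first code."""
    for length in range(1, len(window) + 1):
        if window[:length] in CODE_TABLE:
            return CODE_TABLE[window[:length]], length
    raise ValueError("no valid code in window")


def decode_two_shifters(bits, m_bits=8):
    """One loop iteration models one clock cycle of the two-shifter stage."""
    padded = bits + "0" * (2 * m_bits)  # padding so the register can be filled
    words = [padded[i:i + m_bits] for i in range(0, len(padded), m_bits)]
    word_index = 0
    acc = 0     # cumulative lengths up to, but excluding, the last code
    length = 0  # length of the code decoded in the preceding cycle
    symbols = []
    while word_index * m_bits + acc + length < len(bits):
        register = words[word_index] + words[word_index + 1]
        # First barrel shift: uses the already stored cumulative total.
        stage1 = register[acc:acc + 2 * MAX_LEN]
        # Second barrel shift: uses only the length of the last code.
        window = stage1[length:length + MAX_LEN]
        symbol, new_length = decode_window(window)
        symbols.append(symbol)
        # The addition no longer precedes the shifts; it prepares the next cycle.
        acc += length
        if acc >= m_bits:  # overflow commands a new read of input data
            acc -= m_bits
            word_index += 1
        length = new_length
    return symbols
```

In this model neither shift has to wait for the addition of the current cycle, since the first shift amount is already stored and the second comes directly from the logical unit. The decoded output is identical to that of a single shifter driven by the full cumulative total.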
According to different preferred embodiments, the device of the invention comprises the following characteristics taken in any technically feasible combination:
a memory block is interposed between the first barrel shift register and the second barrel shift register, the data supplied by the second barrel shift register directly supplying the logical unit;
the data supplied by the first barrel shift register directly supply the second barrel shift register, a memory block being interposed between the second barrel shift register and the logical unit;
the second barrel shift register directly receives the length of the code decoded on the preceding cycle from the logical unit;
it comprises an adder associated with a memory block which receives on each cycle the length of the code decoded, calculates the cumulative total of these lengths since the last read of input data, commands the shift of the first register and the reading of a new input of data;
it is produced in the form of an integrated circuit by VLSI technology (Very Large Scale Integration technology);
it is intended for decoding a video signal coded according to the MPEG standard by Discrete Cosine Transformation (DCT) and quantization;
it is intended for decoding data recorded on an interactive digital compact disc;
it is intended for decoding data recorded on a magnetic tape;
it is intended for decoding data transmitted by radio waves.
The disclosed innovative circuits are advantageously included in an innovative video codec chip, which can operate according to MPEG1, MPEG2, or H261 standards. Notable features of this chip include:
applicable to all three standards;
separation of the calculation of the length of the codes from the coding of the codes; this permits an increase in the speed of the arrangement by optimizing the element 16 by itself;
in that the time allocated to the decoding of a code and to determination of next state of the apparatus is one complete clock cycle, which permits optimization of the silicon surface dedicated to this portion.