The present invention relates to an image decoding apparatus and an image coding apparatus and, more particularly, to those using hardware circuits which realize high-speed decoding and coding in object units.
As the digitization of image data advances, apparatuses for compressive coding, transmission, and extensive decoding of image data have been put to practical use. MPEG2, an international standard, is generally used as the compressive coding method, and various LSI devices that perform MPEG2-compliant coding or decoding have been released.
Hereinafter, an example of a conventional image decoding apparatus performing the MPEG2 decoding will be described with reference to the drawings (refer to "MPEG AV Decoder LSI for Digital Broadcasting", Hirotoshi Uehara, Shoichi Goto, et al., Matsushita Technical Journal Vol. 45, No. 2, April 1999, pp. 17-24).
FIG. 9 is a block diagram illustrating the construction of a decoding LSI 800 which is a decoding apparatus adaptable to the MPEG2.
The decoding LSI 800 includes a setup processor 801 for controlling the respective components of the decoding LSI 800; a stream IF 809 for receiving a bitstream obtained by subjecting digital image data to MPEG2 coding; a variable-length decoding engine 802 for subjecting the bitstream to variable-length decoding; and an IDCT engine 803 for subjecting frequency-domain image data obtained by the variable-length decoding to inverse discrete cosine transform (IDCT) to generate space-domain image data. The decoding LSI 800 generates reproduced image data on the basis of the space-domain image data and predictive image data.
The decoding LSI 800 further includes a motion compensation engine 804 for subjecting the reproduced image data to motion compensation to generate the above-mentioned predictive image data; a memory 806 for storing the bitstream, the space-domain image data, the predictive image data, and the reproduced image data; a memory controller 805 for controlling access to data stored in the memory 806; a video IF 808 for outputting the reproduced image data to a display unit (not shown); and an I/O control processor 807 for controlling the video IF 808 on the basis of a control signal from the memory controller 805.
The variable-length decoding engine 802, the IDCT engine 803, and the motion compensation engine 804 are respectively constituted by hardware circuits.
Next, the operation of the decoding LSI 800 will be described.
When a bitstream obtained by subjecting digital image data to MPEG2 compressive coding is input to the stream IF 809, the bitstream is stored in the memory 806 through the memory controller 805.
In the setup processor 801, the header of the bitstream stored in the memory 806 is detected, and decoding on this part is started. Although this decoding on the bitstream is performed according to the MPEG2 decoding procedure, the setup processor 801 basically performs decoding on the header and general control as a sequencer.
Decoding on the part following the header of the bitstream is sequentially performed by the variable-length decoding engine 802, the IDCT engine 803, and the motion compensation engine 804. The result of the decoding, i.e., reproduced image data, is temporarily stored in the memory 806.
The video IF 808 reads the already-decoded image data (reproduced image data) from the memory 806 according to a display time, under control of the I/O control processor 807, and outputs it to the display unit.
Variable-length decoding, IDCT, and motion compensation are performed by dedicated engines because each of these processes is a fixed, simple process with few conditional branches and a considerable computational load.
Since the calculations with considerable arithmetic loads are performed by the dedicated engines and the respective engines are arranged so that the flow of data between these engines goes along the arithmetic processes in the decoding, a small-scale LSI capable of high-speed processing is realized.
Recently, the MPEG4 coding, which is suitable for low-bitrate transmission and is able to perform high-performance image processing, has been standardized.
The MPEG4 coding differs from the MPEG2 coding in that it introduces the concept of object coding. In object coding, an image is divided into objects such as a foreground and a background; compressive coding, data transmission, and extensive decoding are performed object by object; and the decoded image data corresponding to the respective objects are composited for display. The data subjected to object coding are texture data indicating the luminance or chrominance of an image, which correspond to MPEG2 image data, and shape data indicating the shape of the image.
FIG. 10 is a diagram illustrating functional blocks for realizing an algorithm for decoding a bitstream which is obtained by compressively coding digital image data according to the MPEG4 coding.
In FIG. 10, reference numeral 900 denotes a decoding apparatus for decoding a bitstream including coded texture data and coded shape data. This decoding apparatus 900 includes a decoder 90 for decoding a bitstream corresponding to a foreground to output decoded texture data and decoded shape data; and a decoder 9 for decoding a bitstream corresponding to a background to output decoded texture data and decoded shape data. Further, reference numeral 92 denotes an image in a texture image space comprising the decoded texture data outputted from the decoder 90, and 93 denotes an image in a shape image space comprising the decoded shape data outputted from the decoder 90. Further, reference numeral 94 denotes an image in a texture image space comprising the decoded texture data outputted from the decoder 9, and 95 denotes an image in a shape image space comprising the decoded shape data outputted from the decoder 9.
The decoding apparatus 900 further includes a composition means 91 for generating composite image data corresponding to a composite image 96 which is obtained by superimposing the foreground on the background, on the basis of the decoded texture data and decoded shape data outputted from the respective decoders 90 and 9.
The decoder 90 further includes a variable-length decoding means 901 for subjecting the bitstream corresponding to the foreground to variable-length decoding, and outputting compressed texture data, compressed motion vector information, and arithmetically-coded shape data; and a motion vector decoding means 904 for decoding the compressed motion vector information to output a motion vector.
Further, the decoder 90 includes an inverse quantization means 902 for subjecting the compressed texture data to inverse quantization; an inverse DCT means 903 for subjecting the inversely-quantized data to inverse DCT to output space-domain texture data; and an addition means 911 for adding the space-domain texture data and predictive texture data to output decoded texture data. Furthermore, the decoder 90 includes a padding means 906 for padding the decoded texture data; a memory 907 for storing the output from the padding means 906; and a motion compensation means 905 for motion-compensating the padded texture data stored in the memory 907 on the basis of the motion vector to generate the above-mentioned predictive texture data.
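The texture path described above (inverse quantization, inverse DCT, and addition of predictive texture data) can be illustrated with a minimal sketch. The sketch is not taken from the described apparatus: the 8x8 block size, the uniform quantization step `qstep`, the 8-bit clipping range, and all function names are assumptions made for illustration only.

```python
import math

N = 8  # block size assumed for this sketch (8x8 pixels)


def inverse_quantize(levels, qstep):
    """Uniform inverse quantization: scale each level by one step size.
    (Assumed scheme; the actual quantizer is more elaborate.)"""
    return [[lv * qstep for lv in row] for row in levels]


def idct_2d(coeffs):
    """Naive orthonormal 2-D inverse DCT of an N x N block (slow, illustrative)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            s = 0.0
            for v in range(N):
                for u in range(N):
                    s += (c(u) * c(v) * coeffs[v][u]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[y][x] = s
    return out


def reconstruct(levels, qstep, prediction):
    """Add the space-domain residual to the prediction and clip to 0..255."""
    residual = idct_2d(inverse_quantize(levels, qstep))
    return [[max(0, min(255, int(round(residual[y][x])) + prediction[y][x]))
             for x in range(N)] for y in range(N)]
```

In this sketch, `reconstruct` plays the role of the addition means 911: its output corresponds to the decoded texture data that would then be padded and stored for motion compensation.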
Moreover, the decoder 90 includes a shape arithmetic decoding means 908 for subjecting the arithmetically-coded shape data to arithmetic decoding on the basis of predictive shape data to output decoded shape data; a memory 910 for storing the decoded shape data; and a motion compensation means 909 for motion-compensating the decoded shape data stored in the memory 910 on the basis of the motion vector to generate the above-mentioned predictive shape data.
The construction of the decoder 9 is identical to that of the decoder 90 and, therefore, does not require repeated description.
Next, the operation will be described.
When the bitstream corresponding to the foreground is input to the decoder 90 and the bitstream corresponding to the background is input to the decoder 9, the respective decoders 90 and 9 decode the coded texture data and the coded shape data included in the bitstream. Thereby, the decoder 90 outputs decoded texture data and decoded shape data corresponding to the foreground, and the decoder 9 outputs decoded texture data and decoded shape data corresponding to the background.
The composition means 91 composites the decoded texture data between the foreground and the background on the basis of the decoded shape data of the foreground and the background, and outputs the composite image data to the display unit (not shown), whereby the composite image 96 is displayed.
When the foreground and the background are composited, the shape information of these objects is required, so the decoded texture data of the respective objects as well as the decoded shape data are supplied to the composition means 91.
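The shape-based composition described above can be sketched as follows. This is a simplified illustration rather than the composition means 91 itself: it assumes binary shape data (nonzero means the pixel belongs to the foreground object), a background that covers the whole frame, and images held as lists of pixel rows. The function name `composite` is hypothetical.

```python
def composite(fg_texture, fg_shape, bg_texture):
    """Take the foreground pixel where the foreground shape value is nonzero,
    and the background pixel everywhere else."""
    return [[fg if inside else bg
             for fg, inside, bg in zip(fg_row, shape_row, bg_row)]
            for fg_row, shape_row, bg_row in zip(fg_texture, fg_shape, bg_texture)]
```

The per-pixel selection depends only on the shape value at that position, which is why the decoded shape data must accompany the decoded texture data.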
In order to perform the object-by-object decoding described above, the shape information corresponding to the respective objects and the texture information corresponding to a non-rectangular object such as the foreground are required. Therefore, the functional block (decoder 90) adaptable to the MPEG4 decoding requires two means in addition to the inverse quantization means 902, the inverse DCT means 903, the motion vector decoding means 904, the texture motion compensation means 905, and the memory 907, which are also included in the functional block adaptable to the MPEG2 decoding: the shape arithmetic decoding means 908 for performing arithmetic decoding to decode the shape information, and the padding means 906 for padding the decoded texture data corresponding to the arbitrarily shaped foreground so that the foreground has a rectangular shape.
However, the MPEG2 decoding LSI shown in FIG. 9 cannot efficiently perform the MPEG4 decoding shown in FIG. 10.
That is, although the variable-length decoding, inverse DCT, and motion compensation are common between the MPEG2 and the MPEG4, the MPEG2 functional block does not have means for performing the processes of padding, shape-decoding, and composition which are newly introduced into the MPEG4 processing.
It is thought that these processes may be implemented by a general-purpose processor like the setup processor 801 included in the MPEG2 decoding LSI. However, these processes include many branches performing different arithmetic operations depending on the conditions and, further, there occur many accesses to data in the processor. Therefore, high-speed decoding cannot be achieved by the decoding LSI. Furthermore, a decoding LSI having a high operation frequency is required to realize the MPEG4 decoding, and this increases the cost of the decoding apparatus.
The present invention is made to solve the above-described problems and has an object to provide an image decoding apparatus that can perform high-speed decoding on a bitstream corresponding to plural objects such as images, which are compressively coded according to the MPEG4 coding method, and that can minimize the cost of hardware circuits performing the decoding process.
Another object of the present invention is to provide an image coding apparatus that can perform high-speed MPEG4 coding on digital data corresponding to plural objects such as images, and that can minimize the cost of hardware circuits performing the coding process.
Other objects and advantages of the invention will become apparent from the detailed description that follows. The detailed description and specific embodiments described are provided only for illustration since various additions and modifications within the scope of the invention will be apparent to those of skill in the art from the detailed description.
According to a first aspect of the present invention, there is provided an image decoding apparatus for decoding coded image data which includes coded texture data obtained by coding texture data expressing the luminance or chrominance of an image, and coded shape data obtained by coding shape data expressing the shape of the image, thereby generating decoded image data including decoded texture data and decoded shape data. This apparatus comprises arithmetic decoding means for subjecting the coded shape data to arithmetic decoding to output the decoded shape data; padding means for padding the pixel values of pixels positioned outside a target image to be decoded, in an image space including the target image, which image space is constituted by the decoded texture data; composition means for compositing the decoded texture data of the target image and texture data of another image; at least one of the arithmetic decoding means, the padding means, and the composition means being constituted by a hardware circuit; and a processor for controlling the hardware circuit.
According to a second aspect of the present invention, there is provided an image decoding apparatus including an arithmetic decoding means which comprises a hardware circuit, performs arithmetic decoding on coded shape data obtained by performing arithmetic coding on shape data expressing the shape of an image, and outputs decoded shape data. The hardware circuit comprises a probability calculator for calculating the probability that a target pixel to be subjected to arithmetic decoding has a predetermined pixel value, in a shape image space corresponding to the shape data, on the basis of the pixel values of plural pixels which have already been subjected to arithmetic decoding; an arithmetic decoder for calculating the pixel value of the target pixel on the basis of the coded shape data, and the probability of the target pixel which is output from the probability calculator; and a data output unit for outputting the pixel values outputted from the arithmetic decoder, for every predetermined number of pixels at the same time.
According to a third aspect of the present invention, in the image decoding apparatus of the second aspect, the hardware circuit constituting the arithmetic decoding means allows parallel processing among calculation of probability by the probability calculator, calculation of pixel values by the arithmetic decoder, and output of pixel values by the data output unit.
According to a fourth aspect of the present invention, in the image decoding apparatus of the second aspect, the hardware circuit constituting the arithmetic decoding means performs calculation of probability by the probability calculator, calculation of pixel values by the arithmetic decoder, and output of the pixel values by the data output unit, for every predetermined number of pixels.
According to a fifth aspect of the present invention, in the image decoding apparatus of the second aspect, the data output unit has a data storage for storing the pixel values outputted from the arithmetic decoder, for every predetermined number of pixels as a unit, and the unit of pixels to be stored in the data storage is equivalent to the data width which is the number of data to be parallel-accessed to a processor controlling the hardware circuit or a memory storing the coded shape data and the decoded shape data.
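Matching the storage unit to the data width can be sketched as packing decoded binary pixel values into words before they are written out. This is an illustrative assumption about one possible layout: the default 32-bit width, the most-significant-bit-first ordering, the zero fill of a final partial word, and the name `pack_pixels` are not specified in the text.

```python
def pack_pixels(pixels, width=32):
    """Group binary pixel values into integers of `width` bits.
    The first pixel of each group lands in the most significant bit;
    a final partial group is filled with zeros (assumption)."""
    words = []
    for start in range(0, len(pixels), width):
        group = pixels[start:start + width]
        group = group + [0] * (width - len(group))
        word = 0
        for bit in group:
            word = (word << 1) | (bit & 1)
        words.append(word)
    return words
```

With this arrangement each stored unit can be transferred in a single access of the assumed width instead of one access per pixel.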
According to a sixth aspect of the present invention, in the image decoding apparatus of the second aspect, the data output unit has a data storage for storing the pixel values outputted from the arithmetic decoder, for every predetermined number of pixels as a unit, and the unit of pixels to be stored in the data storage is a multiple of the number of pixels in one pixel line in a rectangular image space comprising a predetermined number of pixels as a unit of the arithmetic decoding.
According to a seventh aspect of the present invention, in the image decoding apparatus of the second aspect, the data output unit comprises a data storage for storing the pixel values outputted from the arithmetic decoder, for every predetermined number of pixels as a unit of storage; and a shape information decision circuit for deciding whether or not the pixel values constituting the unit of storage are pixels outside the image.
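The decision made on each unit of storage can be sketched as follows. The sketch assumes that a shape value of 0 marks a pixel outside the image and that the decision is reported as a flag alongside each unit; the function name `store_units` and the tuple layout are hypothetical.

```python
def store_units(pixels, unit_size):
    """Split decoded shape pixels into storage units and attach a flag that is
    True when every pixel in the unit lies outside the image (all values 0)."""
    units = []
    for start in range(0, len(pixels), unit_size):
        unit = pixels[start:start + unit_size]
        units.append((unit, not any(unit)))
    return units
```

A flag of this kind would let later stages skip units that contain no object pixels without inspecting the individual pixel values again.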
According to an eighth aspect of the present invention, in the image decoding apparatus of the first aspect, the processor decides a padding method according to the inputted shape data, and outputs information indicating the decided padding method to the padding means, and the padding means performs padding on the basis of the decided padding method.
According to a ninth aspect of the present invention, in the image decoding apparatus of the first aspect, the padding means performs padding with, as a unit, a multiple of the number of pixels in one pixel line in a rectangular image space comprising a predetermined number of pixels as a unit of the arithmetic decoding.
According to a tenth aspect of the present invention, there is provided an image decoding apparatus for decoding coded image data which includes coded texture data obtained by coding texture data expressing the luminance or chrominance of an image, and coded shape data obtained by coding shape data expressing the shape of the image, thereby generating decoded image data including decoded texture data and decoded shape data. This apparatus includes padding means comprising a hardware circuit, for padding the pixel values of pixels positioned outside the image, in an image space comprising the decoded texture data. The hardware circuit constituting the padding means comprises a pointer controller for deciding whether each pixel is a pixel inside the image or a pixel outside the image in the image space, using the decoded shape data, and indicating pixels to be used for padding; an average calculator for calculating the average of the pixel values of the pixels indicated by the pointer controller; and a data processor for generating padding pixel values on the basis of the pixel values of the pixels indicated by the pointer controller, the average calculated by the average calculator, and the decoded shape data and decoded texture data, and padding the pixel values of pixels to be padded with the padding pixel values.
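The interplay of the pointer controller, the average calculator, and the data processor can be illustrated by padding a single pixel row. The sketch below is a simplified software model under stated assumptions, not the hardware circuit: shape value 0 marks an outside pixel, an outside pixel lying between two inside pixels receives their integer average (rounded down), an outside pixel with an inside pixel on only one side copies that pixel, and a row with no inside pixels is left unchanged for a later pass. The name `pad_row` is hypothetical.

```python
def pad_row(texture, shape):
    """Pad one pixel row; shape[i] != 0 marks a pixel inside the object."""
    n = len(texture)
    inside = [i for i in range(n) if shape[i]]
    if not inside:
        return list(texture)  # no source pixels in this row; left unchanged

    out = list(texture)
    for i in range(n):
        if shape[i]:
            continue  # inside pixels keep their decoded value
        # "Pointer" step: nearest inside pixel on each side, if any.
        left = max((j for j in inside if j < i), default=None)
        right = min((j for j in inside if j > i), default=None)
        if left is not None and right is not None:
            # "Average" step: both neighbours exist (integer average assumed).
            out[i] = (texture[left] + texture[right]) // 2
        elif left is not None:
            out[i] = texture[left]
        else:
            out[i] = texture[right]
    return out
```

For example, a row with texture values `[0, 10, 0, 0, 30, 0]` and shape values `[0, 1, 0, 0, 1, 0]` is padded to `[10, 10, 20, 20, 30, 30]` under these assumptions.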
According to an eleventh aspect of the present invention, in the image decoding apparatus of the tenth aspect, the padding means allows parallel processing among designation of pixels by the pointer controller, average calculation by the average calculator, and padding of pixel values by the data processor.
According to a twelfth aspect of the present invention, in the image decoding apparatus of the tenth aspect, the padding means performs padding for every block comprising a predetermined number of pixels in the image space, and the padding means is provided with a memory for storing already-decoded pixel values which are required for padding of blocks to be processed after a target block which is currently subjected to padding.
According to a thirteenth aspect of the present invention, in the image decoding apparatus of the first aspect, the composition means receives decoded texture data corresponding to a target image to be decoded, decoded shape data corresponding to the target image, and texture data corresponding to another image to be used for composition, and composites the decoded texture data of the target image and the texture data of the other image on the basis of the decoded shape data, and outputs composite texture data.
According to a fourteenth aspect of the present invention, in the image decoding apparatus of the first aspect, the composition means composites, as a single unit, images to be displayed at the same time.
According to a fifteenth aspect of the present invention, there is provided an image coding apparatus for subjecting texture data expressing the luminance or chrominance of an image and shape data expressing the shape of the image to coding including object decoding, thereby outputting coded shape data and coded texture data and generating object decoded image data including object decoded shape data and object decoded texture data. This apparatus comprises arithmetic coding means for subjecting the shape data to arithmetic coding including object arithmetic decoding, thereby outputting the coded shape data and generating the object decoded shape data; padding means for padding the pixel values of pixels positioned outside the image, in an image space comprising the object decoded texture data; at least one of the arithmetic coding means and the padding means being constituted by a hardware circuit; and a processor for controlling the hardware circuit.
According to a sixteenth aspect of the present invention, there is provided an image coding apparatus for subjecting texture data expressing the luminance or chrominance of an image and shape data expressing the shape of the image to coding including object decoding, thereby outputting coded shape data and coded texture data and generating object decoded image data including object decoded shape data and object decoded texture data. This apparatus includes padding means comprising a hardware circuit, for padding the pixel values of pixels positioned outside the image, in an image space comprising the object decoded texture data. The hardware circuit constituting the padding means comprises a pointer controller for deciding whether each pixel is a pixel inside the image or a pixel outside the image by using the object decoded shape data, and indicating pixels to be used for padding; an average calculator for calculating the average of the pixel values of the pixels indicated by the pointer controller; and a data processor for generating padding pixel values on the basis of the pixel values of the pixels indicated by the pointer controller, the average calculated by the average calculator, and the object decoded shape data and object decoded texture data, and padding the pixel values of pixels to be padded with the padding pixel values.