1. Field of the Invention
The present invention relates to a device for and a method of coding/decoding image information in which the image transferred from an image input device is divided, prior to encoding/decoding, into object and background images having predetermined shape information. More particularly, it relates to a device for and a method of coding/decoding image information devised to enhance coding/decoding and transmission efficiencies by merging a plurality of boundary blocks, by performing variable length coding according to the characteristics of each block as transformed by the merge, and/or by decoding the coded information.
2. Discussion of Related Art
As is generally known, image processing techniques, rather than coding an entire frame as a whole, typically divide an image of one frame into designated unit blocks having predetermined shape information, and each unit block is subsequently processed by compressive coding.
When a still picture is input, the image is divided into object and background images for later transmission. For a moving picture, the variations of the object image are transferred first. In this way, natural or artificial images can be freely composed and decomposed in units of object images, which enhances compressive coding and transmission efficiencies. An international standard based on unit blocks having shape information has been established by the International Organization for Standardization (hereinafter referred to as "ISO"), the International Telecommunication Union Telecommunication Standardization Sector (hereinafter referred to as "ITU-T"), and the like.
For example, ISO/IEC affiliated organizations are carrying out standardization projects: MPEG (Moving Picture Experts Group)-4 for moving picture compression in WG11 and JPEG (Joint Photographic Experts Group)-2000 for still picture compression, alongside H.263+, H.320 and H.331 in ITU-T.
MPEG-4 is based on the concept of shape information and will be described below.
The concept of a VOP (Video Object Plane) is used in MPEG-4 as a unit block having designated shape information.
The VOP is defined as a tetragon that includes object and background images divided from an input picture.
The keynote of MPEG-4 lies in the fact that when a picture has a designated object or object region, the object image is divided into VOPs, each of which will be separately encoded, transmitted, or decoded.
The concept of a VOP is used in processing object images in the field of computer graphics and multimedia such as Internet multimedia, interactive video games, interpersonal communications, interactive storage media, multimedia mailing, wireless multimedia, networked database services using an ATM (Asynchronous Transfer Mode) network and the like, remote emergency systems, and remote video surveillance.
FIG. 1 is a block diagram of the VM (Verification Model) encoder 100 first decided by international standardization affiliated organization (ISO/IEC JTC1/SC29/WG11 MPEG96/N1172 JANUARY).
As shown in FIG. 1, a VOP definition block 110 divides a picture sequence, to be transmitted or stored, into a unit object image, and defines different VOPs.
FIG. 2 shows a VOP having a "cat" picture as an object image.
As shown in FIG. 2, the horizontal size of the VOP is defined as "VOP width", and the vertical size as "VOP height". The VOP thus defined is then divided into (M×N) macro blocks consisting of M and N pixels along the X and Y axes. A grid starting point is set at the top left of the VOP. For example, the VOP is divided into (16×16) macro blocks having 16 pixels along each of the X and Y axes.
If the macro blocks formed in the right and bottom part of the VOP do not have M and N picture elements each along the X and Y axes, the VOP should be extended in size to contain M and N pixels respectively along the X and Y axes.
Both M and N are chosen to be even numbers so that encoding can be performed in a texture coding sub block, as described below.
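The grid extension described above amounts to rounding the VOP dimensions up to the next multiple of the macro block size. The following is a minimal sketch under the assumption that M = N = 16 by default; the function name `extend_vop` is hypothetical and does not come from the VM.

```python
def extend_vop(vop_width, vop_height, m=16, n=16):
    """Round the VOP size up to a whole number of M x N macro blocks.

    Returns (extended_width, extended_height, blocks_x, blocks_y).
    Hypothetical helper for illustration; not taken from the VM text.
    """
    if m <= 0 or n <= 0 or m % 2 or n % 2:
        raise ValueError("M and N must be positive even numbers")
    blocks_x = -(-vop_width // m)   # ceiling division along the X axis
    blocks_y = -(-vop_height // n)  # ceiling division along the Y axis
    return blocks_x * m, blocks_y * n, blocks_x, blocks_y
```

For instance, a VOP of 100 by 50 pixels would be extended to 112 by 64 pixels, giving a grid of 7 by 4 macro blocks.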
FIGS. 3a-3b illustrate a VOP formed by extracting an object image (having a designated shape) from an input picture, and divided into unit macro blocks.
As shown in FIGS. 3a-3b, the macro blocks forming the VOP comprise regions with object image information and regions having no object image information.
Referring to FIG. 3a, the respective macro blocks are divided into interior macro blocks having object image information, exterior macro blocks having no object image information, and boundary macro blocks partly including the object image information. Prior to coding or decoding, the macro blocks are divided into the above-mentioned classes.
Referring to FIG. 3b, before coding or decoding is performed, the boundary macro blocks are divided into interior sub blocks having object image information, exterior sub blocks having no object image information, and boundary sub blocks partly having object image information.
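The three-way classification can be sketched from a binary shape mask. This is an illustrative sketch only: it assumes the shape information is available as a list of rows of 0/1 flags (1 marking an object pixel), and it interprets "interior" as a block whose pixels all belong to the object. The function names are hypothetical.

```python
def classify_block(mask, x0, y0, width, height):
    """Classify one block of a binary shape mask.

    mask is a list of rows; 1 marks an object pixel, 0 a background pixel.
    Returns "interior", "exterior", or "boundary".
    ASSUMPTION: "interior" means every pixel of the block is an object pixel.
    """
    pixels = [mask[y][x]
              for y in range(y0, y0 + height)
              for x in range(x0, x0 + width)]
    object_pixels = sum(1 for p in pixels if p)
    if object_pixels == 0:
        return "exterior"
    if object_pixels == len(pixels):
        return "interior"
    return "boundary"


def classify_grid(mask, block=16):
    """Classify every block x block tile of a mask whose sides are multiples of block."""
    rows, cols = len(mask), len(mask[0])
    return [[classify_block(mask, x, y, block, block)
             for x in range(0, cols, block)]
            for y in range(0, rows, block)]
```

Calling `classify_grid` with `block=16` on the extended VOP yields the macro block classes; applying it with `block=8` to the mask of a single boundary macro block yields the sub block classes.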
The respective VOPs defined by the VOP definition block 110 are transferred into VOP coding blocks 120a, 120b, . . . , and 120n to perform a coding by VOPs. They are then multiplexed in a multiplexer 130 and transmitted as bit streams.
FIG. 4 is a block diagram of the VOP coding blocks 120a, 120b, . . . , and 120n of the VM encoder 100 as decided by international standardization affiliated organizations.
Referring to FIG. 4, a motion estimation block 121 receives the VOP concerning the respective object images in order to estimate motion information in the macro blocks from the VOP received.
The motion information estimated by the motion estimation block 121 is transferred into a motion compensation block 122.
An adder 123 receives the VOP, whose motion is compensated by the motion compensation block 122, and the value detected by the adder 123 is transferred into a texture coding block 124 for encoding texture information of the object as sub blocks.
For example, each of the (16×16) macro blocks is divided into (8×8) sub blocks comprising (M/2 × N/2) pixels along the X and Y axes of the macro block.
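The division of a macro block into four sub blocks can be sketched as follows. The representation (a list of N rows of M values) and the raster ordering of the returned sub blocks are assumptions made for illustration, and `split_into_sub_blocks` is a hypothetical name.

```python
def split_into_sub_blocks(macro_block):
    """Split an M x N macro block into four (M/2) x (N/2) sub blocks.

    macro_block is a list of N rows, each holding M values.
    The sub blocks are returned in the order top-left, top-right,
    bottom-left, bottom-right (an assumed ordering).
    """
    n = len(macro_block)
    m = len(macro_block[0])
    if m % 2 or n % 2:
        raise ValueError("M and N must be even")
    half_m, half_n = m // 2, n // 2
    return [[row[x0:x0 + half_m] for row in macro_block[y0:y0 + half_n]]
            for y0 in (0, half_n)
            for x0 in (0, half_m)]
```

With a (16×16) macro block, each of the four returned sub blocks holds 8 rows of 8 values.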
An adder 125 obtains the sum of the VOP motion-compensated by the motion compensation block 122 and the texture information encoded by the texture coding block 124. The output of the adder 125 is transferred into a previous reconstructed VOP block 126 for detecting the previous VOP, which is the VOP of the previous image.
The previous VOP detected by the previous reconstructed VOP block 126 is used in the motion estimation block 121 and the motion compensation block 122 so as to estimate and compensate the motion.
The VOP defined by the VOP definition block 110 is transferred into a shape coding block 127 for coding the shape information.
As indicated by dotted lines, the output of the shape coding block 127 is selectively transferred into the motion estimation block 121, the motion compensation block 122, or the texture coding block 124 for use in motion estimation, motion compensation, or encoding of the texture information of an object. This is determined by the application field of the VOP coding blocks 120a, 120b, . . . , and 120n.
The motion information estimated by the motion estimation block 121, the texture information encoded by the texture coding block 124, and the shape information encoded by the shape coding block 127 are multiplexed by a multiplexer 128 and transmitted as a bit stream into the multiplexer 130 shown in FIG. 1.
As shown in FIG. 5, a demultiplexer 210 of a VM decoder 200 divides the VOP signal encoded by the VM encoder 100 into VOPs. The respective VOP signals divided by the demultiplexer 210 are decoded into the original VOP picture by a plurality of VOP decoding blocks 220a, 220b, . . . , and 220n, and composed by a composition block 230.
FIG. 6 is a block diagram of the VOP decoding blocks 220a, 220b, . . . , and 220n in the VM decoder 200 as decided by international standardization affiliated organizations.
The VOP encoded signal is transferred from the demultiplexer 210 into a shape decoding block 221, a motion decoding block 222, and a texture decoding block 225, decoding the shape, motion, and texture information of the VOP.
The signal decoded by the motion decoding block 222 is motion-compensated by a motion compensation block 223 and reconstructed into the original VOP by a reconstructed VOP block 224.
The motion compensation block 223 compensates the motion of the current VOP by using the reconstructed image of the previous VOP transferred from a VOP memory 226. The reconstructed VOP block 224 reconstructs the VOP by using texture information of an object transferred from the motion compensation block 223 and the texture decoding block 225.
As indicated by dotted lines, the output of the shape decoding block 221 is selectively transferred into the motion compensation block 223 or the reconstructed VOP block 224 for use in either compensating the motion or reconstructing the VOP. This is determined by the application of the VOP decoding blocks 220a, 220b, . . . , and 220n.
The composition block 230 receives the reconstructed VOP from the reconstructed VOP block 224 of each of the VOP decoding blocks 220a, 220b, . . . , and 220n, composes the reconstructed VOPs received, and thereby reconstructs the original picture.
After an input picture is divided into designated unit blocks having predetermined shape information, each of the unit blocks is coded or decoded, which enhances compressive encoding and transmission efficiencies. The fundamental principle of this system is the same as that employed in other image processing systems.
A basic block, or a unit block, having predetermined shape information comprises luminance blocks representing luminance signals, and chrominance blocks representing chrominance signals. The chrominance signals further correspond to the luminance signals.
FIG. 7 presents a general overview of a macro block structure that is a basic block forming a VOP. FIGS. 7a-7c show the macro block structures of 4:2:0, 4:2:2, and 4:4:4 formats, respectively.
To code (or decode) the macro block, bits for coded block patterns are allotted to the luminance and chrominance blocks. In 4:2:0 format, the luminance block has four sub blocks and the chrominance block has two sub blocks.
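The allotment of coded block pattern bits can be illustrated by counting sub blocks per macro block. In the sketch below, only the 4:2:0 entry is stated in the text; the 4:2:2 and 4:4:4 entries are assumptions derived from the usual chroma sampling ratios, and the one-bit-per-sub-block allotment is likewise an illustrative assumption. The names are hypothetical.

```python
# Sub blocks per macro block as (luminance, chrominance).
# The 4:2:0 entry is stated in the text; the 4:2:2 and 4:4:4 entries are
# ASSUMED from the usual chroma sampling ratios.
SUB_BLOCKS = {
    "4:2:0": (4, 2),
    "4:2:2": (4, 4),
    "4:4:4": (4, 8),
}


def coded_block_pattern_bits(chroma_format):
    """Number of coded block pattern flags if one bit is allotted per sub block."""
    luminance, chrominance = SUB_BLOCKS[chroma_format]
    return luminance + chrominance
```

Under these assumptions a 4:2:0 macro block needs six flags, four for luminance and two for chrominance.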
However, this coding method presents a disadvantage. Specifically, the coding efficiency is low because, when boundary macro blocks are coded/decoded, VLC (Variable Length Coding) is performed irrespective of whether a sub block is a boundary, interior, or exterior sub block. Hereafter, a boundary sub block and an interior sub block will each be referred to as an "object block" in this document.
As shown in FIGS. 8a-8d, a luminance block comprising four sub blocks is coded/decoded by using a VLC coding/decoding table. The table, however, is made in consideration of all four sub blocks, even though the frequency with which each sub block arrangement occurs varies according to how the object blocks and exterior sub blocks are arranged.
For example, as shown in FIGS. 8a-8d, four object blocks can have only one arrangement; three object blocks may have four arrangements, two object blocks may have six arrangements, and one object block may have four arrangements.
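These counts are the number of ways to choose which of the four sub block positions hold object blocks. A short sketch that enumerates them is given below; the 0/1 tuple representation and the function name are illustrative assumptions.

```python
from itertools import combinations


def object_block_arrangements(num_object_blocks, num_sub_blocks=4):
    """List every way to place the object blocks among the sub block positions.

    Each arrangement is a tuple of 0/1 flags, one per sub block position
    (1 = object block, 0 = exterior sub block).
    """
    arrangements = []
    for positions in combinations(range(num_sub_blocks), num_object_blocks):
        arrangements.append(tuple(1 if i in positions else 0
                                  for i in range(num_sub_blocks)))
    return arrangements
```

Taking the length of the result for 4, 3, 2, and 1 object blocks gives 1, 4, 6, and 4 arrangements, matching the figures cited above.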
Korean Patent Application No. 95-37918 and ISO/IEC JTC1/SC29/WG11 N1469 "video VM version 5.0" disclose a method of enhancing coding/decoding efficiency by using different VLC tables according to the number of object blocks that form a luminance block. In a BBM (Boundary Block Merge) technique, disclosed in Korean Patent Application No. 96-27766, No. 96-27767, No. 96-38405, No. 97-04738, and No. 97-04739, a plurality of boundary macro blocks, or a plurality of sub blocks constituting the boundary macro blocks, are merged and coded/decoded.
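The idea of choosing a VLC table by object block count can be sketched as follows. The code only derives the table key (the number of object blocks) and the size of the pattern space that a table for that count would have to cover. It contains no code words, since the actual tables are defined in the cited documents and are not reproduced here. The function names and the flag representation are hypothetical.

```python
def luminance_pattern_for_vlc(sub_block_classes, coded_flags):
    """Keep coded block flags only for object blocks.

    sub_block_classes: class of each luminance sub block
        ("interior", "boundary", or "exterior").
    coded_flags: one 0/1 flag per sub block (1 = has coded coefficients).
    Returns (num_object_blocks, pattern), where pattern holds the flags of
    the object blocks in their original order.
    """
    if len(sub_block_classes) != len(coded_flags):
        raise ValueError("one flag is required per sub block")
    pattern = tuple(flag
                    for cls, flag in zip(sub_block_classes, coded_flags)
                    if cls != "exterior")
    return len(pattern), pattern


def pattern_space_size(num_object_blocks):
    """Number of distinct patterns a table for this object block count must cover."""
    return 2 ** num_object_blocks
```

A luminance block with two object blocks then needs a table covering only four patterns rather than sixteen, which is where the efficiency gain of a count-specific table would come from.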
FIG. 9 illustrates the BBM technique for sub blocks constituting a boundary macro block.
The BBM technique proposes varying the number of object blocks in order to enhance the coding/decoding efficiency of unit blocks (i.e., macro blocks) having predetermined shape information. It may be used in conjunction with the technique of selecting different VLC tables according to the number of object blocks. However, although the two techniques are closely related, they are performed separately, so that high coding/decoding and transmission efficiencies cannot be attained.
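How a merge changes the number of object blocks can be illustrated with a small sketch. The text does not spell out the merge condition, so the sketch assumes one possible formulation: two boundary sub blocks are merged when, after the second is rotated by 180 degrees, none of its object pixels coincides with an object pixel of the first. The function names and the 0/1 mask representation are hypothetical.

```python
def rotate_180(block):
    """Rotate a block (list of rows) by 180 degrees."""
    return [row[::-1] for row in block[::-1]]


def try_merge(shape_a, shape_b):
    """Merge two boundary sub block shape masks if their object pixels do not collide.

    shape_a, shape_b: equal-sized lists of rows of 0/1 flags (1 = object pixel).
    Returns the merged mask, or None when an object pixel of shape_a
    coincides with an object pixel of the rotated shape_b.
    ASSUMPTION: the 180-degree rotation and the no-overlap test are
    illustrative; the source text does not state the merge condition.
    """
    rotated_b = rotate_180(shape_b)
    merged = []
    for row_a, row_b in zip(shape_a, rotated_b):
        if any(a and b for a, b in zip(row_a, row_b)):
            return None
        merged.append([a | b for a, b in zip(row_a, row_b)])
    return merged
```

When a merge succeeds, two object blocks are replaced by one, so the object block count of the luminance block drops and a different VLC table would apply.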