1. Field of the Invention
The present invention generally relates to object-shape coding apparatuses that code a binary image representing an object shape in units of rectangular blocks into which the binary image is divided, and particularly relates to an object-shape coding apparatus for coding rectangular blocks that include both pixels of the interior of the object shape and pixels of the exterior of the object shape.
2. Description of the Related Art
In recent years, interest has been high in object-based coding schemes such as ISO/IEC 14496-2: “Information Technology-Generic Coding of Audio-Visual Objects-Part 2: Visual.” The object-based coding divides an original image into the images of objects, such as people in the foreground and objects in the background, and codes each object image separately. The object-based coding can achieve a higher coding efficiency than coding schemes based on the coding of image frame units such as the MPEG-2 video coding standard (ISO/IEC 13818-2: “Information Technology-Generic Coding of Moving Pictures and Associated Audio Information: Video”). Further, use of object-based coding provides a basis for composing a video by combining objects.
An object image consists of texture images and object-shape data. In the object-based coding, therefore, both the texture coding and the shape coding are performed. Shape data includes binary data of shape information that only represents shape, and further includes multi-level data of shape information that represents object transparency. The present invention relates to the binary data of shape information.
In the following, related-art methods for binary shape coding will be described.
There are two types of methods for representing object shapes. One is to use a bit pattern image that has binary values representing whether pixels are inside or outside the object boundary, and the other is to show only the object boundaries. Accordingly, object-based coding apparatuses can also be classified into two groups, one for coding binary bit pattern images and the other for coding contour data.
Methods for coding binary bit pattern images encode binary information by following the order of image scanning. Typical coding methods include the JBIG standard (ISO/IEC 11544: “Progressive Bi-level Compression”) and the MMR (modified modified read) coding standard (ITU-T T.6: “Facsimile Coding Schemes and Coding Control Functions for Group 4 Facsimile Apparatus”). The JBIG standard encodes binary data in a hierarchical manner by following the order of image scanning. The MMR standard encodes the positions where binary pixels change in value, also by following the order of image scanning. Both of these coding methods are lossless.
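The idea of recording the positions where pixel values change along a scan line may be illustrated by the following minimal sketch. It is not the MMR code of ITU-T T.6, which additionally refers to the preceding line and uses code tables; the function names and the assumption that each line notionally starts from the value 0 are chosen only for illustration.

```python
def change_positions(row, initial=0):
    """Return the column indices at which the pixel value differs from the previous pixel."""
    positions = []
    previous = initial  # assumed value "before" the first pixel of the line
    for x, pixel in enumerate(row):
        if pixel != previous:
            positions.append(x)
            previous = pixel
    return positions


def rebuild_row(positions, width, initial=0):
    """Reconstruct the scan line from its change positions (lossless)."""
    changes = set(positions)
    value = initial
    row = []
    for x in range(width):
        if x in changes:
            value = 1 - value  # binary pixels: toggle at every change position
        row.append(value)
    return row


row = [0, 0, 1, 1, 1, 0, 0, 1]
positions = change_positions(row)
restored = rebuild_row(positions, len(row))
```

For the line shown, three change positions suffice to restore all eight pixels exactly.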
Methods for coding contour information encode the contour by following the order of the points that make up the contour. Such methods include one that encodes the directions of the points that constitute the contour, and one that reversibly encodes the coordinates of the points that constitute the contour. Among these, a chain coding scheme (Makoto Nagao, “Digital Image Processing,” Kindaikagaku, pp.384-385, 1987) assigns integers 1 through 8 to the directions of the connections between the points that constitute the contour, and performs reversible coding. Further, there is a method that carries out hierarchical coding by using the chain coding scheme (Tohru Kaneko, “Hierarchical Coding Scheme for Line Drawings Described by Chain Code Series,” The Transactions of the Institute of Electronics, Information and Communication Engineers, Vol. J69-D, No. 5, 1986).
Further, methods for coding contour information include a method that approximates the contour by using spline functions (Myron Flickner, et al., “Periodic Quasi-Orthogonal Spline Bases and Applications to Least-Squares Curve Fitting of Digital Images,” IEEE Transactions on Image Processing, vol. 5, No. 1, pp. 71-88, January 1996), and also include a method using Wavelet descriptors (George Muller, et al., “Progressive Transmission of Line Drawings Using the Wavelet Transform,” IEEE Transactions on Image Processing, vol. 5, No. 4, pp. 666-672, April 1996). Also included is a method that uses Wavelet descriptors for contour direction vectors (Japanese Patent Laid-open Application No. 11-255420).
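A chain coding scheme of this kind may be sketched as follows. The particular assignment of the integers 1 through 8 to the eight neighbor directions is an assumption made for illustration, as are the function names; the cited reference may use a different assignment.

```python
# Hypothetical assignment of codes 1..8 to the eight neighbor steps (dx, dy).
STEP_TO_CODE = {
    (1, 0): 1, (1, 1): 2, (0, 1): 3, (-1, 1): 4,
    (-1, 0): 5, (-1, -1): 6, (0, -1): 7, (1, -1): 8,
}
CODE_TO_STEP = {code: step for step, code in STEP_TO_CODE.items()}


def chain_encode(points):
    """Encode a contour as its start point plus one direction code per connection."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        codes.append(STEP_TO_CODE[(x1 - x0, y1 - y0)])
    return points[0], codes


def chain_decode(start, codes):
    """Recover the contour points from the start point and the direction codes."""
    points = [start]
    for code in codes:
        dx, dy = CODE_TO_STEP[code]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


contour = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2), (0, 1), (0, 0)]
start, codes = chain_encode(contour)
restored = chain_decode(start, codes)
```

Since each connection between adjacent contour points is one of eight possibilities, a code of three bits per point is sufficient after the start point, and decoding restores the contour without loss.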
All the binary shape coding methods as described above encode object shapes by the unit of one frame.
In general, texture coding is conducted in units of rectangular blocks after an original image is divided into a plurality of rectangular blocks. Of the texture information within a given rectangular block, only the portion that corresponds to the area of the object defined by the shape data is useful. In order to keep consistency between the texture coding and the shape coding, some shape coding schemes divide an image into a plurality of rectangular blocks, and perform block-specific coding.
The binary shape coding of the MPEG-4 standard divides a binary shape image into a plurality of rectangular blocks (macro blocks) of 16×16 pixels where the binary shape image is comprised of shape interior pixels and shape exterior pixels, and attends to coding on the block-specific basis. The MPEG-4 standard is applicable to intra-frame coding as well as inter-frame coding. In the following, the intra-frame coding will be described.
In the intra-frame coding, a coding mode is selected based on the conditions of the rectangular block, i.e., based on whether all the pixels of the rectangular block are those of the shape interior, whether all the pixels are those of the shape exterior, and whether the shape interior pixels and the shape exterior pixels are both present inside the rectangular block. When all the pixels are shape interior pixels, or are shape exterior pixels, only the coding mode is transferred, without coding of each pixel. When the shape interior pixels and the shape exterior pixels are both present, a coded word is assigned to each pixel through arithmetic coding.
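The mode selection described above may be sketched as follows. The mode names, the function names, and the convention that 1 denotes a shape interior pixel and 0 a shape exterior pixel are assumptions made for illustration, and are not the identifiers of the MPEG-4 standard.

```python
def classify_block(block):
    """Select a coding mode from the pixel content of one rectangular block."""
    pixels = [p for row in block for p in row]
    if all(p == 1 for p in pixels):
        return "ALL_INTERIOR"   # only the mode is transferred
    if all(p == 0 for p in pixels):
        return "ALL_EXTERIOR"   # only the mode is transferred
    return "BOUNDARY"           # pixels would be coded individually


def split_into_blocks(image, size=16):
    """Yield ((top, left), block) for each size x size block of a binary image."""
    height, width = len(image), len(image[0])
    for top in range(0, height, size):
        for left in range(0, width, size):
            yield (top, left), [r[left:left + size] for r in image[top:top + size]]


def block_modes(image, size=16):
    """Map the origin of each block to its selected coding mode."""
    return {origin: classify_block(b) for origin, b in split_into_blocks(image, size)}


# Small example with 2x2 blocks instead of 16x16 macro blocks.
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
modes = block_modes(image, size=2)
```

In this example only the lower left block contains both kinds of pixels, so it is the only block for which pixel-wise coding would be required.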
The arithmetic coding is a type of variable-length coding scheme that reduces the quantity of information by utilizing disparity of symbol occurrence probabilities. In this coding scheme, a probability line segment is segmented according to the probabilities of occurrences of a symbol series, and a binary fractional value indicative of a position within a segmented section is used as a code for the symbol series (Hiroshi Harashima, “Image Information Compression,” Ohm, pp. 153-161, 1992.7). In the arithmetic coding, segmentation of a probability line based on probabilities of occurrences of a symbol series can be made consecutively through arithmetic operations, which achieves a compression efficiency that is close to the entropy limit of the symbol series.
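The consecutive segmentation of the probability line segment may be illustrated by the following minimal sketch for binary symbols. It assumes a fixed probability p0 of the symbol 0 and exact rational arithmetic, both chosen for clarity; a practical coder, including that of the MPEG-4 standard, uses finite-precision arithmetic and probabilities that depend on neighboring pixels. The function names are hypothetical.

```python
from fractions import Fraction


def encode_interval(symbols, p0):
    """Narrow the section [low, low + width) of the probability line per symbol."""
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        split = width * p0          # part of the current section assigned to symbol 0
        if s == 0:
            width = split
        else:
            low += split
            width -= split
    return low, width


def decode_interval(value, count, p0):
    """Recover count symbols from any value inside the final section."""
    low, width = Fraction(0), Fraction(1)
    symbols = []
    for _ in range(count):
        split = width * p0
        if value < low + split:
            symbols.append(0)
            width = split
        else:
            symbols.append(1)
            low += split
            width -= split
    return symbols


symbols = [0, 0, 1, 0]
p0 = Fraction(3, 4)
low, width = encode_interval(symbols, p0)
decoded = decode_interval(low + width / 2, len(symbols), p0)
```

The width of the final section equals the product of the probabilities of the coded symbols, so a probable symbol series leaves a wide section that can be identified by a short binary fraction.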
The Huffman coding is known as a variable length coding scheme that reduces the quantity of information by utilizing inequality of symbol occurrence probabilities in the same manner as in the arithmetic coding (Hiroshi Yasuda, Hiroshi Watanabe, “Basics of Digital Image Compression,” Nikkei BP Publishing Center, pp. 32-35, 1996). In the Huffman coding, one coded word is assigned to one symbol. Since the Huffman coding only requires reading a coded word for a given symbol from the coded word table stored in memory, a coding apparatus can be implemented as a small size apparatus.
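The table-based nature of the Huffman coding may be illustrated by the following minimal sketch. The symbol frequencies and the function names are hypothetical; once the coded word table has been built, coding a symbol requires only a table lookup.

```python
import heapq


def build_code_table(frequencies):
    """Build a coded word table by repeatedly merging the two least frequent entries."""
    # Each heap entry: (weight, tie-breaker, {symbol: coded word so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w0, _, t0 = heapq.heappop(heap)
        w1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t0.items()}
        merged.update({s: "1" + c for s, c in t1.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]


def encode(symbols, table):
    """Code each symbol by reading its coded word from the table."""
    return "".join(table[s] for s in symbols)


def decode(bits, table):
    """Decode by accumulating bits until they match a coded word (prefix-free code)."""
    word_to_symbol = {c: s for s, c in table.items()}
    symbols, word = [], ""
    for bit in bits:
        word += bit
        if word in word_to_symbol:
            symbols.append(word_to_symbol[word])
            word = ""
    return symbols


table = build_code_table({"a": 5, "b": 2, "c": 1, "d": 1})
bits = encode("aabacd", table)
decoded = decode(bits, table)
```

Frequent symbols receive short coded words and rare symbols long ones, but each symbol always consumes at least one bit, which is one reason the arithmetic coding can come closer to the entropy limit.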
As described above, the MPEG-4 arithmetic coding has macro blocks of 16×16 pixels as input thereto, and performs consecutive segmentation of a probability line segment for 256 pixel symbols. In general, coding efficiency increases as the processing block becomes bigger, but an increase in the processing block size requires increased computation and increased memory. This is one of the factors that make it difficult to develop a real-time coding apparatus for an image of a large size such as an HDTV image.
In order to reduce the computation load and the memory volume, input data may be coded in units of a small data size, which suits the real-time processing performed by hardware. If the coding is performed in units of a small data size, however, correlation within the data cannot be fully utilized. In order to obviate this problem, it is desirable to provide a coding apparatus that can achieve efficient coding while avoiding an increase in the size of hardware for the code assigning process.
Accordingly, there is a need for an object-shape coding apparatus that can achieve efficient coding while avoiding an increase in the size of hardware for code assigning processing where the object-shape coding apparatus divides a binary image representing an object shape into a plurality of rectangular blocks, and encodes each of the rectangular blocks separately, including a rectangular block which includes both object interior pixels and object exterior pixels.