The present invention relates to a method of coding segmented pictures, or partitions, divided into homogeneous regions with which specific labels are respectively associated, and to a corresponding device. This invention mainly has applications in connection with the MPEG-4 standard, for the implementation of MPEG-4 encoders.
Conventional methods of coding segmented pictures, or partitions, are usually considered rather expensive in terms of amount of bits, e.g. an average value of 1.3 bits per pixel of contour with the so-called chain code technique. Lossy shape coding techniques have therefore been proposed, but losses in shape information often result in an unacceptable degradation of the subjective quality of the image displayed on the basis of the decoded picture.
Quasi-lossless shape coding techniques (and therefore, by extension, quasi-lossless partition coding techniques) have then been proposed. For instance, the so-called multiple grid chain code approach (MGCC), described in the document "Multiple grid chain coding of binary shapes", by P. Nunes, F. Ferreira and F. Marqués, Proceedings of International Conference on Image Processing, vol. III, pp. 114-117, Oct. 26-29, 1997, Santa Barbara, Calif., USA, allows efficient encoding of the binary shape information of video objects in the context of object-based video coding schemes. Said approach also allows general partition coding, while introducing very few and controlled losses, restricted to isolated picture elements (pixels) belonging to the boundaries of the regions of each partition.
This conventional MGCC approach relies on a contour representation of the partition. As shown in FIG. 1, which depicts a small general partition of size N×M with, for example, three regions with which respective labels (in the present case, represented by the grey, black and white circles) are associated, each pixel has four associated contour elements. FIG. 2 shows the same partition as depicted in FIG. 1, but with the indication of the specific contour elements that define the changes between every pair of neighbouring pixels that do not belong to the same region. FIG. 3 shows, in a corresponding matrix of (2N+1)×(2M+1) sites, the implementation of both grids: the one associated with the pixel sites (the circles) and the one related to the contour sites (the line segments). The contour elements that are situated between pixels of different labels are considered as active.
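By way of illustration only, the construction of the combined grid of FIG. 3 can be sketched as follows. The function name `contour_grid`, the placement of pixel (i, j) at site (2i+1, 2j+1), the reading of N as the number of rows, and the choice to leave the elements on the picture frame inactive are assumptions of this sketch and are not taken from the cited document.

```python
def contour_grid(labels):
    """Return a (2N+1) x (2M+1) matrix whose entries are 1 at active contour sites.

    labels: list of N rows, each holding M region labels.
    Assumed convention: pixel (i, j) occupies site (2i+1, 2j+1), and the
    contour element between two 4-neighbour pixels occupies the site
    halfway between them. Elements on the picture frame stay inactive.
    """
    n, m = len(labels), len(labels[0])
    grid = [[0] * (2 * m + 1) for _ in range(2 * n + 1)]
    for i in range(n):
        for j in range(m):
            # Vertical contour element between pixel (i, j) and pixel (i, j+1).
            if j + 1 < m and labels[i][j] != labels[i][j + 1]:
                grid[2 * i + 1][2 * j + 2] = 1
            # Horizontal contour element between pixel (i, j) and pixel (i+1, j).
            if i + 1 < n and labels[i][j] != labels[i + 1][j]:
                grid[2 * i + 2][2 * j + 1] = 1
    return grid


# A 2 x 3 partition with three regions labelled 0, 1 and 2.
partition = [[0, 0, 1],
             [0, 2, 1]]
grid = contour_grid(partition)
```

For this small partition the matrix has 5×7 sites, and an element is marked active only where two neighbouring pixels carry different labels.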
In turn, an element of the contour grid may have up to six active neighbours, as illustrated in FIG. 4, and, for that reason, the contour grid is usually referred to as a hexagonal grid. A way to code the partition information in the contour grid is to select an initial point in the grid and to track the active sites up to the end of the corresponding contour. This method performs a lossless coding of the partition information, by encoding the movement from a current contour element to the following neighbouring contour element (only three movements are possible: straight, right, left).
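The lossless element-by-element coding described above may be sketched as follows. This is a simplified model and not the coder of the cited documents: the contour is assumed to be given as a list of absolute headings (N, E, S, W), one per contour element, and the names `encode_moves` and `decode_moves` are hypothetical.

```python
HEADINGS = ["N", "E", "S", "W"]  # listed in clockwise order
TURNS = {0: "straight", 1: "right", 3: "left"}  # heading change, modulo 4


def encode_moves(headings):
    """Turn absolute headings into relative moves (straight, right, left)."""
    moves = []
    for prev, cur in zip(headings, headings[1:]):
        delta = (HEADINGS.index(cur) - HEADINGS.index(prev)) % 4
        if delta == 2:
            raise ValueError("a tracked contour cannot reverse direction")
        moves.append(TURNS[delta])
    return moves


def decode_moves(first_heading, moves):
    """Rebuild the absolute headings from the first heading and the moves."""
    steps = {name: delta for delta, name in TURNS.items()}
    headings = [first_heading]
    for move in moves:
        idx = (HEADINGS.index(headings[-1]) + steps[move]) % 4
        headings.append(HEADINGS[idx])
    return headings


path = ["E", "E", "S", "S", "W", "N"]
moves = encode_moves(path)
restored = decode_moves(path[0], moves)
```

Since `decode_moves` recovers the original headings exactly from the first heading and the chain of moves, no contour information is lost in this mode.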
Another contour tracking method is to move through the contour using larger steps, only the contour elements linked by such larger steps being encoded: in the above-cited document, the described MGCC technique uses basic cells of 3×3 pixels, such as the one illustrated in FIG. 5, where all contour and pixel sites in the cell are shown. For compression efficiency reasons, two types of cells are then used: counter-clockwise, as in FIG. 6, or clockwise, as in FIG. 7. The way to index the different contour elements in each type of cell is presented: the initial contour site is denoted by the symbol 0 and the other ones by the symbols 1 to 7, the symbol 1 being assigned to the site which is on the same side of the cell as the symbol 0. Consequently, to characterize a cell, three parameters are necessary: its initial contour site, its type (clockwise or counter-clockwise), and its orientation (east or west for a cell with a horizontal initial contour element, north or south for a vertical one). The coding algorithm selects between these two types of cell in order to maximize the number of contour elements coded per cell.
The MGCC technique uses said indexing as well as its possible rotations. Starting from the input element of the cell indexed with a 0 in FIG. 6, any output element from the set (1, 2, . . . , 6, 7) can be reached, but the way to go through the cell is not uniquely defined by the output element: as shown in the example of FIGS. 8 and 9, a movement (in this case, from 0 to 4) may indeed correspond to two different contour configurations. The contour elements inside the cell (8, 9, 10, 11), which are not coded, introduce ambiguity in the coding process, two different sets of contour sites (0, 8, 9, 4) or (0, 11, 10, 4) being possible. This ambiguity introduces coding losses. Nevertheless, the only possible error is the erroneous labeling of the central pixel of the cell, i.e. of an isolated boundary pixel only.
In a contour tracking process, several cells are then linked up in order to complete the contour. To link two cells, the output contour site of the current cell becomes the input contour site of the following cell, as FIG. 10 shows. When the boundary of a single region is coded with the MGCC approach, the bitstream thus generated will therefore contain: (a) the position of the initial contour site, and (b) the chain of symbols representing the movements performed in order to track the contour.
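The linking of cells and the two components of the resulting bitstream may be sketched as follows. The cell geometry of FIGS. 5 to 10 is not reproduced: the callable `next_move`, which returns the symbol coded in the current cell together with its output site, is a hypothetical stand-in for it, and the names `ContourStream` and `track_contour`, as well as the site coordinates in the example, are arbitrary placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class ContourStream:
    """What is stored for one contour: (a) its initial site, (b) its symbols."""
    initial_site: tuple
    symbols: list = field(default_factory=list)


def track_contour(initial_site, next_move):
    """Chain cells until the contour closes or no further move exists.

    next_move(site) is assumed to return (symbol, output_site) for the
    cell entered at `site`, or None when the contour ends there.
    """
    stream = ContourStream(initial_site)
    site = initial_site
    while True:
        move = next_move(site)
        if move is None:
            break
        symbol, output_site = move
        stream.symbols.append(symbol)
        # The output site of the current cell is the input site of the next one.
        site = output_site
        if site == initial_site:  # the contour has closed on itself
            break
    return stream


# Toy lookup table standing in for the real cell geometry.
toy_cells = {(0, 1): (4, (2, 3)), (2, 3): (7, (4, 3)), (4, 3): (2, (0, 1))}
stream = track_contour((0, 1), toy_cells.get)
```

Only the initial site and the list of symbols are kept, which mirrors items (a) and (b) of the bitstream described above.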
In the contour tracking process, the position of the basic cell must be changed progressively through the grid. The proposed technique is described for instance in the article "Encoding of line drawings with a multiple grid chain code", by T. Minami and K. Shinohara, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-8, no. 2, Mar. 1986, pp. 269-276. The basic principle is explained with reference to FIGS. 11 to 15.
As indicated in FIG. 11, which gives an example of coding with a fixed grid, a contour segment has here been coded by means of the symbol 4 in the first cell (a counter-clockwise one). If cells of the same grid are used, the symbol 7 (FIG. 12) is used in the second cell for tracking the contour (the symbol 0 always being the new initial, or input, site), and the symbol 2 (FIG. 13) in the third one, these second and third cells being in that case clockwise ones. Three cells are therefore needed to track the contour from the first initial site. By contrast, only two are needed if the grid (i.e., in fact, the position of the center of the cell) is changed: as shown in FIGS. 14 and 15, which illustrate a modification of the grid with respect to the example of FIGS. 11 to 13, only a second cell is now needed to reach the same output site.
This solution of FIGS. 14 and 15 leads to a more compact representation of the contour. However, three different classes of grids are then necessary to define the shift of each grid with respect to the origin of the corresponding cell (called G0) before said shift. These three classes G1, G2, G3 are defined, as indicated in the following classification table, by the position (x, y) of the pixel which is the origin of each type of cell, with respect to the pixel which is the origin of said corresponding cell:
In the example of FIG. 10, a cell of the class G2 was used (with respect to the corresponding current cell G0); in another example illustrated in FIG. 16, a cell of the class G1 is used (with respect to the corresponding current cell G0).
However, the MGCC approach previously described can only be used to code binary partitions. In case of general segmented pictures, the partitions contain regions that share contours, and said approach is no longer appropriate.
It is therefore an object of the invention to propose an improved method also using the type of contour grid previously defined, but leading to a more general and more efficient coding of the contour segments of the regions of a picture.
To this end, the invention relates to a method such as presented in the preamble of the description and which is moreover characterized in that it comprises, for each successive partition, the steps of:
(a) translating the picture of labels into a description in terms of a contour element chain in which the elements are defined by means of their movements through successive basic cells, between an input point and an output point of said cells;
(b) tracking inside each successive cell each contour from its initial contour point, previously extracted, to its end by storing chain symbols corresponding both to input, internal and output contour elements of said cell and to priorities between possible multiple output elements;
(c) repeating these steps up to the end of each successive contour segment of the concerned partition;
(d) coding the information corresponding in each cell to the initial point of each contour segment and to the associated chain of movements between that initial point and the initial one of the following cell; said successive steps defining a so-called intra-mode coding process of image partitions.
In this method, the concept of a triple point at the intersection of two different contours is introduced, which makes it possible to locate the initial point (or starting contour site) of a new contour (not yet coded) with respect to a previous one, and, by introducing a new symbol in the coding chain for these triple points, a more efficient coding is obtained: while the MGCC approach tracks the contour by chaining cells, triple points will be replaced, in the present case, by the concept of cells with multiple outputs.
With respect to this basic method, two main extensions are possible, the first one to the case of scalable partition sequence coding and the second one to the case of inter-mode partition sequence coding.
In the first one of said improved embodiments of the invention, the method is moreover characterized in that each current partition is divided into a basic layer and at least one enhancement layer, said intra-mode coding process being then successively applied to the basic layer without any change and to the enhancement layer with the following modifications:
in the operation of extraction of the initial contour points, the initial points of contour segments are associated with contour points from the basic layer;
in the tracking step, all the points belonging to the basic layer are withdrawn;
in the repeating step, the closing of a contour before processing the following one is associated with contour points of the basic and enhancement layers.
In the second one of said improved embodiments, said coding method is characterized in that each current partition is divided into a first part of the partition, including some regions that need to be coded in intra-mode, and a second part of the partition, corresponding to the other regions to be motion compensated, said intra-mode coding process being then applied to said first part playing the role of the basic layer and an associated inter-mode coding process being correspondingly applied to said second part playing the role of the enhancement layer.