The present invention concerns, in general terms, digital signal coding and proposes to this end a device and method for coding a digital signal by decomposition into frequency sub-bands of the signal, and segmentation of certain frequency sub-bands. It also concerns the transmission of the coded signal and also a decoding method and device corresponding to the coding device and method.
The purpose of coding is to compress the signal, thus making it possible to transmit the digital signal or to store it by reducing the quantity of binary symbols necessary for representing it. The coding can be without loss, that is to say it keeps all the information contained in the digital signal, or on the other hand with loss, that is to say some information contained in the digital signal may be degraded.
The present invention is applicable in each of the above two types of digital signal coding. Hereinafter, the coding of digital images or video sequences will be dealt with more particularly. A video sequence is defined as a succession of digital images. The invention is particularly adapted to the storage of images in databases and to their transmission over a network to a number of distant items of equipment.
It is known that a digital signal can be decomposed into frequency sub-bands before compressing it. Decomposition consists of creating, from the digital signal, a set of sub-bands each containing a limited frequency spectrum. The sub-bands can be of different resolutions, the resolution of a sub-band being the number of samples per unit length used for representing this sub-band. In the case of a digital image signal, a frequency sub-band of this signal can be considered to be an image, that is to say a bidimensional table of digital values.
It should be noted that the decomposition of a signal into frequency sub-bands creates no compression in itself, but makes it possible to decorrelate the signal so as to eliminate the redundancy existing in the digital image prior to the compression proper. The sub-bands are then compressed more effectively than the original signal.
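By way of illustration only, a single-level decomposition of a small image into four sub-bands may be sketched as follows. The averaging and differencing filter used here (a Haar-type filter), the function names and the sample image are assumptions chosen for brevity; the text above does not prescribe any particular filter. The sketch shows that the four sub-bands together contain exactly as many coefficients as the image, so that the decomposition in itself creates no compression, and that the image can be recovered from them.

```python
def haar_decompose(image):
    """Split an image (list of rows, even height and width) into four sub-bands."""
    low, horiz, vert, diag = [], [], [], []
    for r in range(0, len(image), 2):
        low_row, horiz_row, vert_row, diag_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            low_row.append((a + b + d + e) / 4)    # local average: low frequencies
            horiz_row.append((a - b + d - e) / 4)  # variation along a row
            vert_row.append((a + b - d - e) / 4)   # variation along a column
            diag_row.append((a - b - d + e) / 4)   # diagonal variation
        low.append(low_row)
        horiz.append(horiz_row)
        vert.append(vert_row)
        diag.append(diag_row)
    return low, horiz, vert, diag


def haar_synthesise(low, horiz, vert, diag):
    """Inverse of haar_decompose: rebuild the image from its four sub-bands."""
    image = []
    for r in range(len(low)):
        top, bottom = [], []
        for c in range(len(low[0])):
            s, h, v, g = low[r][c], horiz[r][c], vert[r][c], diag[r][c]
            top += [s + h + v + g, s - h + v - g]
            bottom += [s + h - v - g, s - h - v + g]
        image += [top, bottom]
    return image


image = [[10, 10, 20, 20],
         [10, 10, 20, 20],
         [30, 30, 40, 40],
         [30, 30, 40, 44]]
sub_bands = haar_decompose(image)
coefficient_count = sum(len(row) for band in sub_bands for row in band)
```

In this example the low-frequency sub-band is a half-resolution version of the image, whilst the three detail sub-bands are zero everywhere except where the image varies locally, which is what makes the subsequent compression more effective.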
Conventionally, the coding of a digital signal, in this case of a digital image, includes three steps. The image is first of all decomposed by a transformation into frequency sub-bands, the coefficients thereof are quantised as indices and finally these indices are coded by means of an entropic coding without loss.
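The second and third of these steps may be sketched as follows, under stated assumptions: a uniform scalar quantiser with a hypothetical step of 4 stands in for the quantisation, and the empirical entropy of the indices is computed as an indication of the rate that a lossless entropic coder could approach. No actual entropic coder is implemented, and all names and values are illustrative.

```python
import math
from collections import Counter


def quantise(coefficients, step):
    """Map each coefficient to the index of the nearest multiple of `step`."""
    return [int(round(c / step)) for c in coefficients]


def dequantise(indices, step):
    """Approximate reconstruction of the coefficients from their indices."""
    return [i * step for i in indices]


def entropy_bits_per_symbol(indices):
    """Empirical entropy of the indices, in bits per index."""
    counts = Counter(indices)
    total = len(indices)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


# Hypothetical coefficients of a sub-band, listed in scanning order.
coefficients = [0.2, 3.9, -4.1, 8.0, 0.4, -0.3, 0.1, 4.2]
indices = quantise(coefficients, step=4)
rate = entropy_bits_per_symbol(indices)
```

The quantisation is the only step of the three that loses information; the entropic coding of the indices is without loss.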
This type of compression makes it possible to obtain a relatively high degree of compression of the signal but does not make it possible to access the content of the image. In other words, the decomposition of the signal remains purely of the frequency type, and gives no information about the objects which may be contained in the image. Object means an entity of the image corresponding to a semantic unit, for example the face of a person. An object can comprise one or several regions of the image. In the following, the notions of object and region will be considered as equivalent.
Coding using a decomposition into sub-bands of the signal is by nature progressive by sub-band, and therefore allows transmission of the coded data which is progressive by sub-band.
There also exist other image compression techniques based on the segmentation thereof. In this context, the image is considered to consist of objects with two dimensions. Segmentation is a low-level process whose purpose is to effect a partitioning of the image into a certain number of subelements referred to as regions. The partitioning is such that the regions are separate and combining them forms the image. The regions correspond or do not correspond to objects in the image, the term object referring to an item of information of a semantic nature. Very often, however, an object corresponds to a region or set of regions. Each region can be represented by an item of information representing its shape, colour or texture.
Conventionally, a method of compressing a digital image based on a segmentation includes a first so-called marking step, that is to say the interior of the regions having local homogeneity is extracted from the image. Next, a decision step precisely defines the contours of the areas containing homogeneous data. At the end of this step, each pixel of the image is associated with a label identifying the region to which it belongs. The set of all the labels of all the pixels is conventionally referred to as a segmentation map. Finally, in such a coding, the last step consists of coding the segmentation map, generally in the form of contours of the regions, and pertinent parameters representing the interior of the regions, such as the texture and the colour.
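The notion of a segmentation map may be illustrated by the following sketch. It does not reproduce the marking and decision steps described above; as a simplifying assumption, it grows each region from a seed pixel over the 4-connected neighbours whose value differs from that of the seed by at most a hypothetical tolerance. The result is a table of the same size as the image in which each pixel carries the label of its region.

```python
from collections import deque


def segment(image, tolerance):
    """Return a segmentation map: one region label per pixel."""
    rows, cols = len(image), len(image[0])
    labels = [[None] * cols for _ in range(rows)]
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] is not None:
                continue  # pixel already belongs to a region
            seed = image[r0][c0]
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] is None
                            and abs(image[nr][nc] - seed) <= tolerance):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    return labels


image = [[10, 11, 50],
         [10, 12, 51],
         [90, 90, 52]]
segmentation_map = segment(image, tolerance=5)
```

The regions obtained are separate and their union forms the image, as required of a partitioning.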
This type of technique makes it possible, for a given image, to obtain a higher degree of compression than with the technique previously described. This is because, with segmentation, the compression can be effected selectively on the objects or regions judged to be the most important, to the detriment of the others. Thus, for a given degree of compression, that is to say for a number of binary elements allowed, a precise object (typically the face of a person in an image of the “head and shoulders” type) can be coded precisely using a maximum number of bits, to the detriment of the background, which for its part will be coded with a minimum number of bits.
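The selective distribution of a fixed number of bits between regions may be sketched as follows. The proportional rule, the integer importance weights and the region names are assumptions made purely for illustration; the text above does not specify how the number of bits given to each region is determined.

```python
def allocate_bits(total_bits, weights):
    """Share `total_bits` between regions in proportion to integer weights."""
    total_weight = sum(weights.values())
    shares = {name: total_bits * weight // total_weight
              for name, weight in weights.items()}
    # Bits lost to integer rounding go to the most important region.
    most_important = max(weights, key=weights.get)
    shares[most_important] += total_bits - sum(shares.values())
    return shares


# Hypothetical budget for a "head and shoulders" image.
budget = allocate_bits(1000, {"face": 3, "background": 1})
```

With these values the face receives three quarters of the bits allowed, and the total allocated always equals the number of bits allowed.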
Segmentation allows progressive coding by regions, and consequently transmission of the coded data which is progressive by regions.
This type of technique, however, does not make it possible to have multiresolution information as permitted by the methods with decomposition into sub-bands.
Other techniques combine the two compression methods described above, such as for example the standard known as MPEG4 (from “Moving Picture Experts Group”), which is currently being converted into an ISO/IEC standard. In the MPEG4 coder, more particularly in the case of the coding of fixed images, the decomposition of the image into frequency sub-bands is used conjointly with a segmentation of the image. A step prior to the coder (not standardised) is responsible for isolating the objects in the image (Video objects) and representing each of these objects by a mask. In the case of a binary mask, the spatial support of the mask has the same size as the original image and a point on the mask at the value 1 (or respectively 0) indicates that the pixel at the same position in the image belongs to the object (or respectively is outside the object).
For each object, the mask is then transmitted to a shape decoder whilst the texture for each object is decomposed into sub-bands, and the sub-bands are next transmitted to a texture decoder.
This method has a certain number of drawbacks. This is because it is necessary to code the mask for each object at its highest resolution level (in Version 1 of MPEG4) or in certain cases at two resolution levels (Version 2 of MPEG4). For a given degree of compression, this impairs the quality of the reconstructed image, since it is necessary to reserve part of the available bit rate for the masks of the objects, which are at a high resolution. Moreover, the number of objects handled is a priori the same at all levels, whilst it may be more advantageous to have a number of objects increasing with the (spatial) resolution, that is to say a true conjoint scalability between the resolution and the number of objects.
The present invention aims to remedy the drawbacks of the prior art, by providing a method and device for compressing a digital signal which offer a high compression ratio whilst allowing progressive transmission of the content of the image, both in resolution and by objects.
To this end, the invention proposes a method of coding a set of data representing physical quantities, characterised in that it includes the steps of:
decomposing the set of data into a plurality of frequency sub-bands on at least one resolution level,
coding the sub-bands,
then, for each resolution level,
segmenting at least one sub-band into at least two homogeneous regions, in order to form a segmentation map,
ordering the regions according to a predetermined criterion,
ordering the coding data of the sub-bands as a function of the order of the regions.
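The last two steps of the method may be sketched as follows for one resolution level. Several assumptions are made for the purposes of illustration: the predetermined criterion is taken to be the area of the region (largest first), the coding data of a sub-band are represented simply by its coefficients, the sub-bands have the same size as the segmentation map, and the sub-band names are hypothetical.

```python
from collections import Counter


def order_regions(segmentation_map):
    """Order region labels by decreasing area (assumed criterion)."""
    sizes = Counter(label for row in segmentation_map for label in row)
    return sorted(sizes, key=lambda label: (-sizes[label], label))


def order_coding_data(sub_bands, segmentation_map):
    """Arrange the data of every sub-band region by region, in region order."""
    stream = []
    for label in order_regions(segmentation_map):
        for name in sorted(sub_bands):
            band = sub_bands[name]
            data = [band[r][c]
                    for r, row in enumerate(segmentation_map)
                    for c, pixel_label in enumerate(row)
                    if pixel_label == label]
            stream.append((label, name, data))
    return stream


segmentation_map = [[0, 1],
                    [1, 1]]
sub_bands = {"LH": [[1, 2], [3, 4]],
             "HL": [[5, 6], [7, 8]]}
stream = order_coding_data(sub_bands, segmentation_map)
```

Region 1, being the larger, comes first in the ordered data, so that a receiver which stops reading early has all the sub-band data of that region before any data of region 0.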
Correlatively, the invention concerns a device for coding a set of data representing physical quantities, characterised in that it has:
means of decomposing the set of data into a plurality of frequency sub-bands on at least one resolution level,
means of coding the sub-bands,
means of segmenting, for each resolution level, at least one sub-band into at least two homogeneous regions, in order to form a segmentation map,
means of ordering the regions according to a predetermined criterion,
means of ordering the coding data of the sub-bands as a function of the order of the regions.
By virtue of the invention, the compression ratio is high. This is because the decomposition into sub-bands makes it possible to decorrelate the signal, and the segmentation on a particular sub-band is thus more effective. In addition, it is possible to effect a progressive transmission on the content of the image, both in resolution and by object.
According to a preferred characteristic, the invention includes the coding of the segmentation map of at least one resolution level, preferably the lowest resolution level in the decomposition.
This coded segmentation map is attached to the coded data, and is then used at the time of decoding for ordering the data.
According to a preferred characteristic, the sub-bands are coded and then decoded prior to segmentation. In the case of coding with loss, the coding and subsequent decoding are effected on the same data. Thus, the segmentation is performed on the same data and the results are identical on coding and on decoding.
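The benefit of segmenting decoded data may be illustrated by the following sketch, in which a uniform quantiser of hypothetical step 8 stands in for the coding with loss and a simple threshold stands in for the segmentation; both are assumptions. The coder segments the sub-band as the decoder will see it, and the two segmentation maps therefore coincide, which would not be the case if the coder segmented the original sub-band.

```python
def code(band, step):
    """Lossy coding, reduced here to uniform quantisation."""
    return [[int(round(v / step)) for v in row] for row in band]


def decode(indices, step):
    """Decoding: reconstruct approximate coefficient values."""
    return [[i * step for i in row] for row in indices]


def segment(band, threshold):
    """Two-region segmentation by thresholding (illustrative only)."""
    return [[1 if v >= threshold else 0 for v in row] for row in band]


band = [[7, 13],
        [30, 31]]
coded = code(band, step=8)

# Coder side: decode locally, then segment the decoded sub-band.
coder_map = segment(decode(coded, step=8), threshold=8)
# Decoder side: only the coded data are available.
decoder_map = segment(decode(coded, step=8), threshold=8)
# For comparison: segmentation of the original, which the decoder cannot reproduce.
map_from_original = segment(band, threshold=8)
```

Here the coefficient 7 is reconstructed as 8 and changes region, so that a map computed from the original sub-band would differ from the one computed at decoding.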
According to another preferred characteristic, the segmentation is effected on the sub-band with the lowest frequency of the resolution level under consideration. This sub-band contains more information than the other sub-bands and allows more pertinent segmentation of the data. The segmentation map is smaller compared with a full-resolution segmentation. It is consequently faster to determine, and requires a reduced transmission rate if it is to be transmitted.
According to a preferred characteristic, the ordering criterion depends on an analysis of the segmentation. Thus the regions are ordered according to their importance.
According to a preferred characteristic, the invention also includes the transmission of the segmentation map determined at the lowest resolution level and the coding data of all the sub-bands, for all the resolution levels.
This is because, according to the invention, only the segmentation map of the lowest resolution level must be transmitted, whilst the other segmentation maps, necessary for the higher resolution levels if such exist, are calculated at the time of decoding.
The invention also concerns a method of decoding data representing physical quantities coded by the coding method according to the invention, characterised in that, for a given resolution level, it includes the steps of:
analysis of the segmentation in order to classify the regions according to a predetermined criterion,
decoding the coding data of the sub-bands of the resolution level under consideration as a function of the result of the previous step,
reconstructing the sub-bands,
synthesising the reconstructed sub-bands.
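The first three of these steps may be sketched as follows for a single sub-band. As assumptions made for illustration, the regions are classified by decreasing area, the coding data arrive as one list of coefficients per region in that order, and positions for which no data have yet been received are filled with zero. The synthesis step is not shown.

```python
from collections import Counter


def classify_regions(segmentation_map):
    """Analyse the segmentation: order labels by decreasing area (assumed)."""
    sizes = Counter(label for row in segmentation_map for label in row)
    return sorted(sizes, key=lambda label: (-sizes[label], label))


def reconstruct_sub_band(chunks, segmentation_map, fill=0):
    """Place each received chunk at the positions of its region."""
    rows, cols = len(segmentation_map), len(segmentation_map[0])
    band = [[fill] * cols for _ in range(rows)]
    # zip stops at the shorter sequence, so a truncated stream is accepted.
    for label, chunk in zip(classify_regions(segmentation_map), chunks):
        values = iter(chunk)
        for r in range(rows):
            for c in range(cols):
                if segmentation_map[r][c] == label:
                    band[r][c] = next(values)
    return band


segmentation_map = [[0, 1],
                    [1, 1]]
complete = reconstruct_sub_band([[6, 7, 8], [5]], segmentation_map)
partial = reconstruct_sub_band([[6, 7, 8]], segmentation_map)
```

The second call shows the progressive character of the decoding by region: when only the data of the first region have been received, that region is reconstructed in full whilst the other remains at the fill value.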
The invention also concerns a decoding device having means of implementing the above characteristics.
The decoding method and device make it possible to reconstruct the signal, for example in a receiving apparatus corresponding to a transmitting apparatus in which the signal was coded according to the invention.
Additionally, the invention allows a selective coding by object. To this end, the invention proposes a method of coding a set of data representing physical quantities, characterised in that it includes the steps of:
decomposing the set of data into a plurality of frequency sub-bands on at least two resolution levels,
then, for at least one resolution level,
segmenting at least one sub-band of the resolution level under consideration into at least two homogeneous regions,
ordering the regions according to a predetermined criterion,
coding by region the coefficients of the resolution level under consideration,
and, for at least one resolution level, except the highest resolution level,
decoding the coded coefficients of the resolution level under consideration, in order to form decoded sub-bands, and
synthesising the decoded sub-bands of the resolution level under consideration, on one resolution level, in a synthesised sub-band which will be considered at the following iteration.
Thus, the invention allows a selective coding by object. The quality is higher for some selected objects, and lower for some other objects, for a given “global” bit rate.
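One way of obtaining a quality which differs from one region to another may be sketched as follows. It is assumed, for illustration only, that the coding by region consists of a uniform quantisation whose step depends on the region; the steps, the labels and the sample values are hypothetical, and the iteration over resolution levels described above is not reproduced.

```python
def code_by_region(band, segmentation_map, steps):
    """Quantise each coefficient with the step of the region it belongs to."""
    return [[int(round(v / steps[label])) for v, label in zip(band_row, map_row)]
            for band_row, map_row in zip(band, segmentation_map)]


def decode_by_region(indices, segmentation_map, steps):
    """Reconstruct each coefficient using the step of its region."""
    return [[i * steps[label] for i, label in zip(index_row, map_row)]
            for index_row, map_row in zip(indices, segmentation_map)]


band = [[10, 11],
        [40, 43]]
segmentation_map = [[0, 0],
                    [1, 1]]
steps = {0: 1, 1: 8}  # region 0 is coded finely, region 1 coarsely

indices = code_by_region(band, segmentation_map, steps)
decoded = decode_by_region(indices, segmentation_map, steps)
```

Region 0 is reconstructed exactly, whereas region 1 is reconstructed with an error of up to 3, in exchange for smaller indices which require fewer bits.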
The invention also concerns a method for transmitting a set of data representing physical quantities, comprising the above coding method, and comprising the transmission of the coded coefficients.
The invention also proposes a device comprising means for implementing the above features.
The invention also concerns a digital apparatus including the coding or transmitting, respectively decoding, device, or means of implementing the coding or transmitting, respectively decoding, method. The advantages of the device and of the digital apparatus are identical to those disclosed above.
An information storage means, which can be read by a computer or microprocessor, integrated or not into the device, possibly removable, stores a program implementing the coding, or respectively decoding, method.