1. Field of the Invention
The present invention is directed to a method for encoding video signals, and in particular to a method for forming an information transformation matrix for an arbitrarily shaped image segment of a digital image using a computer.
2. Description of the Prior Art
The encoding of video signals according to image encoding standards such as H.261, H.263, MPEG-1 and MPEG-2 is often based on a block-oriented discrete cosine transformation (DCT). These block-oriented encoding methods, however, are not suitable for image encoding that is based not on rectangular blocks but on segments, wherein, for example, subjects in an image are segmented and the resulting segments are encoded. These latter methods are known as region-based (region-oriented) or subject-based (subject-oriented) image encoding methods. In these methods, the digital image is segmented according to the subjects occurring in the scene, and these segmented subjects are encoded separately, instead of encoding image blocks as in block-based image encoding methods. The encoding usually proceeds by modeling the segmented subjects and subsequently transmitting the modeling parameters of these segmented subjects.
After the transmission of the image information from a transmitter to a receiver, the individual subjects of the image are in turn reconstructed in the receiver on the basis of the transmitted modeling parameters.
One possibility for modeling the subjects is a series expansion of the image function in terms of a set of suitably selected basis functions. The modeling parameters then correspond to the expansion coefficients of the image function. Such a modeling of the image is the basis of transformation encoding. When individual, arbitrarily bounded subjects of the image are to be encoded, a transformation for segments with arbitrary, usually non-convex, bounds is required.
Two basic approaches have heretofore existed for such a transformation.
In the method that is described in M. Gilge, T. Engelhardt and R. Mehlan, Coding of arbitrarily shaped image segments based on a generalized orthogonal transform, Signal Processing: Image Communication 1, pp. 153-180, October 1989, the given image segment is first embedded into a circumscribing rectangle with the smallest possible dimensions. A discrete cosine transformation (DCT) that is completely specified by the basis functions of the transformation can be defined for this rectangle. In order to match this transformation to the segment shape, the basis functions defined on the rectangle are successively orthogonalized with respect to the shape of the segment. The resulting orthogonal, shape-dependent basis functions then form the segment-matched transformation that is sought.
One disadvantage of this approach is that a large computing capacity and a large amount of memory are needed for the implementation of this method. Further, this known method has the disadvantage that no reliable statements can be made about the suitability of the resultant transformation for data compression, since the transformation depends essentially on the orthogonalization sequence, and thus on the specific implementation.
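The orthogonalization of rectangle-defined basis functions with respect to a segment shape, and the dependence of the result on the orthogonalization sequence, can be illustrated with a minimal sketch. It assumes a Gram-Schmidt procedure restricted to the picture elements of the segment; the exact procedure of the cited article is not reproduced here, and the names dct_basis and orthogonalize_on_segment as well as the 3x3 example mask are hypothetical.

```python
import math

def dct_basis(n_rows, n_cols):
    """Unnormalized 2-D DCT-II basis functions on an n_rows x n_cols rectangle,
    listed in row-major frequency order (u = vertical, v = horizontal)."""
    basis = []
    for u in range(n_rows):
        for v in range(n_cols):
            basis.append([[math.cos((2 * y + 1) * u * math.pi / (2 * n_rows)) *
                           math.cos((2 * x + 1) * v * math.pi / (2 * n_cols))
                           for x in range(n_cols)] for y in range(n_rows)])
    return basis

def orthogonalize_on_segment(basis, mask, tol=1e-9):
    """Gram-Schmidt over the picture elements where mask is 1 (assumed procedure).
    Functions that become linearly dependent on the segment are skipped."""
    pixels = [(y, x) for y, row in enumerate(mask) for x, m in enumerate(row) if m]
    ortho = []
    for func in basis:
        vec = [func[y][x] for y, x in pixels]
        for q in ortho:  # remove the components along already accepted functions
            dot = sum(a * b for a, b in zip(vec, q))
            vec = [a - dot * b for a, b in zip(vec, q)]
        norm = math.sqrt(sum(a * a for a in vec))
        if norm > tol:
            ortho.append([a / norm for a in vec])
        if len(ortho) == len(pixels):  # the segment is fully spanned
            break
    return ortho

# Hypothetical non-rectangular segment inside a 3x3 circumscribing rectangle.
mask = [[1, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]
forward = orthogonalize_on_segment(dct_basis(3, 3), mask)
backward = orthogonalize_on_segment(dct_basis(3, 3)[::-1], mask)
```

In this sketch both orderings yield six orthonormal functions on the six-element segment, but the functions themselves differ: the first function is constant in one case and derived from the highest-frequency DCT function in the other. This is the sequence dependence noted above.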
T. Sikora and Bela Makai, Shape-adaptive DCT for generic coding of video, IEEE Trans. Circuits and Systems for Video Technology 5, pp. 59-62, February 1995 describes a method wherein the given image segment is transformed separately by rows and columns. To that end, all rows of the image segment are first left-justified and are successively subjected to a one-dimensional horizontal transformation whose length equals the number of picture elements in the respective row. The resultant coefficients are subsequently transformed a second time in the vertical direction.
This method has the disadvantage that the correlations of the brightness values of the picture elements (similarities of the picture elements) cannot be fully exploited because of the resorting of the picture elements.
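The row-and-column procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the cited article: it assumes an orthonormal one-dimensional DCT-II for each variable-length transform, and the names dct_1d and shape_adaptive_dct are hypothetical.

```python
import math

def dct_1d(values):
    """Orthonormal 1-D DCT-II of a list of arbitrary length."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(v * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                               for i, v in enumerate(values)))
    return out

def shape_adaptive_dct(image, mask):
    """Transform the picture elements selected by mask, rows first, then columns.
    Returns the coefficients as a list of columns of varying length."""
    # Step 1: left-justify each row (keep only segment elements) and apply a
    # horizontal transform whose length equals the number of elements in the row.
    rows = []
    for img_row, mask_row in zip(image, mask):
        pixels = [p for p, m in zip(img_row, mask_row) if m]
        if pixels:
            rows.append(dct_1d(pixels))
    # Step 2: gather the k-th coefficient of every row that has one and
    # transform each such column vertically.
    width = max((len(r) for r in rows), default=0)
    columns = []
    for k in range(width):
        column = [r[k] for r in rows if len(r) > k]
        columns.append(dct_1d(column))
    return columns

# Hypothetical 3x3 image with a six-element segment.
image = [[5, 1, 9],
         [2, 7, 4],
         [3, 8, 6]]
mask = [[0, 1, 1],
        [1, 1, 1],
        [0, 1, 0]]
coefficients = shape_adaptive_dct(image, mask)
```

The sketch produces exactly as many coefficients as the segment has picture elements. Because the rows are shifted before the vertical pass, elements that end up in the same column are not necessarily vertical neighbors in the original image, which is the resorting referred to above.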
To improve this method known from Sikora et al., T. Sikora, S. Bauer and Bela Makai, Efficiency of shape-adaptive 2-D transforms for coding of arbitrary shaped image segments, IEEE Trans. Circuits and Systems for Video Technology 5, pp. 254-258, June 1995 describe a method wherein a transformation for convex image segment shapes, adapted to a simple image model, is implemented. This method, however, allows only image segment shapes that exhibit no interruptions (holes) upon traversal of rows or columns.
A considerable disadvantage that underlies both known approaches is that the energy concentration in the coefficients turns out lower than it would be if all linear correlations were optimally exploited. This is caused by the unfavorably selected basis functions in the method known from Gilge et al., by the resorting of the picture elements in the first-discussed Sikora et al. article, and by the limitation to convex image regions in the second-discussed Sikora et al. article.
As a consequence thereof, the described, known methods do not achieve the best possible image quality at a given data rate as measured by the signal-to-noise ratio.
Further, various possibilities are known for determining the eigenvectors of a covariance matrix, for example from W. H. Press, S. Teukolsky and W. Vetterling, Numerical Recipes in Pascal, Cambridge University Press, pp. 375-389, 1992.
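One classical way of determining the eigenvectors of a symmetric matrix such as a covariance matrix is the cyclic Jacobi method, sketched below. The sketch is illustrative only and makes no claim about which variant the cited pages present; the function name jacobi_eigen and the example covariance matrix (a first-order correlation model with coefficient 0.9) are hypothetical.

```python
import math

def jacobi_eigen(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues and eigenvectors of a symmetric matrix by cyclic Jacobi rotations.
    Returns (eigenvalue, eigenvector) pairs sorted by descending eigenvalue."""
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:  # the matrix is numerically diagonal
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q], restricted to |theta| <= pi/4.
                num, den = 2.0 * a[p][q], a[q][q] - a[p][p]
                if den < 0:
                    num, den = -num, -den
                theta = 0.5 * math.atan2(num, den)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (update columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (update rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate the rotations: V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    pairs = [(a[i][i], [v[k][i] for k in range(n)]) for i in range(n)]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)

# Hypothetical covariance matrix of three neighboring picture elements.
covariance = [[1.00, 0.90, 0.81],
              [0.90, 1.00, 0.90],
              [0.81, 0.90, 1.00]]
eigenpairs = jacobi_eigen(covariance)
```

The eigenvectors returned by such a routine are mutually orthogonal, and the one belonging to the largest eigenvalue captures the largest share of the signal variance.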
Various known image transformation methods are described in J.-R. Ohm, Digitale Bildcodierung, Springer-Verlag, Berlin, ISBN 3-540-58579-6, pp. 46-51 and pp. 72-77, 1995.
The use of the Karhunen-Loeve transformation is known from Bow, Pattern Recognition, M. Dekker Inc., 1984, pp. 213-217. The use of principal axis transformation is known from Ernst, Einführung in die digitale Bildverarbeitung, Franzis-Verlag, 1991, pp. 250-252.