Digital or digitalized images are formed by small two- or three-dimensional dots having a certain appearance on the screen of a display or on a printed copy.
Each digital or digitalized image is formed by an array of such dots, which are called pixels in two-dimensional images and voxels in three-dimensional images.
The appearance of each pixel or voxel can be described by means of physical variables, each of which is a value that is transformed into a certain visual effect by a display screen or when printed on paper.
In black and white images, such as grey-scale images, the different levels of grey are univocally associated with the intensity value of the beams reflected or generated by the part of an imaged body for which a pixel of the digital or digitalized image has been acquired. The intensity of each grey level of the grey scale is univocally related to a physical parameter of the beam reflected or emitted by the imaged body, particularly to its intensity. The beams can be of any physical nature, such as electromagnetic radiation in any spectral band, acoustic radiation, or any other kind of beam that can be generated, reflected or diffused by the imaged material.
In color images, normally three different values are used for univocally defining the appearance of each pixel. Different systems are known, such as the so-called HSV (Hue, Saturation, Value) or RGB (Red, Green, Blue) systems. These systems are equivalent and can each be used for univocally describing the appearance of the pixels or voxels by means of values.
Arrays of pixels or voxels defining digital or digitalized images can thus be represented by two- or three-dimensional numerical matrices which univocally represent the image.
Methods for coding the pixels or voxels of two- or three-dimensional images in the form of a vector are well known and use this two- or three-dimensional numerical representation.
Document EP 1,345,154 discloses a method for coding the pixels or voxels of digital or digitalized images which uses the numerical matrices representing the pixels or voxels of the digital or digitalized image for generating a vector representing each pixel or voxel of the image.
In this document, for each pixel or voxel of the digital image, which is considered as a target pixel or voxel to be coded, a certain surrounding window is defined which consists of the target pixel or voxel and a certain number of pixels or voxels surrounding it. This window is a sub-array of pixels or voxels and is represented by the corresponding sub-matrix of the matrix of numerical values univocally representing the pixels or voxels of the digital image.
The vector comprises as its components the numerical values which describe the target pixel or voxel and the surrounding pixels or voxels of the window.
So, for example, considering a grey-scale image where the value representing each pixel is its intensity, and defining a window corresponding to a pixel sub-array of 3×3 pixels or a voxel sub-array of 3×3×3 voxels, the vector comprises respectively 9 or 27 components. Considering a color image, at least three values are needed for describing each pixel or voxel, so the vector has 3×3×3=27 components for a two-dimensional image and 3×3×3×3=81 components for a three-dimensional image.
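As an illustration of the coding scheme just described, the following sketch (with hypothetical intensity values) builds the 9-component vector for a target pixel of a grey-scale image from its 3×3 window:

```python
import numpy as np

# Hypothetical 5x5 grey-scale image; each value is a pixel intensity.
image = np.array([
    [10, 10, 10, 50, 50],
    [10, 20, 30, 50, 50],
    [10, 30, 90, 80, 50],
    [10, 20, 80, 70, 50],
    [10, 10, 10, 50, 50],
])

def code_pixel(img, row, col, half=1):
    """Code the target pixel at (row, col) as the vector whose components
    are the numerical values of its (2*half+1) x (2*half+1) window."""
    window = img[row - half:row + half + 1, col - half:col + half + 1]
    return window.flatten()  # a 3x3 window yields 9 components

vector = code_pixel(image, 2, 2)
# The target pixel's own value (90) sits at the middle component, index 4.
```

For a color image each window position would contribute three values instead of one, giving the 27- or 81-component vectors mentioned above.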
When a larger pixel or voxel window is used, the number of components increases dramatically.
Considering now a fixed image area, the resolution of the digital image is given by the number of pixels or voxels per unit of image area. So, by increasing the resolution, a given image area comprises more pixels.
Thus, when this coding method is used, for example, for processing the image digitally, a great number of numerical data has to be processed, requiring high computational power and long computation times.
In any case, the known methods give surprisingly good results, particularly, for example, in the fields of image enhancement and image pattern recognition. The above coding method is based on the idea that the meaning of each pixel or voxel of an image, with reference to the quality or feature of the part of the imaged body represented by that pixel or voxel, depends mainly on the spatial relation of the numerical data of the pixel or voxel to those of the surrounding pixels or voxels.
In the field of digital image processing this principle has been applied for obtaining several different results.
It is known, for example, to use the eigenvalues of the matrices of numerical data representing a window comprising a target pixel or voxel of an image for somehow representing the target pixel or voxel, or certain relationships of the target pixel or voxel relative to the other pixels or voxels of the window.
Furthermore, some image processing operators have been developed for recognizing edges or corners in digital images in the so-called image pattern recognition methods.
These operators typically work as summarized above, by defining each pixel or voxel of a digital image as a target pixel or voxel, by further defining a pixel or voxel window of generic size n×m (typically n=m) comprising the target pixel or voxel and a certain number of surrounding pixels or voxels, and by applying a certain transformation to the matrix of numerical values representing each pixel or voxel window.
Document “Neural networks for robot image feature classification, a comparative study”, Neural Networks for Signal Processing IV: Proceedings of the 1994 IEEE Workshop, Ermioni, Greece, 6-8 Sep. 1994, IEEE, New York, N.Y., USA, by Sharma V. R. Madiraju et al., discloses a feature extractor which is trained to identify features such as lines, curves, junctions or other geometrical shapes in images.
The feature extractor is based on a set of feature models generated so as to include a model for each of a wide variety of edge types. This is consistent with the aim of the technique, which is to enable a robot to recognize shapes and imaged objects.
The models are 3×3 pixel windows centered on a center pixel, which is the pixel of interest of a digital image. In order to describe the features in a rotationally invariant way, a feature descriptor is used, namely the eigenspace of the covariance matrix corresponding to the 3×3 pixel window.
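The cited document does not spell out the construction; as one plausible, hedged reading, the sketch below takes the eigenvalues of the intensity-weighted covariance matrix of the pixel coordinates within the 3×3 window. These eigenvalues are indeed invariant under rotation of the window, which is the property the descriptor relies on:

```python
import numpy as np

def window_covariance_eigenvalues(window):
    """Eigenvalues of the intensity-weighted covariance matrix of the
    pixel coordinates in a window; one plausible reading (an assumption,
    not taken verbatim from the cited paper) of a rotation-invariant
    feature descriptor for a 3x3 window."""
    n = window.shape[0]
    ys, xs = np.mgrid[0:n, 0:n]
    w = window.astype(float)
    w = w / w.sum()                            # intensities as weights
    my, mx = (w * ys).sum(), (w * xs).sum()    # weighted centroid
    cyy = (w * (ys - my) ** 2).sum()
    cxx = (w * (xs - mx) ** 2).sum()
    cxy = (w * (ys - my) * (xs - mx)).sum()
    cov = np.array([[cyy, cxy], [cxy, cxx]])
    return np.sort(np.linalg.eigvalsh(cov))[::-1]  # descending order

# A line through the window: all intensity mass lies along one column,
# giving one large eigenvalue and one (near-)zero eigenvalue.
edge = np.array([[0, 9, 0], [0, 9, 0], [0, 9, 0]], float)
lam = window_covariance_eigenvalues(edge)
```

Rotating the line by 90 degrees (transposing the window) leaves the eigenvalue pair unchanged, which illustrates the rotational invariance claimed for such descriptors.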
As clearly appears, models of this kind are a sort of filter aimed at identifying the geometrical structures present in the image by determining whether a pixel of the image is part of such a geometric structure or not. The geometrical structures so identified may be used for recognizing the imaged object from the shapes identified in the image. The decision whether a pixel is part of a geometric structure or shape, such as an edge, a corner, a curve or the like, is made by using an artificial neural network. The result given by the neural network is merely the feature of the pixel, limited to whether the pixel is part of an edge, a corner, a line, a curve or another geometrical structure. No information is obtained relating to the quality or feature of the part of the real imaged object which is represented by the pixel in the image. The processing according to the above-identified document is thus limited to mere “pictorial” features.
So, for example, the application of these methods to edge detection uses the so-called gradient matrix, defined in more detail in the following description. The use of gradient matrices is known, for example, from Introductory Techniques for 3-D Computer Vision, E. Trucco and A. Verri, Prentice Hall, 1998.
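As a hedged sketch of such an edge detection operator (illustrating the classical technique, not the coding method of EP 1,345,154), the following code sums the gradient (structure-tensor) matrix over a window and inspects its eigenvalues; one dominant eigenvalue indicates an edge, two indicate a corner, none a flat region:

```python
import numpy as np

def gradient_matrix_eigenvalues(img, row, col, half=1):
    """Eigenvalues of the 2x2 gradient (structure-tensor) matrix
    accumulated over the window centred on (row, col)."""
    # Central-difference image gradients along rows (y) and columns (x).
    iy, ix = np.gradient(img.astype(float))
    wy = iy[row - half:row + half + 1, col - half:col + half + 1]
    wx = ix[row - half:row + half + 1, col - half:col + half + 1]
    c = np.array([[(wx * wx).sum(), (wx * wy).sum()],
                  [(wx * wy).sum(), (wy * wy).sum()]])
    return np.sort(np.linalg.eigvalsh(c))[::-1]  # descending order

# Hypothetical vertical step edge: a strong horizontal gradient only,
# so one eigenvalue is large and the other is zero.
img = np.array([[0, 0, 1, 1]] * 5, float) * 10
lam = gradient_matrix_eigenvalues(img, 2, 1)
```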
Another operator, the so-called Hessian matrix, which corresponds to the second derivative of the original matrix of numerical data describing the pixel or voxel window, is used, by means of its eigenvalue description, as an image processing operator, for example for enhancing the salient features of image detail (Jiri Hladuvka, Andreas König, and Eduard Gröller, “Exploiting Eigenvalues of the Hessian Matrix for Volume Decimation”, in Vaclav Skala, editor, 9th International Conference in Central Europe on Computer Graphics, Visualization, and Computer Vision (WSCG 2001), pages 124-129, 2001).
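A minimal sketch of this operator, estimating the 2×2 Hessian at a pixel by finite differences and taking its eigenvalues; the image and the difference stencil are illustrative assumptions, not taken from the cited paper:

```python
import numpy as np

def hessian_eigenvalues(img, row, col):
    """Eigenvalues of the 2x2 Hessian (second-derivative) matrix at a
    pixel, estimated by second-order central finite differences."""
    i = img.astype(float)
    iyy = i[row + 1, col] - 2 * i[row, col] + i[row - 1, col]
    ixx = i[row, col + 1] - 2 * i[row, col] + i[row, col - 1]
    ixy = (i[row + 1, col + 1] - i[row + 1, col - 1]
           - i[row - 1, col + 1] + i[row - 1, col - 1]) / 4.0
    h = np.array([[iyy, ixy], [ixy, ixx]])
    return np.sort(np.linalg.eigvalsh(h))  # ascending order

# Hypothetical bright ridge along a column: strong negative curvature
# across the ridge, no curvature along it.
img = np.array([[0, 10, 0]] * 3, float)
lam = hessian_eigenvalues(img, 1, 1)
```

The pattern of one strongly negative and one near-zero eigenvalue is what such operators use to enhance line-like detail.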
Differently from the image processing method disclosed in EP 1,345,154, which, due to its special way of encoding the pixels of an image, can provide a classification of the features of a part of an imaged object represented by a pixel in the image, the edge detection methods limit their function to classifying the pixels of an image in relation to a certain geometrical structure to which the pixel belongs or which the pixel represents in the image. Thus the edge detection method and similar methods are neither able nor directed to classify a pixel of an image in order to obtain information or a prediction about the quality or feature of the part of the imaged real object which is represented by the pixel in the image. The use of the eigenvalues of the covariance matrix, or of other parameters of other functions of the matrix of parameters related to the pixels of a window, describes only a certain model of a geometrical structure to which the pixels belong. Considering instead the method disclosed in document EP 1,345,154, it appears clearly that the aim is to obtain information on a quality or a feature of a part of a real object, which part is represented by a certain pixel in an image of the real object, by processing the parameters describing the appearance of that pixel in an image representing the object. The current edge detection techniques do not deal with this technical problem, nor are the models used by these methods even suspected to be able to help in carrying out the above classification task.
In the case of a diagonalizable (2D) matrix, the eigenvalues are representative of the matrix and its properties. For example, the rank, which is one of the most important properties of a (2D) matrix, is characterized by the eigenvalues: in fact, for diagonalizable (2D) matrices, the number of non-zero eigenvalues is equal to the rank.
Gradient and Hessian matrices are, in particular, diagonalizable (2D) matrices, so we can characterize them by means of their eigenvalues.
This is in general not true for other (2D) matrices. By means of the present invention we can overcome this problem by considering the singular values of the (2D) matrix (D. Bini, M. Capovani, O. Menchi, “Metodi numerici per l'algebra lineare”, Zanichelli, Italy). In fact singular values are representative of the (2D) matrix, even if the matrix is not diagonalizable. For example, the number of non-zero singular values is equal to the rank for every (2D) matrix.
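The difference can be seen on a simple non-diagonalizable example, here a nilpotent Jordan block whose eigenvalues are all zero although its rank is 1:

```python
import numpy as np

# A non-diagonalizable 2x2 matrix (nilpotent Jordan block): its only
# eigenvalue is 0, so the eigenvalues say nothing about its rank.
a = np.array([[0.0, 1.0],
              [0.0, 0.0]])

eigenvalues = np.linalg.eigvals(a)                    # both zero
singular_values = np.linalg.svd(a, compute_uv=False)  # one is non-zero

# The number of non-zero singular values equals the rank for ANY matrix,
# which is why singular values remain representative where eigenvalues fail.
rank = int((singular_values > 1e-12).sum())
```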
A generalization exists for 3D matrices: in fact, for a generic M×N×K 3D matrix, it is possible to find N+M+K generalized singular values characterizing the matrix (A multilinear singular value decomposition, Lieven De Lathauwer, Bart De Moor, Joos Vandewalle, SIAM Journal on Matrix Analysis and Applications, Volume 21, Number 4, pp. 1253-1278).
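A hedged sketch of these generalized (mode-n) singular values, computed from the SVD of each mode unfolding of a 3D array as in the multilinear SVD of De Lathauwer et al.; the function name and unfolding convention are illustrative choices:

```python
import numpy as np

def mode_singular_values(t):
    """Mode-n singular values of a 3D array, from the SVD of each mode
    unfolding: an M x N x K array yields M + N + K values in total."""
    vals = []
    for mode in range(3):
        # Unfold: bring `mode` to the front and flatten the other axes.
        unfolded = np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)
        vals.append(np.linalg.svd(unfolded, compute_uv=False))
    return vals

# A hypothetical 2 x 3 x 4 array gives 2 + 3 + 4 = 9 mode singular values.
t = np.arange(24, dtype=float).reshape(2, 3, 4)
sv = mode_singular_values(t)
```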
Other processing methods are used for treating image data; inter alia, let us recall methods such as the wavelet transforms, the autocorrelation transforms and the co-occurrence matrix transforms.
The wavelet transform is typically used for image compression.
The wavelet transform allows an image data array to be represented by a set of basis functions. The use of a subset of the basis functions allows a reduction of the number of parameters which carry the relevant image information. Thus a compression of the image data can be obtained without a significant loss of salient features.
The wavelet transform is typically calculated on a window having dimension 2^n×2^n; the wavelet transform of a window of any other size can be calculated at the expense of a loss of information at the window boundary. In order to characterize a single pixel by means of a wavelet transform, we can construct four windows around the target pixel, these four windows having the target pixel respectively at the bottom left corner, at the bottom right corner, at the top left corner and at the top right corner, each window having dimension 2^n×2^n. We can thus code the target pixel by using one or more of the coefficients of the wavelet transforms of these four numerical matrices.
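The four-window construction can be sketched as follows, using a single-level 2D Haar transform as a minimal stand-in for a full wavelet transform (window size 2×2, i.e. n=1, chosen for brevity; the helper names are illustrative):

```python
import numpy as np

def haar2d(w):
    """One level of the 2D Haar wavelet transform of a 2^n x 2^n block:
    returns the average (LL) and the horizontal, vertical and diagonal
    detail coefficients, each of half the block size."""
    a = (w[0::2, 0::2] + w[0::2, 1::2] + w[1::2, 0::2] + w[1::2, 1::2]) / 4
    h = (w[0::2, 0::2] - w[0::2, 1::2] + w[1::2, 0::2] - w[1::2, 1::2]) / 4
    v = (w[0::2, 0::2] + w[0::2, 1::2] - w[1::2, 0::2] - w[1::2, 1::2]) / 4
    d = (w[0::2, 0::2] - w[0::2, 1::2] - w[1::2, 0::2] + w[1::2, 1::2]) / 4
    return a, h, v, d

def four_corner_windows(img, row, col, size=2):
    """The four size x size windows having the target pixel respectively
    at the bottom right, bottom left, top right and top left corner."""
    return [img[row - size + 1:row + 1, col - size + 1:col + 1],
            img[row - size + 1:row + 1, col:col + size],
            img[row:row + size, col - size + 1:col + 1],
            img[row:row + size, col:col + size]]

# Hypothetical 5x5 image; the target pixel is at (2, 2). The code of the
# target pixel collects coefficients from all four window transforms.
img = np.arange(25, dtype=float).reshape(5, 5)
code = [haar2d(w)[0].ravel() for w in four_corner_windows(img, 2, 2)]
```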
The autocorrelation and the co-occurrence transforms of the image data provide a set of parameters which are somehow representative of the image texture information.
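As a minimal sketch of one of these texture transforms, the following code builds a grey-level co-occurrence matrix for a horizontal displacement; the toy image and the chosen displacement are illustrative assumptions:

```python
import numpy as np

def cooccurrence(img, levels, dy, dx):
    """Grey-level co-occurrence matrix for displacement (dy, dx): entry
    (i, j) counts how often grey level i has grey level j at that offset."""
    c = np.zeros((levels, levels), dtype=int)
    rows, cols = img.shape
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                c[img[y, x], img[y2, x2]] += 1
    return c

# Hypothetical two-level image; count horizontally adjacent pairs.
img = np.array([[0, 0, 1],
                [0, 1, 1],
                [1, 1, 0]])
c = cooccurrence(img, levels=2, dy=0, dx=1)
```

Statistics of this matrix (contrast, homogeneity, entropy and the like) are the texture parameters alluded to above.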
Nevertheless, all the known coding methods are limited by the fact that pixels or voxels are always coded only by using the numeric values of the pixels or voxels of the neighborhood defined by a pixel or voxel window comprising the target pixel or voxel to be coded.