1. Field of the Invention
The present invention relates to a method for segmentation of a digital picture consisting of a multiplicity of single picture elements which is in particular suitable for digital picture processing and/or object and pattern recognition.
2. Description of the Prior Art
In most fields of application of digital picture processing and pattern recognition it is necessary to recognize and classify pertinent structures in a picture consisting of a multiplicity of single picture elements. An important step therefore consists in combining pertinent, contiguous picture elements into coherent picture segments. This technique is referred to as segmentation. Through their shape and texture properties the picture segments thereby found offer substantially more information for a subsequent classification than single picture elements, with this information being strongly dependent on the quality of segmentation.
The essential criteria for the segmentation of picture elements are contiguity and so-called homogeneity criteria, which determine whether or not a picture segment, following the combination of picture elements, is judged to be homogeneous. A homogeneity criterion thus determines whether or not a combination is performed. A homogeneity criterion may, for example, define specific texture properties or a homogeneity of features.
One particular difficulty in the segmentation of a picture is caused by the texture of picture regions, which occurs in very many pictures to widely varying degrees. Characteristically, almost all object types are more or less textured. Besides other properties, for example coarseness or grain, such a texture is essentially characterized by a higher or lower degree of distribution of features, for example a gray value distribution or color distribution. Such distributions of feature values may be greater or smaller for various objects and may also overlap.
For example in FIG. 17 a case is represented wherein altogether six picture objects each presenting a differently sized range of gray values ranging on a scale between 0 and 256 are shown, wherein the respective range of gray values of the picture objects represented above partly or entirely overlaps with the one of the picture objects represented below.
This causes considerable difficulties in segmentation because, on the one hand, textured picture objects, which in part present very different feature values, are to be segmented in their entirety, whereas on the other hand segmentation is as a general rule performed on the basis of similarities of features. Where the differences of features within a picture object are too large, segmentation based on similarity of features becomes problematic. At the same time it is generally difficult to separate two contiguous picture objects having overlapping feature distributions.
In many segmentation methods of the prior art the similarity or pertinence of picture elements and picture segments is determined, for example, by means of so-called one- or multi-dimensional threshold techniques, wherein the feature space defined by the features of the picture elements is classified into partial ranges. The homogeneity criterion is satisfied in these known methods whenever the picture elements are contained in the same partial range. Thus, for example, the 256 gradations of gray in which a digitized picture is frequently present may be classified into 16 regions each having 16 gray values. If two contiguous picture elements are in the same range of gray values, they are judged to be similar and are therefore segmented together. This, however, often resolves the difficulties described above only very inadequately. Even where certain objects are well described by the classification of the regions, this classification may fail altogether for other picture objects.
Other methods attempt to combine picture segments based on specific predetermined texture features, which leads to a texture-based segmentation. This works more or less well for specific textures in particular picture regions, but often worse in other picture regions. These methods at the same time require prior knowledge about the picture and its textures, and furthermore do not permit segmentation at any desired resolution.
Segmentation methods utilizing watershed transformation employ as the basis for segmentation a representation of the color gradients in the picture, begin segmentation in the most homogeneous picture regions, and successively expand the segments into more heterogeneous picture regions. In this way homogeneous picture regions are well segmentable, whereas uniformly heterogeneous picture regions are segmented with more difficulty. Simultaneous segmentation of homogeneous and heterogeneous picture regions, or of picture regions having different degrees of heterogeneity, can be performed only with great difficulty. In addition the method disregards the original color information.
Techniques performing pixel classification operate on the principle of determining, based on prior knowledge, intervals or distributions in the feature space which are known to be characteristic for particular object classes. In this case the examined picture elements are each allocated to the class with which they present the highest probability of pertinence in accordance with their vector in the feature space. The homogeneity criterion is in this case defined as pertinence to the same object class. Apart from the fact that much less information for classification is available about single picture elements than about picture segments, a particular difficulty is encountered in processing aerial and satellite pictures owing to the fact that the distribution of the features for particular object types depends very strongly on weather, time of day and season, as well as light conditions at the time when the picture was taken. Accordingly a suitable preliminary definition of typical distributions for particular object classes in the feature space is very difficult.
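By way of illustration only, the one-dimensional threshold technique described above may be sketched as follows. The function names and the Python rendering are assumptions made for this example; only the division of 256 gray values into 16 ranges of 16 values each is taken from the description. The sketch shows why such a fixed classification can fail: gray values 15 and 16 differ by one gradation yet fall into different ranges, whereas 16 and 31 differ by fifteen gradations and fall into the same range.

```python
def gray_bin(value, bin_width=16):
    """Return the index of the gray-value range that contains `value` (0..255)."""
    if not 0 <= value <= 255:
        raise ValueError("gray value out of range")
    return value // bin_width


def similar_by_threshold(a, b, bin_width=16):
    """Judge two contiguous picture elements similar if they share a range."""
    return gray_bin(a, bin_width) == gray_bin(b, bin_width)
```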
From the article J. C. Tilton: "HYBRID SEGMENTATION FOR EARTH REMOTE SENSING DATA ANALYSIS", IEEE International, vol. 1, pp. 703 to 705, there is known a method for segmentation of a digital picture which uses a combination of region growing and boundary detection. In a first step of this method edge boundaries are detected, and in a second step region growing is performed without allowing regions to grow past the edge boundaries defined by the boundary detection. However, this method has the disadvantage that the maximum size of the regions to be grown is the size defined by the edge boundaries. Thus, this method does not make it possible to grow regions having similar features past edge boundaries which are in fact not edges between different picture objects but result from textures or the like.
It is therefore one object of the present invention to provide a method for segmentation of a digital picture which ensures an excellent segmentation or object recognition even in cases where respective objects have overlapping feature ranges or are characterized by feature values which are liable to strongly vary under different conditions.
In accordance with a first aspect of the present invention there is provided a method for segmentation of a digital picture consisting of a multiplicity of single picture elements comprising (a) determining if one of one and several features relating to contiguous picture objects comprising picture elements and picture segments are conforming or not conforming based on a specific homogeneity criterion by means of referencing a predetermined tolerance for each feature as a termination criterion, within which feature values relating to the contiguous picture objects in question may differ; (b) if one of one feature and several features relating to the contiguous picture objects are determined to be conforming then merging the conforming picture objects; and (c) repeating the resulting segmentation until the resulting segmentation converges in a stable or approximately stable condition in which no further contiguous picture objects are determined to be conforming.
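Purely as a non-limiting sketch, steps (a) to (c) may be illustrated on a single row of gray values. The function name `segment_row`, the use of the difference of mean gray values as the compared feature, and the left-to-right processing order are assumptions of this example and are not prescribed by the method.

```python
def segment_row(values, tolerance):
    """Merge contiguous objects of a 1-D row until no pair conforms."""
    # Each picture element starts as its own picture object.
    segments = [[v] for v in values]
    changed = True
    while changed:                      # step (c): repeat until stable
        changed = False
        i = 0
        while i < len(segments) - 1:
            left, right = segments[i], segments[i + 1]
            mean_left = sum(left) / len(left)
            mean_right = sum(right) / len(right)
            # Step (a): assumed criterion, difference of means within tolerance.
            if abs(mean_left - mean_right) < tolerance:
                segments[i:i + 2] = [left + right]   # step (b): merge
                changed = True
            else:
                i += 1
    return segments
```

In this sketch the loop terminates once a complete pass finds no conforming pair, which corresponds to the stable condition named in step (c).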
In a preferred embodiment of the present invention a feature difference to be compared in the homogeneity criterion is determined via heterogeneity introduced by merging two picture objects by determining a difference $\Delta h_w$ between heterogeneities of respective picture objects weighted with the size of the respective picture objects after and before merging so that homogeneity is expressed by the formula
$$\Delta h_w = (n_1 + n_2)\,h_{new} - (n_1 h_1 + n_2 h_2) < \alpha$$
wherein $\alpha$ is the predetermined tolerance, $h_1$ and $h_2$ are heterogeneities of the respective picture objects, $n_1$ and $n_2$ are sizes of the respective picture objects and $h_{new}$ is the heterogeneity of a potentially newly formed picture object.
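The size-weighted criterion above may be sketched as follows, by way of example only. The function names are hypothetical; the arithmetic follows the formula term by term, with the heterogeneity measure itself left to the caller.

```python
def weighted_heterogeneity_increase(n1, h1, n2, h2, h_new):
    """Size-weighted heterogeneity after merging minus that before merging."""
    return (n1 + n2) * h_new - (n1 * h1 + n2 * h2)


def conforms(n1, h1, n2, h2, h_new, alpha):
    """True if the weighted increase stays below the tolerance alpha."""
    return weighted_heterogeneity_increase(n1, h1, n2, h2, h_new) < alpha
```

For instance, objects of sizes 4 and 6 with heterogeneities 2.0 and 3.0 and a merged heterogeneity of 3.5 give an increase of 10 x 3.5 - (8 + 18) = 9.0, so that merging would be permitted for a tolerance of 10 but not for a tolerance of 9.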
In a further preferred embodiment of the present invention the heterogeneity of the potentially newly formed picture object is defined as standard deviation of color mean values of the respective picture objects as expressed by the formula
$$\Delta h_w = \Delta\sigma_w = (n_1 + n_2)\,\sigma_{new} - (n_1 \sigma_1 + n_2 \sigma_2) < \alpha$$
wherein $\sigma_1$ and $\sigma_2$ are standard deviations of the respective picture objects and $\sigma_{new}$ is the standard deviation of the potentially newly formed picture object.
In a further preferred embodiment of the present invention a predetermined value for the standard deviation is used for small picture objects having a size of approximately one to five picture elements.
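The standard-deviation variant, together with the predetermined value for small picture objects, may be sketched as follows. The constants `SMALL_OBJECT_MAX_SIZE` and `DEFAULT_SIGMA` are hypothetical values chosen for this example, and the use of the population standard deviation of the gray values of the picture elements is likewise an assumption.

```python
from statistics import pstdev

SMALL_OBJECT_MAX_SIZE = 5   # assumption: "approximately one to five" elements
DEFAULT_SIGMA = 4.0         # assumption: hypothetical predetermined value


def sigma(values):
    """Standard deviation of an object, or the preset value if it is small."""
    if len(values) <= SMALL_OBJECT_MAX_SIZE:
        return DEFAULT_SIGMA
    return pstdev(values)


def delta_sigma_w(obj1, obj2):
    """Size-weighted increase in standard deviation caused by merging."""
    merged = obj1 + obj2
    before = len(obj1) * sigma(obj1) + len(obj2) * sigma(obj2)
    return len(merged) * sigma(merged) - before
```

A preset value avoids the situation in which very small objects, whose measured standard deviation is zero or unreliable, dominate the comparison.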
In a further preferred embodiment of the present invention the heterogeneity of the potentially newly formed picture object is defined as the variance of color mean values of the respective picture objects as expressed by the formula
$$\Delta h_w = \Delta\mathrm{var}_w = (n_1 + n_2)\,\mathrm{var}_{new} - (n_1 \mathrm{var}_1 + n_2 \mathrm{var}_2) < \alpha$$
wherein $\mathrm{var}_1$ and $\mathrm{var}_2$ are variances of the respective picture objects and $\mathrm{var}_{new}$ is the variance of the potentially newly formed picture object.
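As an aside offered for illustration, and under the assumptions that the feature is scalar, that the variances are population variances (division by the object size), and that $m_1$, $m_2$ and $m_{new}$ denote the mean feature values of the two objects and of the merged object, the variance-based difference reduces to a closed form that depends only on the sizes and the means:

```latex
\begin{align*}
m_{new} &= \frac{n_1 m_1 + n_2 m_2}{n_1 + n_2}, \\
(n_1+n_2)\,\mathrm{var}_{new}
  &= n_1\,\mathrm{var}_1 + n_1 (m_1 - m_{new})^2
   + n_2\,\mathrm{var}_2 + n_2 (m_2 - m_{new})^2, \\
\Delta \mathrm{var}_w
  &= n_1 (m_1 - m_{new})^2 + n_2 (m_2 - m_{new})^2
   = \frac{n_1 n_2}{n_1+n_2}\,(m_1 - m_2)^2 .
\end{align*}
```

For example, two uniform objects of six picture elements each with gray values 10 and 20 give $\Delta\mathrm{var}_w = (36/12)\cdot 100 = 300$, which agrees with the direct evaluation $12 \cdot 25 - 0 = 300$.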
In a further preferred embodiment of the present invention the heterogeneity of a potentially newly formed picture object is determined via a weighted difference of color mean values of the respective picture objects before and after merging as expressed by the formula
$$\Delta h_w = \Delta m_w = (n_1 + n_2)\,|m_{new}| - (n_1 |m_1| + n_2 |m_2|) < \alpha$$
wherein $m_1$ and $m_2$ are color mean values of the respective picture objects and $m_{new}$ is the color mean value of the potentially newly formed picture object.
In a further preferred embodiment of the present invention picture regions having continuous color transitions are combined and the heterogeneity of the potentially newly formed picture object is defined as one of the average distance and the average value of the squares of the distances of the color mean values of the respective picture objects depending on the dimensionality of the topical space in relation to a regression line, surface or hypersurface of the color mean values in the topical space or in relation to another function of approximation to the color mean values in the topical space.
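For a one-dimensional topical space, the regression-based heterogeneity may be sketched as follows. The function name and the restriction to a straight regression line fitted by ordinary least squares are assumptions of this example; surfaces, hypersurfaces and other approximation functions are not covered by it.

```python
def regression_heterogeneity(positions, values):
    """Average squared distance of `values` from their least-squares line."""
    n = len(positions)
    mean_x = sum(positions) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in positions)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(positions, values))
    slope = sxy / sxx if sxx else 0.0       # flat line if all positions coincide
    intercept = mean_y - slope * mean_x
    residuals = (y - (slope * x + intercept) for x, y in zip(positions, values))
    return sum(r ** 2 for r in residuals) / n
```

A linear ramp of color mean values, such as 10, 20, 30, 40, 50 at positions 0 to 4, has a variance of 200 but a regression heterogeneity of zero, so a region with a continuous color transition can be combined under this measure.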
In a further preferred embodiment of the present invention after step (c) the steps of (d) determining if a new tolerance is selected; and (e) if a new tolerance is selected then repeating the method for segmentation and returning to step (a) to thereby form a hierarchical structure of picture objects having different hierarchy planes, are performed.
In a further preferred embodiment of the present invention on a lowest hierarchy plane of the hierarchical structure picture elements are located, which are then at least on a next higher hierarchy plane of the hierarchical structure merged into over-picture objects, which may in turn be merged once or several times into over-picture objects on higher hierarchy planes of the hierarchical structure.
In a further preferred embodiment of the present invention the highest hierarchy plane of the hierarchical structure contains only one picture object.
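The formation of hierarchy planes by repeating the segmentation with newly selected tolerances may be sketched on a single row of gray values. The function names, the choice of successively larger tolerances and the difference of mean gray values as the compared feature are assumptions of this example.

```python
def merge_plane(objects, tolerance):
    """Merge contiguous objects (lists of gray values) until none conform."""
    objects = [list(o) for o in objects]    # duplicate the lower plane
    i = 0
    while i < len(objects) - 1:
        a, b = objects[i], objects[i + 1]
        if abs(sum(a) / len(a) - sum(b) / len(b)) < tolerance:
            objects[i:i + 2] = [a + b]
            i = max(i - 1, 0)   # re-check the enlarged object to its left
        else:
            i += 1
    return objects


def build_hierarchy(values, tolerances):
    """Return the planes, lowest first, one further plane per tolerance."""
    planes = [[[v] for v in values]]        # lowest plane: picture elements
    for tolerance in tolerances:
        planes.append(merge_plane(planes[-1], tolerance))
    return planes
```

With the values 10, 12, 30, 33, 80 and the tolerances 5, 30 and 100, the planes contain 5, 3, 2 and 1 picture objects respectively, the highest plane thus containing only one picture object.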
In a further preferred embodiment of the present invention a new hierarchy plane is introduced in the hierarchical structure by initially duplicating all picture objects on a next lower hierarchy plane and inserting these picture objects in the new hierarchy plane as respective over-picture objects of the picture objects on the next lower hierarchy plane, wherein the picture objects on the new hierarchy plane which are determined to be conforming are merged in such a manner that only picture objects are merged which do not have different over-picture objects on a next higher hierarchy plane.
In a further preferred embodiment of the present invention an order in which picture objects of a plane are processed is an order which ensures a maximum possible distance from already processed picture objects and is a pseudo-stochastic order wherein in multiple, repeated runs on a hierarchical plane within one run each picture object present at the beginning of the run is processed once at the most by merging.
In a further preferred embodiment of the present invention the method further comprises the steps of (f) determining if one of one and several features of already merged picture objects are still conforming or not still conforming based on the specific homogeneity criterion; and (g) if one of one feature and several features of the already merged picture objects are determined not to be still conforming then excluding the not still conforming picture objects.
In a further preferred embodiment of the present invention steps (f) and (g) are performed in addition to steps (a) and (b) in an arbitrary order or in parallel.
In a further preferred embodiment of the present invention the method further comprises the steps of (h) determining if a boundary picture object of already merged picture objects located at a boundary of the already merged picture objects satisfies the homogeneity criterion with the already merged picture objects as well as with one of one and several contiguous picture objects; and (i) if the homogeneity criterion is satisfied with the already merged picture objects as well as with the one of one and several contiguous picture objects, allocating the boundary picture object to the picture object with which the homogeneity criterion is satisfied best.
In a further preferred embodiment of the present invention steps (h) and (i) are performed in addition to at least one of steps (a) and (b) and steps (f) and (g) in an arbitrary order or in parallel.
In a further preferred embodiment of the present invention if the homogeneity criterion is satisfied with the already merged picture objects as well as with the one of one and several contiguous picture objects, feature distributions of the already merged picture objects and the one of one and several contiguous picture objects are calculated and based thereon, a pertinence of the boundary picture object is determined in such a manner that the boundary picture object is allocated to the one picture object wherein a feature value of the boundary picture object occurs most frequently or that the boundary picture object is allocated to a picture object in a probabilistic manner by calculating probabilities based on a frequency of occurrence of the feature value of the boundary picture object in the already merged picture objects and the one of one and several contiguous picture objects.
In a further preferred embodiment of the present invention for the feature distributions respective histograms of the features of the already merged picture objects and the one of one and several contiguous picture objects are referred to.
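The histogram-based allocation of a boundary picture object may be sketched as follows. The function names are hypothetical, and two further assumptions are made: frequencies are taken relative to the size of each candidate object, and a uniform distribution is returned where the feature value occurs in none of the candidates.

```python
from collections import Counter


def relative_frequencies(value, candidates):
    """Relative frequency of `value` in each candidate object's histogram."""
    return {name: Counter(vals)[value] / len(vals)
            for name, vals in candidates.items()}


def allocate_boundary_object(value, candidates):
    """Deterministic variant: the object in which `value` is most frequent."""
    freqs = relative_frequencies(value, candidates)
    return max(freqs, key=freqs.get)


def allocation_probabilities(value, candidates):
    """Probabilistic variant: probabilities proportional to the frequencies."""
    freqs = relative_frequencies(value, candidates)
    total = sum(freqs.values())
    if total == 0:                       # assumption: value unseen everywhere
        return {name: 1 / len(freqs) for name in freqs}
    return {name: f / total for name, f in freqs.items()}
```

Here `candidates` maps a hypothetical object name to the list of feature values of its picture elements; for the probabilistic variant the returned probabilities could then be sampled, for example with `random.choices`.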
In a further preferred embodiment of the present invention if a boundary picture object is re-grouped from one picture object into another picture object, coherence of the one picture object is examined and in case of non-coherence of the one picture object the one picture object is divided into corresponding coherent picture objects formed by re-grouping.
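The examination of coherence after re-grouping may be sketched as a search for connected components. The function name, the representation of a picture object as a set of (row, column) pairs and the use of planar (four-neighbor) vicinity are assumptions of this example.

```python
def coherent_parts(pixels):
    """Split a set of (row, col) picture elements into 4-connected parts."""
    remaining = set(pixels)
    parts = []
    while remaining:
        start = remaining.pop()
        part = {start}
        stack = [start]
        while stack:
            r, c = stack.pop()
            for neighbor in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    part.add(neighbor)
                    stack.append(neighbor)
        parts.append(part)
    return parts
```

If the picture object remaining after a boundary picture object has been removed yields more than one part, it is no longer coherent and each part may be treated as a separate picture object.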
In a further preferred embodiment of the present invention the digital picture comprises a plurality of single channels having a different information content and boundary picture objects are regrouped from one picture object into another picture object only if the pertinence to the other picture object averaged through all single channels is greater than the pertinence to the one picture object averaged through all single channels.
In a further preferred embodiment of the present invention in merging or boundary correction object-related various homogeneity criteria are employed in accordance with specific features of the picture objects comprising compactness, size, boundary roughness, linearity, gradient of color development, and their classification.
In a further preferred embodiment of the present invention merging is performed only if a feature difference defined in the homogeneity criterion for one of the picture objects is smallest in comparison with the contiguous picture objects and contained within the predetermined tolerance.
In a further preferred embodiment of the present invention merging is performed only if for two picture objects a feature difference defined in the homogeneity criterion is smallest in comparison with the other contiguous picture objects and is contained within the predetermined tolerance.
In a further preferred embodiment of the present invention merging is performed only if a feature difference defined in the homogeneity criterion is smallest in comparison with all other possible combinations of picture objects and is contained within the predetermined tolerance.
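The three selection rules set out above (best fit for one picture object, mutual best fit for two picture objects, and best fit among all possible combinations) may be contrasted in the following sketch. The function names, the `neighbors` mapping from each object to its contiguous objects and the caller-supplied `cost` function returning the feature difference are all assumptions of this example.

```python
def best_neighbor(obj, neighbors, cost, alpha):
    """Contiguous object with the smallest difference, if within tolerance."""
    best = min(neighbors[obj], key=lambda other: cost(obj, other), default=None)
    if best is not None and cost(obj, best) < alpha:
        return best
    return None


def mutual_best(obj, neighbors, cost, alpha):
    """As above, but only if `obj` is also the best fit of its partner."""
    best = best_neighbor(obj, neighbors, cost, alpha)
    if best is not None and best_neighbor(best, neighbors, cost, alpha) == obj:
        return best
    return None


def global_best(neighbors, cost, alpha):
    """Pair with the smallest difference among all contiguous pairs."""
    pairs = [(cost(a, b), a, b)
             for a in neighbors for b in neighbors[a] if a < b]
    if not pairs:
        return None
    smallest, a, b = min(pairs)
    return (a, b) if smallest < alpha else None
```

The rules become progressively stricter: the first permits a merge for each object having any conforming neighbor, whereas the last permits only one merge at a time for the whole picture.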
In a further preferred embodiment of the present invention picture objects are processed in a pseudo-stochastic order.
In a further preferred embodiment of the present invention picture objects are processed in an order which ensures maximal possible distance from already processed picture objects.
In a further preferred embodiment of the present invention several picture objects are processed simultaneously.
In a further preferred embodiment of the present invention the digital picture comprises a plurality of single channels having a different information content and picture objects are combined only if the homogeneity criterion referred to is satisfied for each one of the channels.
In a further preferred embodiment of the present invention the digital picture comprises a plurality of single channels having a different information content and wherein for the homogeneity criterion standard deviations, variances or mean values of color values of picture objects are added up or averaged through all channels, wherein the channels may be weighted differently.
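The two multi-channel variants may be sketched as follows, where `deltas` holds one feature difference per channel. The function names and the default of equal weights are assumptions of this example.

```python
def conforms_in_every_channel(deltas, alpha):
    """Strict variant: the criterion must hold for each channel separately."""
    return all(d < alpha for d in deltas)


def conforms_on_weighted_average(deltas, alpha, weights=None):
    """Averaged variant: channels are combined, optionally with weights."""
    if weights is None:
        weights = [1.0] * len(deltas)       # assumption: equal weighting
    average = sum(w * d for w, d in zip(weights, deltas)) / sum(weights)
    return average < alpha
```

With per-channel differences of 2, 3 and 12 and a tolerance of 10, the strict variant rejects the merge because of the third channel, while the equally weighted average of about 5.7 permits it; weighting the third channel eight times as heavily raises the average to 10.1 and again rejects it.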
In a further preferred embodiment of the present invention boundary picture objects are not referred to for determination of properties of large picture objects.
In a further preferred embodiment of the present invention the homogeneity criterion comprises texture features.
In a further preferred embodiment of the present invention a segmentation of lines is performed in which a value of linearity indicating a ratio of length and width of a picture object is used as a shape feature and picture objects are initially segmented such that each picture object exceeding a threshold for the value of linearity is processed as a line object.
In a further preferred embodiment of the present invention processing of a picture object as a line object comprises the steps of searching for picture objects matching picture objects complementing a line in a direction of line ends beyond an immediate vicinity as far as a specific distance and in a sector having a specific angle; determining a factor which improves a matching value determined in the homogeneity criterion depending on how well a line is complemented by a previous line object, so that in case of an identical matching value linear combination is preferred to the usual combination; determining criteria for how well a line is complemented, such as an improvement of the linearity of the previous line object, in addition to at least one of identical color contrast with the surroundings and identical color; and establishing a connection having a minimum possible thickness between non-contiguous picture objects such that the line object is in any case diagonally coherent, if a matching picture object is not found in the immediate vicinity.
In a further preferred embodiment of the present invention upon additionally performing linear segmentation, segmentation is initially performed with diagonal vicinity, wherein small and linear picture objects have diagonal coherence and all other picture objects have planar coherence, wherein, if a hitherto small, diagonally coherent picture object exceeds a critical size or if a hitherto linear picture object drops below a critical linearity by area merging, it is divided into its components having planar coherence.
In accordance with a second aspect of the present invention there is provided a method for segmentation of a digital picture consisting of a multiplicity of single picture elements comprising (a) determining if one of one and several features relating to contiguous picture objects comprising picture elements and picture segments are conforming or not conforming based on a specific homogeneity criterion by means of determining a modification of the contiguous picture objects in question as a continuation criterion which leads to a minimum increase in a defined value for the complexity of an entire structure consisting of all picture objects; (b) if one of one feature and several features relating to the contiguous picture objects are determined to be conforming then combining the conforming picture objects; and (c) repeating the resulting segmentation until the resulting segmentation converges in a stable or approximately stable condition in which no further contiguous picture objects are determined to be conforming.
In a further preferred embodiment of the present invention by means of repeating the resulting segmentation a hierarchical structure is formed having several hierarchical planes which are present in a locally different hierarchical depth.
In a further preferred embodiment of the present invention a highest hierarchical plane consists of a single picture object containing all picture elements.
In a further preferred embodiment of the present invention respective modifications are performed on respective highest local hierarchical planes of the hierarchical structure.
In a further preferred embodiment of the present invention the modifications comprise at least one of merging two contiguous picture objects, exclusion of a picture object from another picture object, allocating a boundary picture object located at the boundary of already merged picture objects to another contiguous picture object and founding a new picture object on a next higher local hierarchical plane to be formed.
In a further preferred embodiment of the present invention the defined value for the complexity is defined as one of the sum through all picture objects in the hierarchical structure of standard deviations of color mean values of objects multiplied by the number of picture objects on a next lower hierarchical plane in each picture object
$$C = \sum_i \sigma_i^{Udir}\, n_i^{Udir}$$
wherein $C$ is the defined value for the complexity, $\sigma_i^{Udir}$ is the standard deviation of the color mean value of a respective picture object $i$, and $n_i^{Udir}$ is the number of the picture objects on the next lower hierarchical plane in each picture object, and the sum through all picture objects in the hierarchical structure of variances of color mean values of the picture objects on the next lower hierarchical plane in each picture object multiplied by the number of picture objects on the next lower hierarchical plane in each picture object
$$C = \sum_i \mathrm{var}_i^{Udir}\, n_i^{Udir}$$
wherein $C$ is the defined value for the complexity, $\mathrm{var}_i^{Udir}$ is the variance of the color mean value of a respective picture object $i$, and $n_i^{Udir}$ is the number of the picture objects on the next lower hierarchical plane in each picture object.
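The two complexity measures may be sketched as follows. Each picture object of the hierarchical structure is represented here by the list of color mean values of the picture objects directly below it; this representation, the function names and the use of population statistics are assumptions of this example.

```python
from statistics import pstdev, pvariance


def complexity_std(objects):
    """Sum over objects of (std of sub-object means) x (number of sub-objects)."""
    return sum(pstdev(sub_means) * len(sub_means) for sub_means in objects)


def complexity_var(objects):
    """Sum over objects of (variance of sub-object means) x (number of sub-objects)."""
    return sum(pvariance(sub_means) * len(sub_means) for sub_means in objects)
```

For two picture objects whose sub-objects have the color mean values (10, 10) and (10, 20), the first measure gives 0 x 2 + 5 x 2 = 10 and the second gives 0 x 2 + 25 x 2 = 50; a modification of the structure could then be assessed by the increase it causes in the chosen measure.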