1. Field of the Invention
The present invention relates to classification and search of figures, and, in particular, to a classification and a search using structural features of contours of figures.
2. Description of the Related Art
Because a contour includes all the information of a figure or an object, various methods for classifying (recognizing) closed contours using models have been devised. In an ordinary algorithm, a mutually independent data-structure model is considered for each particular model. However, when the number of models increases, efficiency of recognition decreases. Therefore, a method of 'structural indexing' has been considered.
A basic concept of this structural indexing is as follows: For a set of models, one large-scale data structure (table) is prepared; discrete features (structural features) obtained from model figures are used as indexes; and the models are dispersed and disposed in the large-scale data structure. Then, when an input figure is recognized (classified), the features obtained from the input figure are compared with the table. Then, by voting on respective models, models having features the same as those of the input figure are obtained as a result of a narrowing-down operation. As an example of such a data structure, there is a large-scale table in which indexes are calculated from features, and entries of the table corresponding to the thus-obtained indexes are lists of identifiers of models having the features.
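The table-and-voting idea described above can be sketched as follows. This is a minimal illustration, not the implementation of any cited method: it assumes each structural feature is already a hashable discrete value that can serve directly as an index, and the names `build_table` and `vote` are hypothetical.

```python
from collections import Counter, defaultdict


def build_table(models):
    """Build one large table: feature (index) -> set of model identifiers.

    `models` maps a model identifier to an iterable of discrete, hashable
    features (assumed here to be usable directly as table indexes).
    """
    table = defaultdict(set)
    for model_id, features in models.items():
        for feature in features:
            table[feature].add(model_id)
    return table


def vote(table, input_features, top_k=3):
    """Vote for every model that shares a feature with the input figure.

    Returns up to `top_k` (model_id, votes) pairs, most votes first.
    """
    votes = Counter()
    for feature in set(input_features):
        for model_id in table.get(feature, ()):
            votes[model_id] += 1
    return votes.most_common(top_k)


models = {"A": [(1, 2), (3, 4)], "B": [(3, 4), (5, 6)]}
table = build_table(models)
candidates = vote(table, [(3, 4), (5, 6)])
```

In this toy example model "B" shares two features with the input and model "A" shares one, so "B" is ranked first. The candidates returned by the vote would then be passed to a more detailed matching step.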
In such a structural indexing, for a complex figure, the first candidate obtained from a simple voting is not necessarily a correct answer. When a complex figure is recognized, a small number (several percent to ten percent of the total number of models) of candidate models are obtained as a result of the narrowing-down operation. Then, another complicated algorithm should be applied to the candidate models. Accordingly, two conditions are required for the structural indexing. First, a correct model must be surely included in the candidate models obtained from the narrowing-down operation as a result of voting. Second, the time required for this process must be so short as to be negligible in comparison to the time required for the complicated algorithm which is performed on the narrowed-down candidate models; that is, the speed of this process should be very high. When these two conditions are fulfilled, it is possible, through the structural indexing, to increase the speed of the figure recognition to several tens of times the speed of the conventional figure recognition without degrading recognition accuracy. These two conditions are essential conditions which determine performance and efficiency of figure search using a large-scale figure database.
Such a problem has been considered in earnest recently because capacities of a disk and a memory of a computer are increasing. For example, Baird considered an application to printed character recognition, and devised a method in which one bit is assigned to each feature obtained from a particular model (character figure), and, thereby, a figure is represented by a series of bits as to whether or not the figure has the features (Henry S. Baird, "Feature Identification for Hybrid Structural/Statistical Pattern Classification," Computer Vision, Graphics, and Image Processing, vol. 42, pages 318-333, 1988). This method can be applied when a large amount of training data is present for each character. Structural features such as an arc, a stroke, a crossing, a hole, and an end point are expressed by parameters for each type, clustering is performed on the parameter space, and one bit is assigned to one cluster. When an input figure has a feature corresponding to a cluster, "1" is set for the bit of the cluster. The input figure is then classified by classifying the resulting high-dimensional bit vector.
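The one-bit-per-cluster representation can be illustrated by the following sketch. It is only an illustration under simplifying assumptions: all features share a single parameter space, cluster centers are given in advance, and a feature is assigned to the nearest center by squared Euclidean distance. None of these details is taken from Baird's paper, and the function names are hypothetical.

```python
def nearest_cluster(param, centroids):
    """Return the index of the centroid closest to `param` (squared distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(param, centroids[i])),
    )


def to_bit_vector(feature_params, centroids):
    """Represent a figure as one bit per cluster.

    A bit is 1 if the figure has at least one feature falling in that cluster.
    """
    bits = [0] * len(centroids)
    for param in feature_params:
        bits[nearest_cluster(param, centroids)] = 1
    return bits


# Hypothetical cluster centers in a two-parameter feature space.
centroids = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
bit_vector = to_bit_vector([(0.1, 0.2), (4.8, 5.1)], centroids)
```

Here the two features fall in the first and third clusters, so the figure is represented by the bit vector 1, 0, 1.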
However, when only one sample figure can be used for one model figure, this method of Baird cannot be applied. A weak point of the method using structural features is that the features change due to noise or deformation. In the method of Baird, a large number of sample patterns are prepared for one model, and, through statistical and inductive learning using the data, various changes in features which can actually occur are dealt with.
In order to deal with a case where one sample figure can be used for one model, Stein and Medioni devised a method in which a figure contour is approximated by a polygon with various permissible errors, and, then, from a thus-obtained plurality of figures, structural features of the figure are extracted (F. Stein and G. Medioni, "Structural Indexing: Efficient 2-D Object Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 12, pages 1198-1204, 1992). Further, A. Del Bimbo and P. Pala devised a method which integrates the structural indexing with a scale space method, in which a curve is smoothed through Gaussian functions having various extents and features of concavity/convexity are extracted (A. Del Bimbo and P. Pala, "Image Indexing using Shape-based Visual Features," Proceedings of 13th International Conference on Pattern Recognition, Vienna, Austria, August 1996, vol. C, pages 351-355). In such methods, based on a description of a structural feature obtained from each scale, a hierarchical data structure is formed considering noise and a local deformation of a contour. However, in the scale space method, a contour is smoothed through Gaussian functions having various extents. As a result, an amount of calculation increases, and, also, a process of finding correspondence of features between adjacent different scales should be performed.
In such a situation in which only one sample can be used for one model, an efficient and effective method for structural indexing which is robust for noise and/or local deformation which changes structural features has not been devised.
Further, various methods for matching (determining correspondence) between two closed contours and classification (recognition) have been devised. However, an actual input figure includes noise and/or global/local deformation. Therefore, when designing a matching algorithm, prerequisites as to what deformation and conversion can be permitted and also how much deformation and conversion can be permitted should be determined. A method in which a change in size cannot be permitted and a method in which rotation cannot be permitted were devised. Further, there are many methods in which only a 'rigid body' which neither expands nor shrinks is considered.
Especially, when affine transformation is considered as a general transformation, it is known that Fourier features and moment features are features which do not change through affine transformation. However, when a shape is classified using these features, high-degree coefficients (the coefficients in Fourier series expansion or the degree number of the moment) are needed. As a result, a large amount of calculation is needed. Further, these high-degree coefficients are sensitive to noise, and only low-degree coefficients are stable. Further, in the methods using these features, a figure is classified merely using coefficient parameters. Therefore, specific point correspondence between figures and transformation parameters cannot be obtained.
In order to eliminate the above-mentioned problems, a method was devised in which a figure is described by a series of features on the contour, and optimum correspondence between two contour feature series is considered. A typical example thereof is a method in which a figure is expressed by concavity/convexity features of the contour. This method has been applied to figure recognition including character recognition. A concavity/convexity structure depends on noise and also on the scale at which a figure is observed. Accordingly, such a method is called a scale space method. A method was devised in which a closed contour is smoothed through Gaussian functions having various sizes of supports, and a point of inflection which is a point of change in concavity/convexity is extracted (N. Ueda and S. Suzuki, "Learning Visual Models from Shape Contours Using Multiscale Convex/Concave Structure Matching," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 4, pages 337-352, April 1993, and F. Mokhtarian, "Silhouette-based Isolated Object Recognition through Curvature-Scale Space," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 5, pages 539-544, May 1995).
An advantage of this method is that, as a result of a curve being represented by features of 'points of inflection', it is possible to use a compact structure expression obtained as a result of information compression. However, in this method, it is not clear what deformation and/or transformation can be permitted. Therefore, when this method is actually used, it is not clear to what points attention should be paid. Further, in a calculation of curvatures from an actual digital image, it is necessary to cause a series of digital points to be approximated by a curve by using a Spline curve and/or a Bezier curve. Thus, a large amount of calculation is needed. Further, in the scale space method, a curve is smoothed through Gaussian functions having various supports (indicating scales), and, then, correspondence of points of inflection between different scales should be found. However, this process requires a large amount of calculation, and, also, there is a large possibility that an error occurs in the process in which correspondence is determined.
Other than the scale space method, a method was devised in which a contour is approximated by a polygon (approximated by dividing line segments), changes in angles of a series of line segments are used as features, the contour is described by a series of characters indicating angle changes, and an approximate string matching method is applied (H. Bunke and U. Buehler, "Applications of Approximate String Matching to 2D Shape Recognition," Pattern Recognition, vol. 26, no. 12, pages 1797-1812, December 1993). However, in this method, because the line segments are used to approximate a contour, information is not sufficiently compressed, and a reduction of an amount of processing is not considered. Further, what is a permissible transformation is not clear, and an editing cost, which is a reference of the approximate string matching, does not necessarily indicate a degree of similarity between figures well. As a result, the editing cost is not suitable as a reference of classification or recognition.
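For illustration, the editing cost between two contour strings can be sketched with the standard Levenshtein distance. This is a generic sketch, not the cost function of Bunke and Buehler: unit costs for insertion, deletion, and substitution are assumed, the symbols "L", "R", and "S" are hypothetical codes for angle changes, and the unknown starting point of a closed contour is handled by simply trying every rotation of one string.

```python
def edit_distance(a, b):
    """Levenshtein distance with unit insertion, deletion, substitution costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def cyclic_edit_distance(a, b):
    """Minimum edit distance over all rotations of `a` (closed contours)."""
    if not a:
        return len(b)
    return min(edit_distance(a[k:] + a[:k], b) for k in range(len(a)))


# Hypothetical angle-change strings of the same contour traced from
# two different starting points.
linear_cost = edit_distance("LRLS", "SLRL")
cyclic_cost = cyclic_edit_distance("LRLS", "SLRL")
```

The two strings differ only in starting point, so the cyclic cost is 0 while the plain cost is 2. The example also shows why such a cost is a purely symbolic measure: it counts edit operations and says nothing about how far apart the two shapes are geometrically.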
The present invention has been devised in order to solve the above-mentioned problems. An object of the present invention is to provide a method and a system for figure classification or search which is robust for noise and/or local/general deformation such as that of changing structural features of a figure. Another object of the present invention is to provide a method and system for figure classification or search through efficient and high-speed structural indexing which is robust for noise and/or deformation, in a case where only one sample can be used for one model for a large amount of model figures. Another object of the present invention is to provide, in order to achieve such a method and a system for figure classification or search which is robust for noise and/or deformation, a feature extracting method, a method for producing a table for figure classification, a method for evaluating a degree of similarity or a degree of difference between figures, a method for determining correspondence of structural features of a contour, a figure normalizing method and so forth.
In a pattern extracting method in which a closed contour of a figure (or a line figure itself) is approximated by a polygon, and consecutive line segments of the thus-polygon-approximated contour are integrated into higher-level structural features based on quantized directional features and quasi-concavity/convexity structures of the polygon-approximated contour, it is possible to extract compact structural features through a relatively simple processing. However, the structural features obtained in such a method change due to noise or a local concavity/convexity-structure change due to a scale. Therefore, in a case where only one sample can be used for one model, when figure classification and/or search is performed simply as a result of comparing structural features of a model figure and structural features of an input figure, a correct result may not be obtained due to noise and/or local deformation.
According to the present invention, instead of actually deforming an input figure, structural features, which may be extracted from deformed figures resulting from noise and/or local deformation, are produced from structural features extracted from the input figure, as a result of applying a predetermined rule of feature transformation concerning figure deformation. These respective extracted structural features and produced structural features are used for figure classification or search.
In the structural indexing, a predetermined transformation rule is applied to structural features extracted from a model figure, and, thus, structural features which may be extracted from deformed figures resulting from noise and/or local deformation are produced from the structural features extracted from the model figure. Then, a large-scale table (referred to as a classification table) is prepared in which each index indicates the structural features and an entry corresponding to the index is a list of identifiers of model figures, each having these structural features. The indexes are calculated from the respective structural features extracted from each model figure and produced as a result of application of the transformation rule to the structural features extracted from the model figure. Then, in the entries corresponding to the indexes, the lists of identifiers of the model figures having these structural features are stored, respectively. When figure classification or search is performed, similarly, structural features which may be extracted from deformed figures resulting from noise and/or local deformation are produced from structural features extracted from an input figure. Then, indexes are calculated from the respective structural features extracted from the input figure and produced as a result of application of the transformation rule to the structural features extracted from the input figure. Then, the model identifier lists corresponding to the indexes are referred to and the model identifiers included in the model identifier lists are voted on. Then, the model identifiers each having a large number of votes are determined as candidates. The thus-determined model identifiers are used as search keys for searching a figure database.
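The procedure above can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed method: a structural feature is assumed to be a hypothetical pair (number of segments, quantized direction code), and the transformation rule is assumed to perturb each component by at most one step. The actual feature definition and transformation rule are given in the detailed description and are not reproduced here.

```python
from collections import Counter, defaultdict

N_DIRECTIONS = 8  # hypothetical number of quantized directions


def variants(feature):
    """Hypothetical transformation rule: features that a slightly deformed
    figure might yield, obtained by perturbing each component by -1, 0, +1."""
    n, d = feature
    out = set()
    for dn in (-1, 0, 1):
        for dd in (-1, 0, 1):
            if n + dn >= 1:
                out.add((n + dn, (d + dd) % N_DIRECTIONS))
    return out


def expand(features):
    """Extracted features plus all features produced by the rule."""
    out = set()
    for feature in features:
        out |= variants(feature)
    return out


def build_classification_table(models):
    """Index -> list of identifiers of models having that feature."""
    table = defaultdict(list)
    for model_id, features in models.items():
        for feature in expand(features):
            table[feature].append(model_id)
    return table


def classify(table, input_features, top_k=5):
    """Vote over extracted and produced features; return candidate ids."""
    votes = Counter()
    for feature in expand(input_features):
        votes.update(table.get(feature, ()))
    return [model_id for model_id, _ in votes.most_common(top_k)]


models = {"m1": [(3, 0), (2, 4)], "m2": [(6, 2), (5, 6)]}
table = build_classification_table(models)
candidates = classify(table, [(3, 1), (2, 4)])
```

In this toy example the input feature (3, 1) does not occur in any model exactly, but it falls within the variants produced from the feature (3, 0) of model "m1", so "m1" still receives votes. No figure is actually deformed; only the features are transformed.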
In these arrangements, it is possible to perform figure classification or search which is robust for noise or deformation of a figure. Further, because production of the structural features of deformed figures is performed through a relatively small amount of calculation, it is possible to perform efficient figure classification or search.
Further, in a case where only one sample can be used for one model, it is possible to perform efficient and high-speed figure classification or search, through structural indexing, which is robust for noise and deformation. Further, it is possible to easily perform figure classification or search, robust for noise and deformation of a figure, through a computer.
Further, it is possible to obtain structural features considering noise and deformation of a figure through a relatively small amount of calculation. Further, when only one sample can be used for one model, it is possible to produce a table, through a relatively small amount of calculation, which enables figure classification, using structural indexing, robust for noise and deformation of a figure.
Another aspect of the present invention relates to a method for extracting structural features of a contour of a figure, and has been devised in order to solve problems of an amount of calculation for a curve approximation or multiple scales, and to obtain a compact expression having a small number of elements. According to the other aspect of the present invention, the contour of a figure is approximated by dividing line segments (approximated by a polygon), and, based on quantized directional features and quasi-concavity/convexity structures, a plurality of consecutive line segments are integrated into higher-level structural features. Further, in order to respond to local/general change in concavity/convexity structures of the contour of the figure occurring due to noise and/or scale, based on an editing cost required for causing the structural features of the contours of the two figures, for example, an input figure and a previously prepared model figure, to approach one another, an appropriate corresponding relationship between the structural features of the contours of the two figures is obtained. Then, a degree of similarity or a degree of difference between the two figures is expressed more geometrically. Then, as a reference for figure classification (recognition), in order to obtain a distance scale relating to characteristics which the figures inherently have, in accordance with the thus-obtained correspondence relationship, normalization of one of the two figures is performed such that the contour of the one of the two figures approaches the contour of the other of the two figures, as a result of geometric transformation such as affine transformation being performed on the one of the two figures. Then, a geometric distance between the normalized figure and the other figure is calculated. Then, using the thus-calculated distance, a degree of similarity or a degree of difference between the two figures is evaluated. 
When figure classification is performed, a similar distance calculation is performed between the input figure and each model figure, and at least one model figure having the short distance is selected as a classification candidate.
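The normalization and distance steps can be sketched as follows. This is an illustration under stated assumptions, not the claimed procedure: point correspondences between the input figure and each model are assumed to be given already (in the invention they follow from the correspondence of structural features), the affine transformation is fitted by ordinary least squares, and the mean point-to-point Euclidean distance is used as a stand-in for the geometric distance. All function names are hypothetical.

```python
import math


def solve3(m, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [r] for row, r in zip(m, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences (collinear points?)")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (a[i][3] - sum(a[i][j] * x[j] for j in range(i + 1, 3))) / a[i][i]
    return x


def fit_affine(src, dst):
    """Least-squares affine map taking src points onto dst points.

    Returns ((a, b, c), (d, e, f)) with u = a*x + b*y + c, v = d*x + e*y + f.
    """
    m = [[0.0] * 3 for _ in range(3)]
    ru, rv = [0.0] * 3, [0.0] * 3
    for (x, y), (u, v) in zip(src, dst):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                m[i][j] += row[i] * row[j]
            ru[i] += row[i] * u
            rv[i] += row[i] * v
    return solve3(m, ru), solve3(m, rv)


def apply_affine(params, pts):
    (a, b, c), (d, e, f) = params
    return [(a * x + b * y + c, d * x + e * y + f) for x, y in pts]


def mean_distance(p, q):
    """Mean Euclidean distance between corresponding points."""
    return sum(math.hypot(s[0] - t[0], s[1] - t[1]) for s, t in zip(p, q)) / len(p)


def rank_models(input_pts, models):
    """Normalize the input toward each model and sort models by distance."""
    scored = []
    for model_id, model_pts in models.items():
        normalized = apply_affine(fit_affine(input_pts, model_pts), input_pts)
        scored.append((mean_distance(normalized, model_pts), model_id))
    return sorted(scored)


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
models = {
    "affine": [(3.0, -2.0), (5.0, -3.0), (5.5, -1.5), (3.5, -0.5)],  # affine image of the square
    "other": [(0.0, 0.0), (1.0, 0.0), (2.0, 2.0), (0.0, 1.0)],       # not a parallelogram
}
ranking = rank_models(square, models)
```

In this example the first model is an exact affine image of the input square, so its distance after normalization is essentially zero and it is ranked first. The second model is not a parallelogram, so no affine transformation maps the square onto it and a residual distance remains.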
In these arrangements, it is possible to perform efficient figure classification robust for change in a figure and change in structural features due to noise and/or global/local transformation. Further, it is possible to easily perform such figure classification using a computer.
Further, it is possible to perform appropriate evaluation of a degree of similarity or a degree of difference between two figures even when deformation of the figure and change in the structural features thereof occur. Further, processing for this evaluation can be performed in the feature space, through a relatively small amount of calculation. Further, even when deformation of the figure and change in the structural features thereof occur, through efficient processing in the feature space, appropriate determination of correspondence between the two figures or normalization of the figure can be performed.
Other objects and further features of the present invention will become more apparent from the following detailed description when read in conjunction with the accompanying drawings.