1. Field of the Invention
The invention relates to a form identification method for identifying, when a plurality of types of forms are read, the type of a form that has been read before the form is processed, and to a form registration method for registering the identified form. More particularly, the invention relates to a form identification method that enables the type of a form to be identified stably regardless of the direction, enlargement or shrinkage (scale-up or scale-down), and skew of the form, and to a form registration method.
2. Description of the Related Art
As a conventional form identification technique, there is a known method in which features for identifying the type of form, for example, character codes, character lines, lines or ruled lines, cells, and the like, are automatically extracted from the form, the extracted features are matched against the features of forms that have previously been registered, and the type of form is thereby identified.
As a conventional technique using lines as a feature for the form identification mentioned above, the techniques disclosed in JP-A-61-59568 and the like are known. According to such conventional techniques, the type of form is identified by analyzing the structure of the form using the horizontal and vertical lines in the form.
As a conventional technique in which the features used for form identification are expressed by point coordinates and matching is performed using the point coordinates, the techniques disclosed in JP-A-62-184585 and the like are known. Such conventional techniques relate to a method of matching patterns each comprising a point set. That is, a degree of matching is obtained in order to detect the similarity between two point sets, and the degree of matching is obtained at high speed in units of blocks obtained by division into small areas in the horizontal and vertical directions, thereby identifying the type of form. For example, a method in which the center of a cell is used as a feature and matching is performed using a hash table whose bases are the horizontal and vertical directions is disclosed in JP-A-8-255236. Further, a method in which the positional relation of each minimum rectangle in a form is obtained from its connecting relations in the row and column directions and an attribute of the minimum rectangle is determined is disclosed in JP-A-2000-339406.
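The block-unit matching of two point sets described above can be illustrated by the following minimal sketch. It is not the method of any of the cited publications. The function names, the block size, and the use of the ratio of shared blocks to occupied blocks as the degree of matching are all assumptions made only for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

Point = Tuple[float, float]
Block = Tuple[int, int]


def occupied_blocks(points: Iterable[Point], block_size: float) -> Set[Block]:
    """Return the (column, row) indices of the blocks that contain a point."""
    return {(int(x // block_size), int(y // block_size)) for x, y in points}


def matching_degree(a: Iterable[Point], b: Iterable[Point],
                    block_size: float = 50.0) -> float:
    """Assumed degree of matching: shared blocks / all occupied blocks."""
    blocks_a = occupied_blocks(a, block_size)
    blocks_b = occupied_blocks(b, block_size)
    if not blocks_a and not blocks_b:
        return 0.0
    return len(blocks_a & blocks_b) / len(blocks_a | blocks_b)


def identify_form(input_points: Iterable[Point],
                  registered: Dict[str, Iterable[Point]],
                  block_size: float = 50.0) -> str:
    """Return the registered form name with the highest degree of matching.

    `registered` must contain at least one form (hypothetical dictionary).
    """
    points = list(input_points)
    return max(registered,
               key=lambda name: matching_degree(points, registered[name],
                                                block_size))
```

Because points are compared only through the blocks they fall in, small positional differences inside a block do not change the result, whereas a point that crosses a block boundary does. The block size therefore trades tolerance against discrimination.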
As a conventional technique using the position of the character line as a feature, the techniques disclosed in JP-A-7-114616 and the like are known. According to such conventional techniques, in order to identify the format of a detailed bill of diagnosis and treatment fees, the form is identified on the basis of the positions of the extracted character lines.
Further, as a conventional technique for identifying a form in which the extraction of each rectangle is unstable because of enlargement or shrinkage (scale-up or scale-down) of the form, blurring of the lines, or the like, the technique disclosed in JP-A-2000-306030 is known. According to this conventional technique, the coordinates of a matched rectangle are set as the reference point for the rectangle to be matched next, and matching is performed while the reference point is moved sequentially.
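The idea of moving the reference point to the most recently matched rectangle can be sketched as follows. This is an illustration under stated assumptions and not the procedure of JP-A-2000-306030: each rectangle is reduced to a single representative point, the nearest observed point within a tolerance is taken as the match, and the function name and the tolerance value are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def match_with_moving_reference(registered: Sequence[Point],
                                observed: Sequence[Point],
                                tolerance: float = 12.0) -> List[Optional[Point]]:
    """Match registered rectangle positions to observed ones in order.

    The expected position of each rectangle is computed relative to the
    previously matched pair, so positional error does not accumulate
    across the whole form.
    """
    matches: List[Optional[Point]] = []
    ref_registered: Point = (0.0, 0.0)
    ref_observed: Point = (0.0, 0.0)
    for reg in registered:
        # Offset from the registered reference, applied to the observed one.
        expected = (ref_observed[0] + reg[0] - ref_registered[0],
                    ref_observed[1] + reg[1] - ref_registered[1])
        nearest = min(observed, key=lambda p: math.dist(p, expected),
                      default=None)
        if nearest is not None and math.dist(nearest, expected) <= tolerance:
            matches.append(nearest)
            # Move the reference point to the pair just matched.
            ref_registered, ref_observed = reg, nearest
        else:
            matches.append(None)
    return matches
```

With a form enlarged by 8%, a rectangle registered at (300, 100) appears at (324, 108), about 25 units away from its registered position, which exceeds the assumed tolerance of 12. Measured from the previously matched rectangle, the deviation is only 8 units, so the match succeeds.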
Hitherto, as a method of searching for images having similar properties, the technique disclosed in Yoshinori Musha and Atsushi Hiroike, “Image Laboratory”, The Japan Industrial Publishing Co., Ltd., Vol. 11, No. 9, pages 5–9, September 2000, is known. According to this conventional technique, feature vectors are extracted from images, and images that are close to a key image in terms of the distance between the extracted vectors are collected. A color feature in the three-primary-color space of red, green, and blue and a differential direction feature representing the direction in which the lightness of a luminance image varies are used as the image features. This conventional technique, however, gives no consideration to the features of lines, character lines, and cells that are peculiar to a form image.
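Collecting the images nearest to a key image by the distance between feature vectors can be sketched as follows. The Euclidean distance, the function names, and the dictionary of precomputed vectors are assumptions for illustration. The cited article is not stated here to use this particular distance, and the extraction of the color and differential direction features is outside the sketch.

```python
import math
from typing import Dict, List, Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_images(key_vector: Sequence[float],
                   database: Dict[str, Sequence[float]],
                   k: int = 3) -> List[str]:
    """Return the names of the k images closest to the key image."""
    ranked = sorted(database,
                    key=lambda name: euclidean(key_vector, database[name]))
    return ranked[:k]
```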
As a conventional fingerprint identification method for personal identification, the techniques disclosed in JP-A-2000-293688 and the like are known. According to such conventional techniques, feature information of an inputted fingerprint image is checked for matching against feature information of fingerprint images that have previously been stored, with one of the pieces of feature information being rotated into an inverted (upside-down) state or an orthogonal state, and the fingerprint is thereby verified. Such conventional techniques, however, give no consideration to the features of lines, character lines, and cells that are peculiar to a form image.
As a conventional method of detecting the rotational angle of a document, the techniques disclosed in JP-A-6-103411 and the like are known. According to such conventional techniques, the document is rotated by 0°, 90°, 180°, and 270°, character recognition is executed at each angle, and the angle that yields the most correct recognition result is determined to be the direction of the document. Such conventional techniques, however, have a problem in that character recognition must be executed at every angle in order to detect the direction of the form, so that considerable processing time is required. Moreover, no consideration is given to the identification of the type of form.
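The four-angle procedure described above, and the reason for its processing cost, can be sketched as follows. The image is represented as a list of pixel rows, and `recognition_score` is a hypothetical callable standing in for a character recognizer that returns a higher value for a more plausible result. Neither is taken from JP-A-6-103411.

```python
from typing import Callable, List

Image = List[List[int]]


def rotate_cw(image: Image) -> Image:
    """Rotate an image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def detect_direction(image: Image,
                     recognition_score: Callable[[Image], float]) -> int:
    """Return the clockwise rotation (0, 90, 180, or 270 degrees) that,
    applied to the input, gives the best recognition score.

    The recognizer is called once per angle, i.e. four times per document,
    which is the processing cost noted in the text.
    """
    best_angle, best_score = 0, float("-inf")
    candidate = image
    for angle in (0, 90, 180, 270):
        score = recognition_score(candidate)
        if score > best_score:
            best_angle, best_score = angle, score
        candidate = rotate_cw(candidate)
    return best_angle
```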
As a conventional method of identifying the rotating direction of a form in units of 90° (right angles) simultaneously with the identification of the form, the technique disclosed in JP-A-7-249099 is known. According to this conventional technique, distributions of the lines, that is, of the vertical and horizontal lines, are obtained for the forms produced by rotating the inputted form by 90°, 180°, and 270° and are verified against the previously obtained distributions of vertical and horizontal lines, so that the rotating direction of the inputted form is identified in units of 90° at the same time as the form is identified. This conventional technique, however, has a problem in that the identification is unstable against blurred or thickened lines used as features and in that, depending on the arrangement of touching characters, false lines formed by connected character strokes appear, so that the form and its rotating direction are erroneously identified.
As a conventional method of identifying a form by using the line type of the cell lines, the techniques disclosed in JP-A-11-66228 and the like are known. According to such conventional techniques, the type of the cell lines is determined and format information for reading the form is generated. As a technique of this kind, according to the conventional technique disclosed in JP-A-11-85900, the form is identified by distinguishing solid lines from broken lines and, further, the form can also be identified by handling solid lines and broken lines without distinguishing them, in view of identification precision. Such a conventional technique, however, does not disclose a process for switching, for each type of form and for each cell of the form, whether discrimination based on the line type is enabled or disabled.
As a conventional method that enables form identification even when the form is enlarged or shrunk (scaled up or scaled down), the techniques disclosed in JP-A-2000-306030 and the like are known. According to such conventional techniques, adjacent cells of the form are verified while the reference point is moved sequentially, thereby preventing erroneous matching caused by positional shifts due to enlargement or shrinkage of the whole form. Such conventional techniques have a problem in that an error occurs in the matching of each cell when cells drop out and cannot be extracted or when a false rectangle appears. No consideration is given to a shift of the reference position. As techniques of the same kind, those disclosed in JP-A-2000-123174, JP-A-8-315068, JP-A-7-249099, and the like are known. These conventional techniques relate to a method of matching the intervals of the lines included in a predetermined area and estimating the scaling ratios (magnification or shrinkage ratios) of the form image from the result of the matching. They do not, however, solve the problem that, if lines drop out or false lines are generated, the matching of the lines itself becomes wrong and the estimated scaling ratios therefore become erroneous. In other words, because these methods rely on line matching to cope with enlargement and shrinkage, an error in the line matching leads directly to an error in the estimated enlargement or shrinkage.
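The dependence of the estimated scaling ratio on correct line matching can be illustrated by the following deliberately naive sketch. It is not the method of any cited publication. Pairing the lines by their order and averaging the ratios of corresponding intervals are assumptions chosen only to make the failure mode visible, and the function name is hypothetical.

```python
from typing import List, Sequence


def estimate_scale(registered_lines: Sequence[float],
                   observed_lines: Sequence[float]) -> float:
    """Estimate the scaling ratio from the intervals between lines.

    Lines are paired simply by their order (an assumption), so a missing
    or false line shifts every later pairing.
    """
    def intervals(positions: Sequence[float]) -> List[float]:
        ordered = sorted(positions)
        return [b - a for a, b in zip(ordered, ordered[1:])]

    ratios = [obs / reg
              for reg, obs in zip(intervals(registered_lines),
                                  intervals(observed_lines))
              if reg > 0]
    if not ratios:
        raise ValueError("at least two lines are needed on each side")
    return sum(ratios) / len(ratios)
```

For lines registered at 100, 200, 400, and 500 and observed at 120, 240, 480, and 600, the estimate is 1.2, which is the true ratio. If the observed line at 240 drops out, the remaining intervals of 360 and 120 are paired with the registered intervals of 100 and 200, and the estimate becomes 2.1.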