1. Field of the Invention
The present invention relates to form recognition, and more particularly to an apparatus and a method for defining a form template.
2. Description of the Related Art
Form recognition has many applications in the field of collecting and analyzing information by means of forms. By utilizing form recognition, it is possible to digitize, store and send handwritten or printed data. One example of such an application is found in banks, where a large number of forms need to be processed but the number of form types is small, i.e., many of the forms are of the same type, for example, a remittance form, a withdrawal form, etc. In this case, as long as the templates of the forms can be recognized, an application program is able to find meaningful contents in the appropriate areas of the forms, for example, names of users, card numbers, etc. Accordingly, defining a form template is the first step of form recognition.
In general, the form template informs a form processing application program of where meaningful data can be extracted, how to extract the meaningful data, how texts are arranged in the cells of a form, how to select a proper Optical Character Recognition (OCR) engine, etc.
Form Template Definition (FTD) is mainly used to determine attributes of the cells. The attributes of the cells include, but are not limited to, the languages of the texts in the cells, for example, Chinese, Japanese, etc.; whether characters or Arabic numerals may be entered; properties of layout, for example, one line or plural lines, one character string or one numeral, etc.; properties of the lines forming the cells, for example, solid lines or dotted lines, rectangle-shaped lines or U-shaped lines, etc.; and whether the texts are mixed, for example, Simplified Chinese and Traditional Chinese, Chinese and Japanese, characters and numerals, etc.
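By way of illustration only, the cell attributes listed above could be held in a simple record such as the following. This is a minimal sketch and not a format defined by the related art; the names CellAttributes, Content, LineStyle and BorderShape, the field names, and the language tags are all hypothetical and chosen solely for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Content(Enum):
    """Kind of text a cell accepts (hypothetical categories)."""
    CHARACTERS = "characters"
    NUMERALS = "numerals"
    MIXED = "mixed"  # characters and numerals together


class LineStyle(Enum):
    """Style of the lines forming a cell."""
    SOLID = "solid"
    DOTTED = "dotted"


class BorderShape(Enum):
    """Shape of the lines forming a cell."""
    RECTANGLE = "rectangle"
    U_SHAPE = "u-shape"


@dataclass(frozen=True)
class CellAttributes:
    """Attributes of one form cell (illustrative field names)."""
    languages: Tuple[str, ...]  # e.g. ("zh-Hans", "zh-Hant"); empty if none
    content: Content            # characters, numerals, or both
    multiline: bool             # False: one line; True: plural lines
    line_style: LineStyle       # solid or dotted lines
    border_shape: BorderShape   # rectangle-shaped or U-shaped lines

    @property
    def mixed_text(self) -> bool:
        """True if the cell mixes languages or mixes characters and numerals."""
        return len(self.languages) > 1 or self.content is Content.MIXED


# Example: a name cell that may hold Simplified or Traditional Chinese.
name_cell = CellAttributes(
    languages=("zh-Hans", "zh-Hant"),
    content=Content.CHARACTERS,
    multiline=False,
    line_style=LineStyle.SOLID,
    border_shape=BorderShape.RECTANGLE,
)

# Example: a card-number cell that holds numerals only.
card_number_cell = CellAttributes(
    languages=(),
    content=Content.NUMERALS,
    multiline=False,
    line_style=LineStyle.DOTTED,
    border_shape=BorderShape.U_SHAPE,
)
```

With such a record, a form processing application program could, for example, consult the languages and content fields of a cell when selecting an OCR engine for that cell.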
U.S. Pat. No. 5,317,646 discloses a form recognition system in which a method of assisting an operator in creating an electronic template is introduced. The operator uses a pointing device to select individual points located within closed boundaries or semi-closed boundaries on a displayed image; coordinates expressing the closed boundaries or the semi-closed boundaries are then automatically determined from the individual points selected by the operator. However, this patent specification discusses only how to determine the positions of cells in a form, i.e., attributes of the cells are not involved; in addition, the operator needs to manually provide a point in the process of determining the positions of the cells.
In conventional techniques, the attributes of cells in a form are usually determined by manually defining the attributes of all of the cells in the form one by one. As a result, the workload of a user is very large and involves many repeated definition operations, so the user may tire very easily.