With the advent of technology, digitization of documents may be required across administrative offices. Once such documents are digitized, the data printed on them needs to be verified. For example, documents submitted as proof to institutions such as insurance companies and banks need to be verified. Character recognition systems have therefore been used to identify the characters (printed text) in the digitized documents. In an embodiment, the digitized documents contain printed characters in different languages and different font types.
Traditional character recognition systems may perform feature analysis on a printed character by tracing its boundary to locate stops or inflection points. The features of the printed character may be detected from a sequence of boundary slope, vector slope, and vector curve calculations to form a feature set. The feature set may then be analyzed in a sequential logic decision tree (e.g., a binary tree) to identify the printed character. A disadvantage of such systems may be that they have to identify the boundary stops or inflection points on each printed character and then analyze the sequence of features, such as boundaries and slopes, through a sequential logic decision tree that has a node or branch at every boundary stop or inflection point. This results in more nodes and greater tree depth (i.e., more decision iterations), and therefore in longer processing time. Additionally, traditional character recognition systems and methods may not be easily adaptable to characters of multiple languages and font types.
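The traditional approach described above can be illustrated with a minimal sketch. It is not taken from any particular prior system. It assumes the character boundary is already available as an ordered list of polygon vertices, it uses the turn direction at each vertex as a simplified stand-in for the boundary slope and vector calculations, and it walks a hand-built binary decision tree with one decision per boundary stop. The function names, the two template shapes, and their labels are hypothetical and chosen only for illustration.

```python
def turn_signs(points):
    """Turn direction at each vertex of a closed boundary:
    +1 for a left turn, -1 for a right turn, 0 for a straight segment."""
    n = len(points)
    signs = []
    for i in range(n):
        ax, ay = points[i - 1]          # previous vertex (wraps around)
        bx, by = points[i]              # current vertex
        cx, cy = points[(i + 1) % n]    # next vertex (wraps around)
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        signs.append((cross > 0) - (cross < 0))
    return signs


def inflection_points(points):
    """Indices of vertices where the turn direction flips relative to the
    previous non-straight vertex (a simplified notion of inflection)."""
    turns = [(i, s) for i, s in enumerate(turn_signs(points)) if s != 0]
    return [i for k, (i, s) in enumerate(turns) if s != turns[k - 1][1]]


def build_tree(templates):
    """Build a nested-dict decision tree with one level per boundary stop.
    Assumes no template sequence is a prefix of another."""
    root = {}
    for label, signs in templates.items():
        node = root
        for s in signs[:-1]:
            node = node.setdefault(s, {})
        node[signs[-1]] = label
    return root


def classify(signs, tree):
    """Walk the tree one feature at a time.
    Returns (label or None, number of decisions taken)."""
    node, decisions = tree, 0
    for s in signs:
        if not isinstance(node, dict):
            break
        node = node.get(s)
        decisions += 1
    return (node if isinstance(node, str) else None), decisions


# Hypothetical boundaries, listed counter-clockwise.
SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]
L_SHAPE = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]

# Hypothetical templates: one turn-sign sequence per known shape.
TREE = build_tree({
    "square": turn_signs(SQUARE),
    "L": turn_signs(L_SHAPE),
})

if __name__ == "__main__":
    for name, shape in (("square", SQUARE), ("L", L_SHAPE)):
        label, decisions = classify(turn_signs(shape), TREE)
        print(name, "->", label, "after", decisions, "decisions")
```

In this sketch the number of decisions equals the number of boundary stops, so a shape with more stops or inflection points needs a deeper tree. Supporting another font or language would also mean adding a new template sequence for every glyph, which reflects the adaptability limitation noted above.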
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of described systems with some aspects of the present disclosure, as set forth in the remainder of the present application and with reference to the drawings.