Computer software can be used to recognize digital representations of objects. For example, optical character recognition software can be used to recognize digital representations of character objects, typically obtained by scanning a printed page, segmenting the page into characters, and identifying characteristics of each character. Rules are used to narrow the choice of characters to a smaller range of characters, and a confidence level is assigned to each character in the smaller range. The character with the highest confidence level may be selected as the recognized character.
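The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular recognition product: the feature names, the `ascender_rule` and `black_ratio_score` helpers, and the `TYPICAL_BLACK` values are all hypothetical and chosen only to make the example runnable.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]  # hypothetical per-segment measurements


def recognize(features: Features,
              rules: List[Callable[[Features, str], bool]],
              score: Callable[[Features, str], float],
              alphabet: str) -> Tuple[str, float]:
    """Narrow the alphabet with rules, score the survivors, return the best."""
    if not alphabet:
        raise ValueError("alphabet must not be empty")
    # Rules narrow the choice to a smaller range of characters.
    candidates = [ch for ch in alphabet
                  if all(rule(features, ch) for rule in rules)]
    if not candidates:
        # Assumption: fall back to the full alphabet if the rules reject everything.
        candidates = list(alphabet)
    # A confidence level is assigned to each remaining candidate.
    confidences = {ch: score(features, ch) for ch in candidates}
    # The candidate with the highest confidence is the recognized character.
    best = max(confidences, key=confidences.get)
    return best, confidences[best]


# Hypothetical toy data: typical fraction of black pixels per character.
TYPICAL_BLACK = {"o": 0.30, "e": 0.32, "c": 0.22, "l": 0.15}


def ascender_rule(features: Features, ch: str) -> bool:
    """Keep a candidate only if its ascender property matches the segment."""
    return (ch in "bdhkl") == (features["has_ascender"] > 0.5)


def black_ratio_score(features: Features, ch: str) -> float:
    """Toy confidence: closeness of the black-pixel ratio to the typical value."""
    return 1.0 - abs(features["black_ratio"] - TYPICAL_BLACK[ch])
```

Calling `recognize({"black_ratio": 0.30, "has_ascender": 0.0}, [ascender_rule], black_ratio_score, "ocel")` would first discard 'l' through the rule and then score only 'o', 'c', and 'e'.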
Some computer software for object recognition uses initial conditions for the recognition. The use of initial conditions allows the software to be tuned in a laboratory to particular conditions simulating the environment of anticipated operation of the software. Before the software is shipped as part of a product, the initial conditions are fixed at a constant level that yielded the optimum recognition in the laboratory simulation for that product.
For example, an initial condition may be that if a segment of a page believed to correspond to a character is 30 percent black, it is most likely an ‘o’ or an ‘e’, and likely not a ‘c’. Conventional pattern matching or other techniques may then be employed to identify the character. Using the initial conditions, the algorithm can start by attempting to identify whether the segment corresponds to one of the most likely characters. If a threshold recognition confidence level is achieved, the algorithm need not compute or compare the confidence levels of additional characters, saving time in the recognition process.
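The early exit described above might look like the following sketch. The 25 to 35 percent band around the 30 percent figure, the 0.9 threshold, and the function names are assumptions made for illustration, and `match` stands in for whatever conventional pattern matcher is used.

```python
from typing import Callable, List, Tuple


def likely_candidates(black_ratio: float) -> List[str]:
    """Hypothetical initial condition keyed on the fraction of black pixels."""
    if 0.25 <= black_ratio <= 0.35:  # assumed band around "30 percent black"
        return ["o", "e"]
    return []


def recognize_with_initial_conditions(
        black_ratio: float,
        match: Callable[[str], float],
        alphabet: str,
        threshold: float = 0.9) -> Tuple[str, float, int]:
    """Try the most likely characters first and stop once the threshold is met.

    Returns (character, confidence, number of comparisons performed).
    """
    if not alphabet:
        raise ValueError("alphabet must not be empty")
    likely = [ch for ch in likely_candidates(black_ratio) if ch in alphabet]
    ordered = likely + [ch for ch in alphabet if ch not in likely]

    best_char, best_conf = ordered[0], float("-inf")
    for tried, ch in enumerate(ordered, start=1):
        conf = match(ch)
        if conf >= threshold:
            # Confident enough: skip the remaining candidates.
            return ch, conf, tried
        if conf > best_conf:
            best_char, best_conf = ch, conf
    # No candidate reached the threshold; fall back to the best one seen.
    return best_char, best_conf, len(ordered)
```

With a well-chosen initial condition the loop exits after one or two comparisons. With a poor one, it degrades to comparing against the whole alphabet.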
It would be desirable to have the initial condition selection process vary for each set of objects, such as characters on the page, rather than selecting a single set of initial conditions and using that same set for all objects. This would allow the initial conditions to change for every page or part of a page, causing the initial conditions to be optimized for every circumstance. In the example above, different fonts or styles (e.g., bold or italic) could have different ideal values for initial conditions. As fonts change across the page, the initial conditions would ideally change to match the fonts.
While it is possible to make several attempts at recognizing the objects, such as characters in the file, using different initial conditions for each attempt, and then to select the attempt that yields the highest recognition confidence, such a process would add too much time to the recognition process to be practical. Although computing power increases every year, users prefer to spend the additional computing power on processing images of higher resolution rather than on improving the accuracy of the recognition, so making several attempts at recognizing an image could still take too long to be useful.
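To make the cost of this multi-attempt approach concrete, the following sketch runs one full recognition pass per candidate set of initial conditions and keeps the best. It assumes a hypothetical `recognize(segment, conditions)` callable and uses mean per-character confidence as the measure of an attempt's quality, which is one possible choice among several.

```python
from typing import Callable, Hashable, Sequence, Tuple


def recognize_page(segments: Sequence[object],
                   conditions: Hashable,
                   recognize: Callable[[object, Hashable], Tuple[str, float]]
                   ) -> Tuple[str, float]:
    """One full pass over the page using a single set of initial conditions."""
    results = [recognize(seg, conditions) for seg in segments]
    text = "".join(ch for ch, _ in results)
    # Assumption: an attempt's quality is the mean per-character confidence.
    mean_conf = sum(c for _, c in results) / len(results) if results else 0.0
    return text, mean_conf


def best_of_attempts(segments: Sequence[object],
                     condition_sets: Sequence[Hashable],
                     recognize: Callable[[object, Hashable], Tuple[str, float]]
                     ) -> Tuple[str, float, Hashable]:
    """Try every set of initial conditions and keep the most confident attempt."""
    if not condition_sets:
        raise ValueError("at least one set of initial conditions is required")
    best = None
    for conditions in condition_sets:
        text, conf = recognize_page(segments, conditions, recognize)
        if best is None or conf > best[1]:
            best = (text, conf, conditions)
    return best
```

The total work is the number of condition sets multiplied by the cost of one pass, which is why this approach scales poorly as image resolution grows.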
What is needed is a method and apparatus that can optimally set the initial conditions of an optical recognition process without significantly adding time to the recognition.