Digital images depicting an object, including documents such as a letter, a check, a bill, an invoice, etc., have conventionally been captured and processed using a scanner or multifunction peripheral coupled to a computer workstation such as a laptop or desktop computer. Methods and systems capable of performing such capture and processing are well known in the art and well adapted to the tasks for which they are employed.
However, in an era where day-to-day activities, computing, and business are increasingly performed using mobile devices, it would be greatly beneficial to provide analogous document capture and processing systems and methods for deployment and use on mobile platforms, such as smart phones, digital cameras, tablet computers, etc.
Traditionally, digital images have been a valuable source of data for a wide variety of applications. In a business context, digital images have been extensively utilized for communicating and processing information, typically represented in documents and/or associated image data (such as a digital image of a vehicle associated with a digital image of an insurance claim, vehicle registration, bill of sale, etc.). Increasingly powerful mobile devices offer opportunities to expand digital image processing into the mobile arena and provide improved capability to capture and process digital image data in real time using mobile technology.
Conventional data extraction methods for use in existing mobile and non-mobile devices rely on object templates, typically generated and/or curated by expert users, to provide information to an extraction engine instructing the engine where to locate information for extraction. In the particular case of documents, the conventional extraction technology is provided the location of one or more (typically rectangular) regions of a document, instructed to perform optical character recognition (OCR) on the region(s), and then output the determined characters to another process or file.
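By way of illustration only, the conventional template-driven flow described above may be sketched as follows. The sketch assumes a page represented as rows of text standing in for image data, and every name in it (Region, crop, fake_ocr, extract_with_template, INVOICE_TEMPLATE) is hypothetical; fake_ocr is merely a placeholder for a real OCR engine and is not drawn from any particular product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Stand-in for image data: one string per row (an assumption for illustration).
Page = List[str]


@dataclass(frozen=True)
class Region:
    """One rectangular region of a hypothetical, curator-defined template."""
    name: str
    top: int
    left: int
    height: int
    width: int


def crop(page: Page, region: Region) -> Page:
    """Cut the rectangular region out of the page."""
    rows = page[region.top:region.top + region.height]
    return [row[region.left:region.left + region.width] for row in rows]


def fake_ocr(patch: Page) -> str:
    """Placeholder for an OCR engine: joins the non-empty cropped rows."""
    return " ".join(row.strip() for row in patch if row.strip())


def extract_with_template(
    page: Page,
    template: List[Region],
    ocr: Callable[[Page], str] = fake_ocr,
) -> Dict[str, str]:
    """Run OCR on each templated region and collect the recognized text."""
    return {region.name: ocr(crop(page, region)) for region in template}


# Hypothetical template for a single, fixed invoice layout.
INVOICE_TEMPLATE = [
    Region("invoice_number", top=0, left=9, height=1, width=8),
    Region("total", top=2, left=7, height=1, width=10),
]

page = [
    "Invoice: INV-0042",
    "Date: 2020-01-01",
    "Total: 199.00",
]

fields = extract_with_template(page, INVOICE_TEMPLATE)
print(fields)
```

Because the regions are fixed coordinates, the same template applied to a page whose layout has shifted by even one row would read the wrong content, which is why such templates must be defined per object class.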
While the conventional extraction engines and methods are capable of reliably extracting information from objects for which a template has been previously defined, it is not possible to dynamically extract information from objects for which no template exists. This is an undesirable limitation that restricts users from using powerful extraction technology on an increasingly diverse array of documents encountered in the modern world.
Furthermore, conventional extraction engines require extensive input from expert curators to define templates and maintain template definitions as object classes evolve. The performance of template-based extraction is thus a direct function of the curators' ability to properly define templates and the curators' determination of which information is “worth” extracting. Therefore, expert curators serve as an undesirable bottleneck on the robustness of data extraction in terms of extraction accuracy and precision, as well as the scope of objects from which data may be extracted.
Further still, conventional extraction methods rely primarily or exclusively on OCR techniques to extract text characters from image data. The OCR engine is forced to make estimates regarding the identity of text characters, which inevitably leads to erroneous reporting of characters when image quality is poor, when characters do not match a predefined set of “known” characters, or when an apparent character appears ambiguous such that the OCR engine cannot reliably distinguish between multiple candidate characters (e.g., a period “.” versus a comma “,” or a letter “l” versus a numeral “1”). Expert curators can mitigate these problems by urging the OCR engine toward the correct decision when certain known patterns of characters are expected, but even this mitigation is limited in scope and errors ultimately require undesirable end-user interaction.
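One way such pattern-based mitigation might look in practice is sketched below, purely as an assumption for illustration. The confusion table TO_NUMERIC, the pattern AMOUNT, and the function correct_amount are hypothetical; they show a curator-supplied rule that maps commonly confused characters to digits when a monetary amount is expected, and are not the behavior of any particular OCR engine.

```python
import re
from typing import Optional

# Hypothetical confusion table: characters an OCR engine might report
# in place of a digit or a decimal point within a numeric field.
TO_NUMERIC = str.maketrans({"l": "1", "I": "1", "O": "0", "o": "0", ",": "."})

# Hypothetical expected pattern: digits, a decimal point, two digits.
AMOUNT = re.compile(r"\d+\.\d{2}")


def correct_amount(raw: str) -> Optional[str]:
    """Nudge raw OCR output toward the expected amount pattern.

    Returns the corrected string, or None when the output still does not
    match, in which case a user would have to correct it manually.
    """
    candidate = raw.translate(TO_NUMERIC)
    return candidate if AMOUNT.fullmatch(candidate) else None


print(correct_amount("l99,O0"))    # confusable characters are resolved
print(correct_amount("1,234.56"))  # rule is too narrow for this input
```

The second call illustrates the limited scope of the approach: a legitimate thousands separator is rewritten as a decimal point, the result no longer matches the expected pattern, and the value falls back to manual correction.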
Therefore, it would be highly beneficial to provide new method, system and/or computer program product technology for extracting information from digital image data using mobile devices. It would be further beneficial to enable extraction of information without relying on templates, thus circumventing the need for expert curators and enabling users to dynamically generate and modify extraction models to extract data from diverse and mutable classes of objects. It would be still further beneficial to provide methods for extracting data without relying on OCR techniques, in order to overcome the limitations of predefined character classes and poor image quality, and to reduce or remove the need for user correction of OCR mistakes.