Digital images having depicted therein a document such as a letter, a check, a bill, an invoice, etc. have conventionally been captured and processed using a scanner or multifunction peripheral coupled to a computer workstation such as a laptop or desktop computer. Methods and systems capable of performing such capture and processing are well known in the art and well adapted to the tasks for which they are employed.
However, in an era where day-to-day activities, computing, and business are increasingly performed using mobile devices, it would be greatly beneficial to provide analogous document capture and processing systems and methods for deployment and use on mobile platforms, such as smart phones, digital cameras, tablet computers, etc.
A major challenge in transitioning conventional document capture and processing techniques to mobile platforms is the limited processing power and image resolution achievable using hardware currently available in mobile devices. These limitations are significant because images captured using a mobile device typically have much lower resolution than images produced by a conventional scanner, and it is impossible or impractical to process such images using conventional techniques. As a result, conventional scanner-based processing algorithms typically perform poorly on digital images captured using a mobile device.
In addition, the limited processing power and memory available on mobile devices make conventional image processing algorithms employed for scanners prohibitively expensive in terms of computational cost. Attempting to run a conventional scanner-based image processing algorithm takes far too much time to be practical on modern mobile platforms.
A still further challenge is presented by the nature of mobile capture components (e.g. cameras on mobile phones, tablets, etc.). Where conventional scanners are capable of faithfully representing the physical document in a digital image, critically maintaining aspect ratio, dimensions, and shape of the physical document in the digital image, mobile capture components are frequently incapable of producing such results.
Specifically, images of documents captured by a camera present a new line of processing issues not encountered when dealing with images captured by a scanner. This is in part due to the inherent differences in the way the document image is acquired, as well as the way the devices are constructed. Some scanners use a transport mechanism that creates relative movement between the paper and a linear array of sensors. These sensors generate pixel values for the document as it moves by, and the sequence of captured pixel values forms an image. Accordingly, there is generally horizontal or vertical consistency up to the noise in the sensor itself, because the same sensor provides all the pixels in a given line.
In contrast, cameras have many more sensors in a nonlinear array, typically arranged in a rectangle. These individual sensors are independent of one another, and render image data that typically lacks horizontal or vertical consistency. In addition, cameras introduce a projective effect that is a function of the angle at which the picture is taken. For example, with a linear array like in a scanner, even if the transport of the paper is not perfectly orthogonal to the alignment of sensors and some skew is introduced, there is no projective effect like in a camera. Additionally, with camera capture, nonlinear distortions may be introduced because of the camera optics.
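The difference between scanner skew and the projective effect of a camera may be illustrated with a minimal sketch. The matrices, page dimensions, and helper names below are hypothetical values chosen purely for illustration and do not come from any particular device or from the presently described techniques. The sketch maps the four corners of a page through a skew (affine) transform and through a projective transform, then checks whether opposite edges of the page remain parallel.

```python
# Illustrative only: all matrices and names here are assumptions, not
# measurements from a real scanner or camera.

def apply_homography(H, point):
    """Map a 2-D point through a 3x3 matrix using homogeneous coordinates."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def edge_vector(p, q):
    """Vector pointing from corner p to corner q."""
    return (q[0] - p[0], q[1] - p[1])

def are_parallel(u, v, tol=1e-9):
    """Two vectors are parallel when their 2-D cross product is zero."""
    return abs(u[0] * v[1] - u[1] * v[0]) <= tol

def opposite_edges_parallel(corners):
    """Corners are ordered top-left, top-right, bottom-right, bottom-left."""
    top = edge_vector(corners[0], corners[1])
    bottom = edge_vector(corners[3], corners[2])
    left = edge_vector(corners[0], corners[3])
    right = edge_vector(corners[1], corners[2])
    return are_parallel(top, bottom) and are_parallel(left, right)

# A letter-size page, in inches (hypothetical example document).
PAGE = [(0.0, 0.0), (8.5, 0.0), (8.5, 11.0), (0.0, 11.0)]

# Skew such as a scanner transport might introduce: an affine shear.
# The bottom row of the matrix is (0, 0, 1), so there is no perspective.
SKEW = [[1.0, 0.05, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0]]

# Tilted camera: a nonzero entry in the bottom row makes the scale
# depend on position, which is the projective effect.
PROJECTIVE = [[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.02, 1.0]]

skewed = [apply_homography(SKEW, c) for c in PAGE]
projected = [apply_homography(PROJECTIVE, c) for c in PAGE]

print("skew keeps opposite edges parallel:", opposite_edges_parallel(skewed))
print("projection keeps opposite edges parallel:", opposite_edges_parallel(projected))
```

Under these assumed values, the skewed page remains a parallelogram, whereas the projected page becomes a trapezoid whose far edge is shorter than its near edge, so the aspect ratio and shape of the physical document are not preserved.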
Other major challenges unique to capturing image and/or video data utilizing a camera or array of cameras may include variable illumination conditions, for instance non-uniform lighting conditions that may generate shadows on objects depicted in images, presence of specular lights which may generate glare, etc. as would be understood by skilled artisans upon reading these disclosures.
In addition, utilizing cameras to capture image and/or video data introduces challenges with respect to distinguishing an object of interest from relatively complex backgrounds as compared to the typical background for a flat-bed scanner (which exhibits characteristics that are well-known and relatively immutable, for instance a single background texture and color for the scanner background). As a result, a cluttered background makes page segmentation difficult as compared to scenarios typically encountered using scanner-generated image data.
In view of the challenges presented above, it would be beneficial to provide an image capture and processing algorithm and applications thereof that compensate for and/or correct problems associated with image capture and processing using a mobile device, while maintaining a low computational cost via efficient processing methods.
Moreover, mobile devices are emerging as a major interface for engaging a wide variety of interactive processes relying on data often depicted on financial documents. A primary advantage of the mobile interface is that documents can be conveniently and securely imaged utilizing a mobile device. For example, the banking industry has recently witnessed a mobile revolution, with much attention gathering around new services and functionalities enabled by mobile technology, such as mobile check deposit and mobile bill payment. These applications leverage the persistent connectivity of mobile devices to provide customers and service providers unprecedented accessibility and quality of service, consequently improving resolution and accuracy of financial transaction record management, and improving security of financial transactions due to known security advantages of mobile devices.
To date, these applications have been limited in scope to simple transactions leveraging conventions and standards unique to very narrow aspects of the financial services industry. Most notably, the financial industry has been able to leverage conventions such as the universal formatting of account and routing numbers and the near-universal presence of magnetic ink character recognition (MICR) characters on documents utilized in financial transactions, such as checks, remittance slips, etc.
As described in U.S. Pat. Nos. 7,778,457; 7,787,695; 7,949,167; 7,953,268; 7,978,900; 8,000,514; 8,326,015; 8,379,914; 8,577,118; and/or 8,582,862 to Nepomniachtchi, et al., conventional mobile financial services involve mobile image processing and mobile check deposit approaches that rely heavily on MICR characters. The MICR characters are used to conduct the image processing operations that are necessary to ensure adequate image quality for subsequent financial processing, such as ensuring the image is the proper size and/or orientation. The MICR characters are also used to conduct the financial processing aspects, such as routing payments/deposits to the account corresponding to the number depicted on the imaged check or remittance slip.
Reliance on such conventional standards and industry-specific practices allows high fidelity and high performance in the very limited scope to which those standards and practices apply, but unfortunately limits the applicability of the underlying technology to only those narrow fields. It would be of great advantage to remove the reliance on such standard information and enable broader application of mobile technology to modern image capture, processing, and business workflow integration. For example, while identity documents universally depict identifying information that is useful in a wide variety of applications, including but certainly not limited to financial transactions, it is relatively uncommon for various types of ID to conform to a universal standard for presenting this information (e.g. presenting the information in a manner analogous to the MICR characters of a check). Indeed, even the same type of ID, such as a driver's license, may depict different information, or depict similar information in a very different format, manner, and/or layout depending on the authority that issued the ID. Consider, for example, the disparity between driver's licenses issued by various states, or between employee IDs according to employer, school IDs according to district, military IDs according to branch, insurance cards according to provider, etc.
Accordingly, it would be of great benefit to provide systems, techniques, and computer program products capable of leveraging mobile technology to utilize identity information depicted on IDs and integrate the imaging, capture, and processing of IDs with business workflows.
The presently described systems and techniques accordingly provide uniquely advantageous features with application beyond the narrow scope of financial transactions. The inventive concepts disclosed below also remove the limitations associated with relying on universal standards such as MICR characters that are inapplicable to IDs.