The following relates to the information arts. It especially relates to methods and apparatuses for extracting textual personal information from business cards photographed using the built-in camera of a cellular telephone, and will be described with particular reference thereto. The following relates more generally to extraction of textual personal information from images acquired by portable imagers such as digital cameras, handheld scanners, and so forth, and to acquiring personal information by using a portable imager in conjunction with text extraction techniques, and so forth.
A cellular telephone with a built-in digital camera is a common device carried by business and professional persons. While the digital camera component has a wide range of uses, one application is the rapid capture of business card images. When meeting someone for the first time, or when meeting someone whose personal information has changed due to a job transfer, promotion, or so forth, it is convenient for the business or professional person to use the built-in camera of his or her cellular telephone to photograph the business card of the newly met person, thus creating a digital image of the business card. In effect, the built-in digital camera of the cellular telephone is used as a kind of portable instant document scanner. However, the photograph is in an image format, such that the textual content is not immediately accessible for input to a text-based personal contacts list or other text-based database.
Optical character recognition (OCR) software extracts textual information from images. Thus, a desirable combination is to apply OCR to extract textual information from the business card image acquired using the built-in digital camera of the cellular telephone. Once text is extracted, each text line can optionally be tagged as to data type (such as tagging text lines as “personal name”, “job title”, “entity affiliation”, or so forth), and optionally incorporated into a contacts database. In practice, however, it has been found to be difficult to effectively apply OCR to business card images acquired using digital cameras.
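As a minimal sketch of the optional tagging and database-incorporation steps just described, the following Python fragment represents each extracted text line together with an optional data-type tag and collects the tagged lines into a contact record. The names `TaggedLine` and `to_contact`, the sample strings, and the choice of a plain dictionary as the "contacts database" entry are all hypothetical and serve only as illustration; how a tag is assigned to a line is deliberately not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TaggedLine:
    """One text line extracted by OCR, with an optional data-type tag."""
    text: str
    tag: Optional[str] = None  # None means the line was left untagged


def to_contact(lines: List[TaggedLine]) -> Dict[str, str]:
    """Collect tagged lines into a contact record keyed by data type.

    Untagged lines are skipped; if a tag occurs more than once,
    the first occurrence is kept.
    """
    contact: Dict[str, str] = {}
    for line in lines:
        if line.tag is not None:
            contact.setdefault(line.tag, line.text)
    return contact


# Hypothetical OCR output for a single business card.
lines = [
    TaggedLine("Jane Doe", "personal name"),
    TaggedLine("Senior Engineer", "job title"),
    TaggedLine("Example Corp.", "entity affiliation"),
    TaggedLine("~~ l0go ~~"),  # unrecognized decoration, left untagged
]
contact = to_contact(lines)
```

Keeping the tag optional mirrors the text above, in which both tagging and incorporation into a contacts database are optional steps.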
One problem that arises is that the resolution of the built-in digital cameras of cellular telephones is typically low. The built-in cameras of existing cellular telephones sometimes have a so-called VGA resolution corresponding to the coarse pixel density of a typical display monitor. Some existing cellular telephones have built-in cameras with higher resolution, such as around 1-2 megapixels or more. It is anticipated that the built-in camera resolution will increase as cost-per-pixel decreases. However, even with improved pixel resolution, image quality is likely to be limited by poor optics. Higher manufacturing costs of the physical optical system as compared with electronics have tended to cause manufacturers to use optics of limited quality. Lens quality is improving at a substantially slower rate than resolution, and so this aspect of typical cellphone cameras is less likely to improve substantially in the near future. Further, the trend toward more compact or thinner cellular telephones calls for miniaturized optics, which are difficult to manufacture with high optical quality. Common adverse effects of poor lenses include image noise, aberrations, artifacts, and blurring. OCR tends to produce more errors and higher uncertainty under these conditions.
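As a rough illustration of the pixel-resolution point, the following Python sketch estimates the best-case sampling density, in pixels per inch, when a business card exactly fills the camera frame. It assumes a VGA sensor of 640×480 pixels, a 1600×1200 sensor as an example of a roughly 2-megapixel camera, and a card measuring 3.5 × 2 inches; the card size and the function name `effective_ppi` are assumptions made for illustration and are not taken from the description above. The result is an upper bound only, since it ignores optics, blur, and perspective.

```python
def effective_ppi(sensor_px, card_in):
    """Best-case pixels per inch when the whole card just fits in the frame.

    sensor_px: (width, height) of the sensor in pixels.
    card_in:   (width, height) of the card in inches (assumed size).
    The tighter of the two axes limits the achievable sampling density.
    """
    sensor_w, sensor_h = sensor_px
    card_w, card_h = card_in
    return min(sensor_w / card_w, sensor_h / card_h)


CARD_INCHES = (3.5, 2.0)  # assumed card size, for illustration only

vga_ppi = effective_ppi((640, 480), CARD_INCHES)         # about 183 ppi
two_mp_ppi = effective_ppi((1600, 1200), CARD_INCHES)    # about 457 ppi

print(f"VGA: {vga_ppi:.0f} ppi, 2 MP: {two_mp_ppi:.0f} ppi")
```

Under these assumptions a VGA image samples the card at well under 200 pixels per inch even in the ideal case, and in practice the card rarely fills the frame, so the usable density is lower still.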
Additionally, the cellular telephone is held by hand, focused on the small business card, during imaging of the business card. Accordingly, unsteadiness of the camera during the photographing can produce blurring, artifacts, or other image degradation. Image acquisition is typically done in uncontrolled conditions, such as variable lighting, strong shadows, non-expert usage, variable distance to objective, variable three-dimensional viewing angle, and so forth. The acquired document image orientation often has substantial scale, skew, and/or rotation components, and may have substantial variation in illumination. In summary, the physical characteristics of the camera, the non-ideal imaging environment, and the typically limited photographic skill of the operator combine such that a built-in digital camera of a cellular telephone typically acquires business card images of relatively low quality with substantial image defects, which tends to lead to substantial errors and uncertainty in the OCR.
The textual content of the business card also does not lend itself to accurate OCR. In typical OCR processing, objects are recognized and identified as letters, numerals, punctuation, or other characters based on pattern matching, but with some uncertainty because the rendering of the characters is less than optimal, because the text font may vary, and so forth. To counter these difficulties, OCR processing sometimes resolves uncertainties by comparing uncertain words or phrases against an electronic dictionary or grammar checker. These approaches are relatively ineffective when applied to OCR conversion of the textual content of business cards, because the content (such as personal names, job titles, affiliations, addresses, and so forth) is typically not found in electronic dictionaries and typically does not follow conventional grammar rules. Thus, the nature of the textual content tends to lead to unresolvable errors and uncertainty in the OCR.
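The limitation of dictionary-based correction can be illustrated with a minimal Python sketch. It uses the standard-library `difflib.get_close_matches` function as a stand-in for the dictionary comparison; the tiny word list, the similarity cutoff of 0.8, the function name `dictionary_correct`, and the sample tokens are all hypothetical and are not taken from any particular OCR engine.

```python
import difflib

# Hypothetical, very small stand-in for an electronic dictionary.
DICTIONARY = ["engineer", "manager", "director", "street", "company", "research"]


def dictionary_correct(token, dictionary=DICTIONARY, cutoff=0.8):
    """Return the closest dictionary word, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(token.lower(), dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# A misrecognized common word has a near neighbor in the dictionary.
common_word = dictionary_correct("eng1neer")

# A misrecognized surname has no near neighbor, so the error cannot be resolved.
surname = dictionary_correct("Kowa1czyk")

print(common_word, surname)
```

In this sketch the common word is mapped back to "engineer", while the surname yields no candidate at all, which is the situation described above for personal names, affiliations, and similar business card content.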