Scanners for capturing images of paper documents, photographs and/or other objects, including for example debit and credit cards, and converting the captured images into electronic files are well known in the art. Although such scanners have a variety of general and specific uses, they are most commonly used to scan documents in order to consolidate records, create paperless work environments and/or facilitate the electronic transmission of information.
Scanners vary in design and sophistication, but generally all scanners comprise an elongate light source and a grid or series of sensors for receiving light that is reflected off the surface of the object being scanned. The data from the sensors is collected by a processor operating under control of scanner software and stored in memory as a digital image file, typically in JPEG, BMP or GIF format. If the scanner is coupled to a computer or to a local or wide area network, the digital image file is typically made available to the computer and/or to network devices for storage and/or further processing.
In some situations, multiple objects are placed on the scanner bed and then scanned, resulting in a single image that includes the multiple objects. For example, a number of photographs may be placed on the scanner bed and then scanned to generate a single image including all of the photographs. As will be appreciated, when an image comprising multiple objects is to be further processed, it may be desired to process the image so that the individual objects in the image can be detected and separated. Not surprisingly, many techniques for detecting objects in images have been considered.
For example, U.S. Pat. No. 6,335,985 to Sambonsugi et al. discloses a method in which three rectangles are set to surround three temporally continuous frames. Difference images are obtained on the basis of the inter-frame differences between the current frame and a first reference frame, and between the current frame and a second reference frame. Background regions are respectively determined for polygons, and the remaining regions are selected as object region candidates. By obtaining the intersection between the object region candidates, an object region in the current frame can be extracted.
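By way of illustration only, the general idea of intersecting two inter-frame difference images can be sketched as follows. This is a simplified sketch and not the method of the cited patent: it omits the setting of rectangles and the determination of background regions, and it assumes grayscale frames stored as nested lists of integers. The function names and the threshold value are hypothetical.

```python
def changed_mask(frame_a, frame_b, threshold):
    """Mark pixels whose absolute difference between two frames exceeds threshold."""
    return [
        [abs(a - b) > threshold for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def intersect_masks(mask_1, mask_2):
    """Keep only the pixels that are marked in both masks."""
    return [
        [p and q for p, q in zip(row_1, row_2)]
        for row_1, row_2 in zip(mask_1, mask_2)
    ]


def extract_object_region(first_ref, current, second_ref, threshold=20):
    """Intersect the candidates found against each of the two reference frames."""
    candidates_1 = changed_mask(current, first_ref, threshold)
    candidates_2 = changed_mask(current, second_ref, threshold)
    return intersect_masks(candidates_1, candidates_2)


# A bright object moving left to right across a dark, one-row scene.
first_ref = [[100, 0, 0, 0, 0]]
current = [[0, 0, 100, 0, 0]]
second_ref = [[0, 0, 0, 0, 100]]
region = extract_object_region(first_ref, current, second_ref)
```

Each difference image alone also marks the position the object occupies in the reference frame; taking the intersection leaves only the position the object occupies in the current frame.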
U.S. Pat. No. 6,898,316 to Zhou discloses a method for detecting an image area in a digital image. During the method, a first image region indicative of a background area and a second image region indicative of a foreground area are identified in the image. Gradient values are computed using the pixel values of the digital image. A list of strokes based on the gradient values is defined and the list of strokes is merged. A list of corners is defined using the list of strokes and an image area rectangle delimiting the image area is defined using the list of corners and the list of strokes. The image area rectangle can be used to define a bounding box for extracting the foreground area from the digital image.
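As a rough illustration of how gradient values can lead to a bounding box, the following sketch computes a central-difference gradient magnitude and takes the extent of the strong-gradient pixels. It is a simplified stand-in and not the method of the cited patent: the stroke and corner lists are omitted entirely, the image is assumed to be a grayscale nested list, and the function names and threshold are hypothetical.

```python
def gradient_magnitude(image):
    """Central-difference gradient magnitude, with indices clamped at the borders."""
    height, width = len(image), len(image[0])
    grad = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            grad[y][x] = (gx * gx + gy * gy) ** 0.5
    return grad


def bounding_box(grad, threshold):
    """Return (x_min, y_min, x_max, y_max) of pixels above threshold, or None."""
    points = [
        (x, y)
        for y, row in enumerate(grad)
        for x, value in enumerate(row)
        if value > threshold
    ]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# A dark 3x3 foreground area (rows 2-4, columns 2-4) on a white 7x7 background.
image = [[255] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        image[y][x] = 0

box = bounding_box(gradient_magnitude(image), threshold=100)
```

Because a central difference responds on both sides of an edge, the resulting box encloses the foreground area with a one-pixel margin on each side.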
U.S. Patent Application Publication No. 2004/0146198 to Herley discloses an object detection and extraction system and method for processing digital image data. Objects contained within a single image are segregated allowing those objects to be considered as individual objects. The object detection and extraction method takes an image containing one or more objects of known shape (such as rectangular objects) and finds the number of objects along with their size, orientation and position. In particular, the object detection and extraction method classifies each pixel in an image containing one or more objects to obtain pixel classification data. An image function is defined to process the pixel classification data and the image is divided into sub-images based on disparities or gaps in the image function. Each of the sub-images is processed to determine a size and an orientation for each of the objects.
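The classify-then-split idea described above can be sketched as follows. This is a minimal illustration and not the implementation of the cited publication: it assumes a grayscale image with a known uniform background value, uses a per-column count of object pixels as the image function, and splits along one axis only. All function names and the tolerance value are hypothetical.

```python
def classify_pixels(image, background, tolerance=10):
    """Label a pixel as object (True) when it differs from the background value."""
    return [[abs(p - background) > tolerance for p in row] for row in image]


def column_profile(mask):
    """Image function: number of object pixels in each column."""
    return [sum(column) for column in zip(*mask)]


def split_on_gaps(profile):
    """Return (start, end) column ranges, end exclusive, separated by empty columns."""
    runs = []
    start = None
    for index, value in enumerate(profile):
        if value > 0 and start is None:
            start = index
        elif value == 0 and start is not None:
            runs.append((start, index))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


# Two dark objects, in columns 1-2 and 5-6, on a white scanner background.
image = [[255, 0, 0, 255, 255, 0, 0, 255] for _ in range(3)]
mask = classify_pixels(image, background=255)
profile = column_profile(mask)
sub_images = split_on_gaps(profile)
```

Each returned range delimits a sub-image containing one object. Applying the same split to the rows of each sub-image, and repeating, would separate objects that are also offset vertically.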
U.S. Patent Application Publication No. 2004/0181749 to Chellapilla et al. discloses a computer-implemented method and apparatus for populating an electronic form from an electronic image. The size, orientation and position of an object within the electronic image are initially identified together with information elements from pixels within the image that correspond to the object. Fields of the electronic form are displayed to a user along with the identified information elements through a graphical user interface. The information elements are parsed into tagged groups of different information types. At least some of the fields of the electronic form are populated with the tagged groups to produce a populated form. The populated fields can be edited through the graphical user interface.
U.S. Patent Application Publication No. 2004/0258313 to Jones et al. discloses a method for detecting a specific object in an image. During the method, the orientation of an arbitrary object with respect to an image plane is determined and one of a plurality of orientation and object specific classifiers is selected according to the determined orientation. The arbitrary object is classified as a specific object by the selected orientation and object specific classifier.
U.S. Patent Application Publication No. 2005/0105766 to Fesquet et al. discloses a method of detecting single postal items and multiple overlapping postal items in a postal sorting installation. During the method, images representing postal items viewed from the front are analyzed and an outline-extracting process is applied to each image in order to recognize items having an outline of substantially constant height.
U.S. Patent Application Publication No. 2005/0180632 to Aradhye et al. discloses an apparatus and a concomitant method for rectification and recognition of symbols to correct for the effects of perspective distortion, rotation and/or scale in images of three-dimensional scenes. The method locates a reference region lying in a common plane with a symbol to be recognized. The reference region represents an image of a planar object having assumed (e.g. known or standard) geometry and dimensions. At least four easily detectable correspondence points within that geometry are located. An image of the common plane is then rectified in three dimensions in accordance with the assumed dimensions of the reference region in order to produce a transformed image of the symbol.
U.S. Patent Application Publication No. 2005/0180635 to Trifonov et al. discloses a method in which a boundary in an image is determined by first identifying a search region within the image. Image gradients in the search region are determined together with multiple color regions within the search region. An active contour representing the boundary is created based on the image gradients and the multiple color regions.
Other methods of detecting objects in images are disclosed in the following non-patent literature:
“Rectangle Detection Based On A Windowed Hough Transform” authored by C. Jung et al. (Proceedings of the XVII Brazilian Symposium on Computer Graphics and Image Processing; 1530-1834; 2004);
“Automatic Particle Detection Through Efficient Hough Transforms” authored by Y. Zhu et al. (IEEE Trans. on Medical Imaging; 22(9): 1053-1062; 2003);
“Detecting Circular And Rectangular Particles Based On Geometric Feature Detection In Electron Micrographs” authored by Z. Yu et al. (Journal of Structural Biology; 145, 168-180; 2004);
“Recursive Method To Extract Rectangular Objects From Scans” authored by C. Herley (ICIP, vol. 3 no. pp. III-989-92, 14-17; 2003);
“Recursive Method To Detect And Segment Multiple Rectangular Objects In Scanned Images” authored by C. Herley (Technical report MSR-TR-2004-01, Microsoft Research; 2004); and
“Efficient Inscribing Of Noisy Rectangular Objects In Scanned Images” authored by C. Herley (ICIP, Vol. 4, 2399-2402, 24-27; 2004).
Although the references discussed above disclose various methods and systems for detecting objects in images, improvements are desired. It is therefore an object of the present invention to provide a novel method and apparatus for detecting objects in an image.