Paper documents are often hole-punched to facilitate storage in binders and/or folders. Punch holes are usually round, with multiple holes normally being punched in a relatively straight line, e.g., appearing as a line of holes in a sheet of paper. Hole punching is often performed at the side or top of a page. In some cases, over time, a page may be punched multiple times, e.g., for storage in different binders and/or folders. Thus, a single document, e.g., sheet of paper, may include multiple sets of punch holes, each set having been made to facilitate storage in a binder or folder.
For purposes of electronic storage, form processing, and/or for electronic document transmission, paper documents are often optically scanned and then processed. Scanning of documents often results in a set of binary, e.g., black or white, pixel values, representing the scanned image. In most cases where printed documents are scanned, the text will be black or interpreted as black with the background being white. For example, a black pixel value may be represented by a “1” pixel value and a white pixel value represented by a “0” pixel value.
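The binary representation described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the scanner first produces 8-bit grayscale values and that a fixed threshold of 128 separates black from white; the function name `binarize` and the sample values are hypothetical and are not taken from any particular scanner or library.

```python
def binarize(gray_rows, threshold=128):
    """Map 8-bit grayscale values to 1 (black) or 0 (white).

    The threshold of 128 is an assumed, illustrative value.
    """
    return [[1 if value < threshold else 0 for value in row]
            for row in gray_rows]


# Hypothetical 3x4 grayscale patch: low values are dark, high values are light.
scan = [
    [255, 250, 30, 245],
    [240, 20, 10, 235],
    [255, 25, 15, 250],
]

binary = binarize(scan)
for row in binary:
    print("".join(str(pixel) for pixel in row))
```

Each printed row is a string of "1" and "0" characters, with "1" marking the pixels interpreted as black.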
As part of the scanning process one or more sheets of paper may be scanned. Unfortunately, as a result of the scanning process, previous copying, and/or the original physical hole-punching of the document being scanned, the marks corresponding to punch holes in the scanned image may not appear as perfect circles. For example, physical punch holes which were originally round may appear in the scanned image as ovals due to skewing and/or non-uniform scaling in the horizontal and vertical dimensions during scanning, or may not be perfectly round for other reasons, e.g., one or more holes were not punched through completely.
Punch holes will often appear as dark areas on a scanned image which has a white background, such as when white paper with black text is scanned. Since the punch hole may have been incomplete or the scanning may not detect the hole as being all black, some portions of the area corresponding to a punch hole may appear white in the scanned image. It should be appreciated that scanned documents often include text. One problem that occurs when attempting to identify punch holes in some images is distinguishing between punch holes and text characters having round or oval features. For example, depending on the method being used to detect punch holes, it may be difficult to distinguish between a capital “O” and a mark on the scanned image resulting from the presence of a punch hole.
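The difficulty can be made concrete with a small sketch. It is an illustration only, not a detection method taken from this document: a synthetic filled disc stands in for a punch-hole mark and a synthetic ring stands in for a capital “O”. All names (`make_round_mark`, `bounding_box`, `aspect_ratio`, `fill_ratio`) and all sizes are hypothetical. A test based only on the shape of the bounding box cannot separate the two marks, while the fraction of black pixels inside the box can, at least in this idealized case.

```python
def make_round_mark(size, outer, inner=0.0):
    """Return a size x size binary image (1 = black) of a disc or ring.

    inner == 0 gives a filled disc; inner > 0 gives a ring.
    """
    center = (size - 1) / 2
    image = []
    for y in range(size):
        row = []
        for x in range(size):
            dist_sq = (x - center) ** 2 + (y - center) ** 2
            row.append(1 if inner ** 2 <= dist_sq <= outer ** 2 else 0)
        image.append(row)
    return image


def bounding_box(image):
    """Return (x_min, y_min, x_max, y_max) of the black pixels."""
    ys = [y for y, row in enumerate(image) if any(row)]
    xs = [x for x in range(len(image[0])) if any(row[x] for row in image)]
    return min(xs), min(ys), max(xs), max(ys)


def aspect_ratio(image):
    """Width divided by height of the bounding box."""
    x0, y0, x1, y1 = bounding_box(image)
    return (x1 - x0 + 1) / (y1 - y0 + 1)


def fill_ratio(image):
    """Fraction of bounding-box pixels that are black."""
    x0, y0, x1, y1 = bounding_box(image)
    black = sum(sum(row) for row in image)
    return black / ((x1 - x0 + 1) * (y1 - y0 + 1))


punch_hole = make_round_mark(21, outer=8)          # filled disc
letter_o = make_round_mark(21, outer=8, inner=6)   # ring, like a capital "O"

# Both marks have the same square bounding box ...
print(aspect_ratio(punch_hole), aspect_ratio(letter_o))
# ... but the disc fills far more of that box than the ring does.
print(fill_ratio(punch_hole) > fill_ratio(letter_o))
```

Even this simple measure is fragile in practice: an incompletely punched hole, or one that the scanner does not render as entirely black, contains white pixels and therefore has a lower fill ratio, moving it closer to that of a ring-shaped character.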
While a stack of papers may have holes punched in the same location, a set of pages scanned at the same time may include sheets that were hole-punched at different times, were put in the scanner upside down, and/or were slightly mis-fed by the scanner, resulting in punch holes appearing at different locations from one image, e.g., scanned page, to the next image, e.g., next scanned page. Thus, it should be appreciated that it is desirable to be able to identify punch holes without having to rely on the occurrence of holes in the same location in multiple scanned pages.
The detection, e.g., identification, of punch holes in a generally reliable manner is important if subsequent image processing is to occur which takes into consideration the presence of one or more punch holes.
In view of the above discussion, it should be appreciated that there is a need for methods and apparatus for identifying punch holes in a scanned image and/or for performing image processing operations on an image based on information regarding punch holes detected in the scanned image.