This invention relates to an image processing method and, more particularly, to a method of identifying a bitmap image.
In the past, as a part of document identification and character recognition processing, the following schemes have been adopted for specifying a position of a character frame or the like on an image:
(1) a first scheme using a page mark or a reference mark as a base (datum);
(2) a second scheme using an edge of a document as a base, provided that a scanner has a function for detecting such an edge against a black background of the present document; and
(3) a third scheme detecting black character frames individually and matching these frames with predefined frames.
However, the first scheme (1) requires page marks or reference marks to be provided on a document and, thus, has limitations such as narrowing the area available to a user. This scheme also has the problem that it cannot handle a document without page marks or reference marks thereon.
The second scheme (2) has a problem in that it cannot be applied in the absence of an expensive scanner having the above function dedicated to OCR use.
The third scheme (3) requires extraction of not only horizontal line segments but also vertical line segments as a feature for detecting character frames and, thus, suffers from degraded processing speed.
Also, this scheme requires, as preprocessing, skew correction of the image itself to remove any skew, which in turn further degrades processing speed. The problem is significantly worse in a scheme that uses the content of a document for identification purposes, since its analytical logic tends to become complex where there is any skew or positional deviation.
It is, therefore, an object of this invention to solve the problems by enabling designation of a character frame and recognition of a character even where a document does not have any page mark or reference mark and a scanner does not have a function for detecting an edge of the present document.
It is another object of this invention to enable identification processing of a bitmap image in an accelerated manner by comparing bitmap images on the basis of a circumscribed rectangle, which is formed solely from horizontal line segments that are recognizable at high speed.
It is another object of this invention to enable identification processing of a bitmap image in an accelerated manner by mapping an image to an ideal image without performing skew correction of the image itself.
It is another object of this invention to simplify the logic involved by treating the four corners of a circumscribed rectangle as virtual page marks such that existing logic designed to detect a character frame on the basis of a conventional page mark may be reused for detection of the virtual page marks.
It is another object of this invention to reduce the burden on an operator who is to create a document definition set by making it possible to add definition information of a circumscribed rectangle or horizontal line segments to an existing document definition set such that information of the conventional document definition set may be utilized as it is.
It is another object of the present invention to provide a means for solving the problems by identifying a bitmap image comprising horizontal line segments of character frames or grid lines on a document (as done in an OCR when it detects a document without page marks but with black character frames and recognizes characters). The horizontal line segments are extracted as a feature of the present document, and a circumscribed rectangle is formed within an area generated by the horizontal line segments such that it is used as information for identifying an estimated basis of a character frame's position and a class of the present document. Applying this to an OCR makes it possible to identify even a document without page marks or reference marks thereon. Also, by comparing the extracted horizontal line segments themselves with horizontal line segments of pre-registered document definition sets to determine the similarity between them, it becomes possible to identify documents more precisely.
In accordance with one aspect of this invention, there is provided a method of identifying a class of a bitmap image including a plurality of horizontal line segment images, the method being carried out on a bitmap image identifying apparatus that has a bitmap image definition set containing information for identifying a plurality of bitmap images, comprising the steps of:
(a) scanning said bitmap image to extract information for specifying a plurality of horizontal line segments;
(b) extracting information for specifying a circumscribed rectangle based on said information for specifying a plurality of horizontal line segments; and
(c) checking whether or not said extracted information for specifying a circumscribed rectangle is similar to pre-registered information for specifying a circumscribed rectangle in said bitmap image definition set.
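Steps (a) to (c) above can be illustrated by the following minimal Python sketch. It is not the implementation of the preferred embodiment. It assumes that the bitmap is given as rows of 0/1 pixels, that a horizontal line segment is a run of black pixels of at least `min_length` pixels within one row, and that similarity is judged by comparing the width and height of the rectangles within a pixel tolerance. The function names, `min_length`, and `tolerance` are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[int, int, int]    # (row, x_start, x_end), both ends inclusive
Rect = Tuple[int, int, int, int]  # (left, top, right, bottom)


def extract_horizontal_segments(bitmap: List[List[int]],
                                min_length: int = 4) -> List[Segment]:
    """Step (a): collect runs of black pixels (1) at least min_length long."""
    segments = []
    for y, row in enumerate(bitmap):
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                while x < len(row) and row[x]:
                    x += 1
                if x - start >= min_length:
                    segments.append((y, start, x - 1))
            else:
                x += 1
    return segments


def circumscribed_rectangle(segments: List[Segment]) -> Rect:
    """Step (b): smallest axis-aligned rectangle enclosing all segments.

    Assumes at least one segment was extracted.
    """
    left = min(s[1] for s in segments)
    right = max(s[2] for s in segments)
    top = min(s[0] for s in segments)
    bottom = max(s[0] for s in segments)
    return (left, top, right, bottom)


def is_similar(rect: Rect, registered: Rect, tolerance: int = 2) -> bool:
    """Step (c): compare width and height only, so position is ignored."""
    width, height = rect[2] - rect[0], rect[3] - rect[1]
    reg_width = registered[2] - registered[0]
    reg_height = registered[3] - registered[1]
    return (abs(width - reg_width) <= tolerance
            and abs(height - reg_height) <= tolerance)
```

Comparing only the dimensions of the rectangles in this sketch means that a positional deviation of the document on the scanner does not affect the check. Only horizontal runs are examined, so no vertical line extraction is needed.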
Note here that, in the claims of the present specification, the expression “bitmap image definition set” represents a concept corresponding with “document definition set” in a preferred embodiment of this invention, but it also covers any item that contains information for identifying classes of various bitmap images, including but not limited to documents. Also, in the claims of the present specification, the expression “horizontal line segment” means a line segment that is substantially parallel to a scanning direction of a bitmap image. Further, in the claims of the present specification, the expression “information for specifying horizontal line segments” represents a concept covering not only coordinates of two points that define a line segment, but also vector information or the like.
In accordance with another aspect of this invention, there is provided a method of identifying a class of a bitmap image including a plurality of horizontal line segment images, the method being carried out on a bitmap image identifying apparatus that has a bitmap image definition set containing information for identifying a plurality of bitmap images, comprising the steps of:
(a) scanning said bitmap image to extract information for specifying a plurality of horizontal line segments; and
(b) checking whether or not said extracted information for specifying a plurality of horizontal line segments is similar to pre-registered information for specifying horizontal line segments in said bitmap image definition set.
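The direct comparison of horizontal line segments in step (b) above may be sketched as follows. This is one possible similarity measure chosen for illustration, not the measure of the preferred embodiment. It assumes that each segment is a non-skewed triple `(row, x_start, x_end)`, that both segment lists are non-empty, and that two segments match when all three values agree within `tolerance` pixels after both sets are shifted to a common origin. The names `normalize`, `segment_similarity`, and `tolerance` are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[int, int, int]  # (row, x_start, x_end)


def normalize(segments: List[Segment]) -> List[Segment]:
    """Shift segments so the top-most row and left-most x become zero."""
    origin_x = min(x0 for _, x0, _ in segments)
    origin_y = min(y for y, _, _ in segments)
    return sorted((y - origin_y, x0 - origin_x, x1 - origin_x)
                  for y, x0, x1 in segments)


def segment_similarity(extracted: List[Segment],
                       registered: List[Segment],
                       tolerance: int = 2) -> float:
    """Return the fraction of segments that can be paired one-to-one."""
    unused = normalize(extracted)
    reference = normalize(registered)
    total = max(len(unused), len(reference))
    matched = 0
    for ref in reference:
        for candidate in unused:
            if all(abs(a - b) <= tolerance for a, b in zip(candidate, ref)):
                unused.remove(candidate)  # each extracted segment pairs once
                matched += 1
                break
    return matched / total
```

A score of 1.0 means every registered segment found a counterpart. A threshold below 1.0 could be chosen to tolerate a missed or spurious segment, at the cost of a less strict identification.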
In accordance with another aspect of this invention, there is provided a method of obtaining information for identifying a class of a bitmap image including a plurality of horizontal line segment images, comprising the steps of:
(a) scanning said bitmap image to extract information for specifying a plurality of horizontal line segments;
(b) extracting information for specifying a rectangle, which contains at least a portion of two of said plurality of horizontal line segments as its horizontal side, based on said information for specifying a plurality of horizontal line segments;
(c) transforming said information for specifying a rectangle based on skew information calculated from said information for specifying a plurality of horizontal line segments; and
(d) storing said transformed information for specifying a rectangle.
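The transformation in step (c) above can be illustrated by the following sketch, which maps rectangle coordinates to an ideal, non-skewed coordinate system without rotating the image itself. It assumes that each horizontal line segment is given by two end points ordered from left to right, that the skew is estimated as the direction of the sum of the segment vectors (so that longer segments carry more weight), and that the rotation is taken about the coordinate origin. These choices and the names `estimate_skew` and `deskew_points` are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]  # (left end point, right end point)


def estimate_skew(segments: List[Segment]) -> float:
    """Skew angle in radians, taken from the summed segment direction."""
    total_dx = sum(x1 - x0 for (x0, _), (x1, _) in segments)
    total_dy = sum(y1 - y0 for (_, y0), (_, y1) in segments)
    return math.atan2(total_dy, total_dx)


def deskew_points(points: List[Point], angle: float) -> List[Point]:
    """Rotate points by -angle about the origin to cancel the skew."""
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    return [(x * cos_a - y * sin_a, x * sin_a + y * cos_a)
            for x, y in points]
```

Only the few points that specify the rectangle are transformed, which is far less work than applying skew correction to every pixel of the bitmap image.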
Note here that, in the claims of the present specification, the expression “rectangle” represents a concept corresponding with “circumscribed rectangle” in a preferred embodiment of this invention, but it also covers any rectangle, including but not limited to a circumscribed rectangle, which is formed based on particular horizontal line segments.
In accordance with another aspect of this invention, there is provided a method of recognizing a position of a character frame included in a bitmap image, comprising the steps of:
(a) scanning said bitmap image to extract information for specifying a plurality of horizontal line segments;
(b) extracting information for specifying a circumscribed rectangle based on said information for specifying a plurality of horizontal line segments;
(c) extracting information for specifying a character frame included in said bitmap image; and
(d) storing said extracted information for specifying a character frame as position information on the basis of one of the vertexes of said circumscribed rectangle.
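Step (d) above, in which a vertex of the circumscribed rectangle plays the role of a virtual page mark, may be sketched as follows. The sketch assumes that rectangles and character frames are given as `(left, top, right, bottom)` in non-skewed coordinates and that the upper-left vertex is chosen as the base. The helper names `to_relative` and `to_absolute` are hypothetical.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def to_relative(frame: Box, rect: Box) -> Box:
    """Express a character frame relative to the rectangle's upper-left vertex."""
    origin_x, origin_y = rect[0], rect[1]
    return (frame[0] - origin_x, frame[1] - origin_y,
            frame[2] - origin_x, frame[3] - origin_y)


def to_absolute(relative_frame: Box, rect: Box) -> Box:
    """Locate a stored frame in a new image from that image's rectangle."""
    origin_x, origin_y = rect[0], rect[1]
    return (relative_frame[0] + origin_x, relative_frame[1] + origin_y,
            relative_frame[2] + origin_x, relative_frame[3] + origin_y)
```

Because the stored position depends only on the vertex, a document of the same class that is scanned with a positional deviation yields the same relative frame once its own circumscribed rectangle has been found.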
In accordance with another aspect of this invention, there is provided a bitmap image identifying apparatus for identifying a class of a bitmap image including a plurality of horizontal line segment images, comprising:
(a) image input means for containing said bitmap image including a plurality of horizontal line segments;
(b) a bitmap image definition set including information for identifying classes of a plurality of bitmap images; and
(c) image analysis means (c1) for scanning said bitmap image to extract information for specifying a plurality of horizontal line segments, (c2) for extracting information for specifying a circumscribed rectangle based on said information for specifying a plurality of horizontal line segments, and (c3) for checking whether or not said extracted information for specifying a circumscribed rectangle is similar to pre-registered information for specifying a circumscribed rectangle in said bitmap image definition set.
In accordance with another aspect of this invention, there is provided a bitmap image identifying apparatus for identifying a class of a bitmap image including a plurality of horizontal line segment images, comprising:
(a) image input means for containing said bitmap image including a plurality of horizontal line segments;
(b) a bitmap image definition set including information for identifying classes of a plurality of bitmap images; and
(c) image analysis means (c1) for scanning said bitmap image to extract information for specifying a plurality of horizontal line segments, and (c2) for checking whether or not said extracted information for specifying a plurality of horizontal line segments is similar to pre-registered information for specifying horizontal line segments in said bitmap image definition set.
In accordance with another aspect of this invention, there is provided a bitmap image processing apparatus for obtaining information for identifying a class of a bitmap image including a plurality of horizontal line segment images, comprising:
(a) image analysis means (a1) for scanning said bitmap image to extract information for specifying a plurality of horizontal line segments, (a2) for extracting information for specifying a rectangle, which contains at least a portion of two of said plurality of horizontal line segments as its horizontal side, based on said information for specifying a plurality of horizontal line segments, and (a3) for transforming said information for specifying a rectangle based on skew information calculated from said information for specifying a plurality of horizontal line segments; and
(b) a bitmap image definition set for storing said transformed information for specifying a rectangle.
In accordance with another aspect of this invention, there is provided a bitmap image processing apparatus for recognizing a position of a character frame included in a bitmap image, comprising:
(a) image analysis means (a1) for scanning said bitmap image to extract information for specifying a plurality of horizontal line segments, (a2) for extracting information for specifying a circumscribed rectangle based on said information for specifying a plurality of horizontal line segments, and (a3) for extracting information for specifying a character frame included in said bitmap image; and
(b) a bitmap image definition set for storing said extracted information for specifying a character frame as position information on the basis of one of the vertexes of said circumscribed rectangle.
In accordance with another aspect of this invention, there is provided a storage medium for storing an image processing program executable on a bitmap image identifying apparatus that has a bitmap image definition set containing information for identifying a plurality of bitmap images, said image processing program identifying a class of a bitmap image including a plurality of horizontal line segment images, said image processing program comprising:
(a) program code for instructing said bitmap image identifying apparatus to scan said bitmap image to extract information for specifying a plurality of horizontal line segments;
(b) program code for instructing said bitmap image identifying apparatus to extract information for specifying a circumscribed rectangle based on said information for specifying a plurality of horizontal line segments; and
(c) program code for instructing said bitmap image identifying apparatus to check whether or not said extracted information for specifying a circumscribed rectangle is similar to pre-registered information for specifying a circumscribed rectangle in said bitmap image definition set.
In accordance with another aspect of this invention, there is provided a storage medium for storing an image processing program executable on a bitmap image identifying apparatus that has a bitmap image definition set containing information for identifying a plurality of bitmap images, said image processing program identifying a class of a bitmap image including a plurality of horizontal line segment images, said image processing program comprising:
(a) program code for instructing said bitmap image identifying apparatus to scan said bitmap image to extract information for specifying a plurality of horizontal line segments; and
(b) program code for instructing said bitmap image identifying apparatus to check whether or not said extracted information for specifying a plurality of horizontal line segments is similar to pre-registered information for specifying horizontal line segments in said bitmap image definition set.
In accordance with another aspect of this invention, there is provided a storage medium for storing an image processing program executable on a bitmap image processing apparatus, said image processing program obtaining information for identifying a class of a bitmap image including a plurality of horizontal line segment images, said image processing program comprising:
(a) program code for instructing said bitmap image processing apparatus to scan said bitmap image to extract information for specifying a plurality of horizontal line segments;
(b) program code for instructing said bitmap image processing apparatus to extract information for specifying a rectangle, which contains at least a portion of two of said plurality of horizontal line segments as its horizontal side, based on said information for specifying a plurality of horizontal line segments;
(c) program code for instructing said bitmap image processing apparatus to transform said information for specifying a rectangle based on skew information calculated from said information for specifying a plurality of horizontal line segments; and
(d) program code for instructing said bitmap image processing apparatus to store said transformed information for specifying a rectangle.
In accordance with another aspect of this invention, there is provided a storage medium for storing an image processing program executable on a bitmap image processing apparatus, said image processing program recognizing a position of a character frame included in a bitmap image, said image processing program comprising:
(a) program code for instructing said bitmap image processing apparatus to scan said bitmap image to extract information for specifying a plurality of horizontal line segments;
(b) program code for instructing said bitmap image processing apparatus to extract information for specifying a circumscribed rectangle based on said information for specifying a plurality of horizontal line segments;
(c) program code for instructing said bitmap image processing apparatus to extract information for specifying a character frame included in said bitmap image; and
(d) program code for instructing said bitmap image processing apparatus to store said extracted information for specifying a character frame as position information on the basis of one of the vertexes of said circumscribed rectangle.