1. Field of the Invention
The present invention relates to an image processing method, an image processing apparatus, and a program for performing clustering on image data.
2. Description of the Related Art
In recent years, there has been a growing demand for saving or sending documents in computerized form instead of on paper. The term “computerization of a document,” as used herein, is not limited to simply reading a paper document with a scanner or the like to obtain image data. For example, the image data is separated into regions of different properties, such as the characters, diagrams, photographs, and tables that make up the document. The computerization process then involves converting each region into data of an optimal format, such as converting a character region into character codes, converting a diagram region into vector data, converting a background region or a photograph region into bitmap data, and converting a table region into structural data.
As an example of a method of conversion to vector data, Japanese Patent Laid-Open No. 2007-158725 discloses an image processing apparatus. With the disclosed image processing apparatus, region splitting is performed by clustering, contours of respective regions are extracted, and the extracted contours are converted into vector data.
Known methods of splitting an image into regions by clustering include a nearest-neighbor clustering method and a K-means clustering method.
The nearest-neighbor clustering method involves comparing the feature vector of a processing object pixel with the representative feature vectors of the respective clusters to find the cluster whose representative feature vector is at the minimum distance from the feature vector of the pixel. If that distance is less than or equal to a predetermined threshold, the processing object pixel is allocated to the corresponding cluster. If not, a new cluster is defined and the processing object pixel is allocated to the new cluster. Color information (pixel values comprising R, G, and B) is generally used as the feature vector. The center of gravity of a cluster is generally used as the representative feature vector of the cluster. In other words, the representative feature vector of a cluster is the mean value of the feature vectors (color information) of the respective pixels allocated to the cluster.
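The nearest-neighbor procedure described above can be sketched as follows. This is a minimal illustration only, not the implementation of any cited apparatus: the function name `nearest_neighbor_clustering`, the use of Euclidean distance in RGB space, and the flat list of pixels are all assumptions made for the example.

```python
def nearest_neighbor_clustering(pixels, threshold):
    """Cluster (r, g, b) tuples; names and distance metric are illustrative assumptions."""
    sums = []    # per-cluster running sum of member colors
    counts = []  # per-cluster number of member pixels
    labels = []  # cluster index assigned to each pixel
    for p in pixels:
        # Compare the pixel with the centroid (mean color) of every existing cluster.
        best, best_d2 = -1, float("inf")
        for k, (s, n) in enumerate(zip(sums, counts)):
            d2 = sum((pc - sc / n) ** 2 for pc, sc in zip(p, s))
            if d2 < best_d2:
                best, best_d2 = k, d2
        if best >= 0 and best_d2 <= threshold ** 2:
            # Close enough: allocate to the nearest cluster and update its mean.
            sums[best] = [sc + pc for sc, pc in zip(sums[best], p)]
            counts[best] += 1
        else:
            # Too far from every cluster (or no cluster yet): define a new one.
            sums.append(list(p))
            counts.append(1)
            best = len(sums) - 1
        labels.append(best)
    centroids = [tuple(sc / n for sc in s) for s, n in zip(sums, counts)]
    return labels, centroids
```

Note that the inner loop visits every existing cluster for every pixel, so the work per pixel grows with the number of clusters.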
The K-means clustering method involves defining, in advance, K clusters and their representative feature vectors, and allocating each pixel to the cluster whose representative feature vector is at the minimum distance from the feature vector of the pixel. After all pixels have been processed, the representative feature vector of each cluster is updated. This processing operation is repeated until the difference between the representative feature vectors before and after the update is less than or equal to a predetermined value.
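The K-means iteration can be sketched in the same style. Again this is only an illustrative sketch under stated assumptions: the function name `kmeans`, the parameters `tol` and `max_iter`, the Euclidean distance, and the choice to leave the centroid of an empty cluster unchanged are not taken from the description above.

```python
def kmeans(pixels, centroids, tol=1e-3, max_iter=100):
    """Cluster (r, g, b) tuples around K initial centroids (illustrative sketch)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centroids = [tuple(float(c) for c in cen) for cen in centroids]
    labels = []
    for _ in range(max_iter):
        # Allocation step: each pixel goes to the cluster with the nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist2(p, centroids[k]))
            for p in pixels
        ]
        # Update step: each centroid becomes the mean color of its member pixels.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(ch) / len(members) for ch in zip(*members))
                )
            else:
                new_centroids.append(old)  # assumption: keep an empty cluster as-is
        # Stop once no centroid moved by more than the predetermined value.
        shift = max(dist2(a, b) ** 0.5 for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return labels, centroids
```

As with the nearest-neighbor sketch, the allocation step evaluates K distances for every pixel on every pass.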
Both the nearest-neighbor clustering method and the K-means clustering method involve a process for finding a cluster having a representative feature vector whose distance from a feature vector of a processing object pixel is a minimum from among all clusters. In other words, distances from the representative feature vectors of all clusters must be calculated for each pixel. Consequently, calculation time problematically increases when the number of clusters is increased in order to improve region-splitting accuracy.
As a conventional technique for solving the problem described above, for example, Japanese Patent Laid-Open No. 11-288465 discloses a color image processing apparatus. With the conventional technique, clustering is performed based on feature vectors (color information) of a processing object pixel and an adjacent pixel. Subsequently, clusters are grouped based on color information and geometry information of the clusters. In this case, geometry information refers to coordinate information representing the proximity between regions, and the like.
However, the conventional technique requires that a new cluster be defined, and the processing object pixel be allocated to it, whenever the distance between the feature vectors of the processing object pixel and the adjacent pixel is large. As a result, a large number of clusters ends up being defined. Consequently, the processing time required for grouping problematically increases.
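The growth in the number of clusters can be illustrated with a deliberately simplified sketch. It is not the method of the cited publication: it assumes a single row of pixels, compares each pixel only with its left neighbor, uses Euclidean distance in RGB space, and omits the subsequent grouping step. The function name `adjacent_pixel_clustering` is hypothetical.

```python
def adjacent_pixel_clustering(row, threshold):
    """Label one row of (r, g, b) pixels by comparing each with its left neighbor."""
    labels = []
    num_clusters = 0
    for i, p in enumerate(row):
        if i > 0:
            d2 = sum((a - b) ** 2 for a, b in zip(p, row[i - 1]))
        else:
            d2 = float("inf")  # the first pixel has no left neighbor
        if d2 <= threshold ** 2:
            # Similar to the adjacent pixel: join that pixel's cluster.
            labels.append(labels[-1])
        else:
            # Dissimilar: a new cluster is defined, even if the same color
            # already appeared earlier in the row.
            labels.append(num_clusters)
            num_clusters += 1
    return labels, num_clusters
```

Under these assumptions, a row that alternates between two colors produces one cluster per pixel rather than two clusters, and every one of those clusters must later be examined by the grouping step.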