The present invention relates to automatic data clustering, which is particularly useful in pattern recognition, for example in speech, image, or text/character pattern recognition.
Pattern recognition, and data clustering in particular, involves learning from a large volume of sample data and determining the category of input data, and more particularly identifying and classifying data that represent, for example, a speech pattern or an image pattern.
In pattern recognition, there are generally two representative methods that learn from high volumes of known sample data having plural categories in order to determine the region that each known category occupies in a pattern space, and that then determine the category of unknown data according to the region in which the unknown data fall. These methods are described in the Artificial Intelligence Handbook, published by Ohm, "Pattern Matching", page 324, compiled by the Artificial Intelligence Association; and in Chapter 8 of Parallel Distributed Processing, entitled "Learning Internal Representations by Error Propagation", by D. E. Rumelhart and others, compiled by the Institute for Cognitive Science, University of California, San Diego. Specifically, the two methods are:
(1) Pattern matching, which prepares a standard pattern for each category and takes, as the category of unknown input data, the category whose standard pattern is nearest to the input data. Some pattern matching methods do not prepare standard patterns; instead, they use sample data of known categories and take the category of the sample data closest to the input data as the category of the unknown input data. This is known as the nearest neighbor method.
(2) A layer-type neural network, which arranges non-linear units called neurons in layers and learns transformation rules between sample data and the categories as weights between neurons. A commonly used learning method is back propagation, which is based on the steepest descent method. The output data produced when unknown input data are given to the neural network are taken as the category of the unknown input data.
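By way of illustration only, the two variants of pattern matching described in method (1) may be sketched as follows. The sketch rests on assumptions that are not stated above: each datum is a tuple of numbers, "nearest" is measured by Euclidean distance, and the standard pattern of a category is taken to be the mean of its samples. The function names and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def standard_patterns(samples):
    """Build one standard pattern per category.

    Assumption: the standard pattern is the per-dimension mean of the
    category's samples. The text does not say how it is prepared.
    """
    grouped = defaultdict(list)
    for vector, category in samples:
        grouped[category].append(vector)
    return {
        category: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for category, vectors in grouped.items()
    }


def classify_by_standard_pattern(patterns, x):
    """Return the category whose standard pattern is nearest to x."""
    # Assumption: Euclidean distance serves as the nearness measure.
    return min(patterns, key=lambda category: math.dist(patterns[category], x))


def classify_by_nearest_neighbor(samples, x):
    """Return the category of the single known sample closest to x."""
    _, category = min(samples, key=lambda sample: math.dist(sample[0], x))
    return category


# Hypothetical sample data: (feature vector, known category).
samples = [
    ((0.0, 0.0), "A"),
    ((0.0, 1.0), "A"),
    ((4.0, 4.0), "B"),
    ((5.0, 4.0), "B"),
]
patterns = standard_patterns(samples)

print(classify_by_standard_pattern(patterns, (1.0, 1.0)))
print(classify_by_nearest_neighbor(samples, (4.0, 3.0)))
```

In this sketch the standard-pattern variant compares the input against one vector per category, whereas the nearest neighbor variant compares it against every stored sample, so its cost grows with the volume of sample data.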