The following publications, which are referenced herein using numbers in square brackets (e.g., [1]), are incorporated herein by reference:
[1] R. A. Jacobs and M. I. Jordan, “Adaptive Mixtures of Local Experts,” Neural Computation 3, pp. 79-87, 1991.
[2] D. DeMers and G. Cottrell, “Non-Linear Dimensionality Reduction,” Advances in Neural Information Processing Systems, pp. 580-587, 1993.
[3] G. Cybenko, “Approximation by Superpositions of a Sigmoidal Function,” Mathematics of Control, Signals, and Systems, Vol. 2, pp. 303-314, 1989.
[4] J. Mao and A. K. Jain, “Artificial Neural Networks for Feature Extraction and Multivariate Data Projection,” IEEE Transactions on Neural Networks, Vol. 6, No. 2, March 1995.
[5] J. Sklansky and L. Michelotti, “Locally Trained Piecewise Linear Classifiers,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 6, No. 2, pp. 195-222, 1989.
[6] R. Caruana, “Learning Many Related Tasks at the Same Time,” Advances in Neural Information Processing Systems 7, pp. 657-664, 1995.
[7] L. K. Hansen and P. Salamon, “Neural Network Ensembles,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 12, No. 10, pp. 993-1001, October 1990.
[8] Y. Park, “A Comparison of Neural Net Classifiers and Linear Tree Classifiers: Their Similarities and Differences,” Pattern Recognition, Vol. 27, No. 11, pp. 1494-1503, 1994.
[9] T. Kohonen, Self-Organization and Associative Memory, Second Edition, Springer-Verlag, Berlin, 1988.
[10] E. Y. Tao and J. Sklansky, “Analysis of Mammograms Aided by Database of Images of Calcifications and Textures,” Proc. of 1996 SPIE Conf. on Medical Imaging: Computer-Aided Diagnosis, February 1996.
[11] B. Lofy, O. Pätz, M. Vriesenga, J. Bernarding, K. Haarbeck, J. Sklansky, “Landmark Enhancement for Spoke-Directed Anisotropic Diffusion,” Proc. of the IAPR Workshop on Methods for Extracting and Mapping Buildings, Roads, and other Man-Made Structures from Images, Technical University, Graz, Austria, September 1996.
[12] H. C. Zuckerman, “The Role of Mammography in the Diagnosis of Breast Cancer,” in Breast Cancer, Diagnosis, and Treatment, eds. I. M. Ariel and J. B. Cleary, Chap. 12, McGraw-Hill, N.Y. pp. 152-172, 1987.
[13] A. P. M. Forrest and R. J. Aitken, “Mammography Screening for Breast Cancer,” Ann. Rev. Medicine 41, pp. 117-132, 1990.
[14] M. Vriesenga and J. Sklansky, “Genetic Selection and Neural Modeling of Piecewise-Linear Classifiers,” International Journal of Pattern Recognition and Artificial Intelligence, Vol. 10, No. 5, pp. 587-612, 1996.
1. Field of the Invention
This invention pertains generally to classifiers constructed in the form of neural networks, and more particularly to neural classifiers that can map design data and decision curves on the same two-dimensional display.
2. Description of the Background Art
The range of applications for neural networks continues to expand. Examples include medical analysis, character recognition, speech recognition, remote sensing, and geophysical prospecting, among others.
An example of the use of neural networks for medical analysis can be found in U.S. Pat. No. 5,872,861 issued to Makram-Ebeid on Feb. 16, 1999, which is incorporated by reference herein. That patent describes a method for processing digital angiographic images for automatic detection of stenoses in blood vessels using a neural network. The digital image is processed to determine the central points and the edge points of the objects represented, provided that these objects constitute sufficiently uniform, contrasting masses on a sufficiently uniform background. A neural network with a hidden layer and two outputs is used to determine the probability that a potential stenosis is real or is a false alarm. The input of the neural network receives a vector whose components are characteristic traits of a candidate stenosis detected by means of the above method. The vector may be formed, for example, by the intensities of the pixels of the icon of the candidate stenosis. The two outputs of the neural network encode the class of the non-stenoses (output 1) and that of the stenoses (output 2), respectively. Once reduced to the interval (0,1) by a mathematical transformation, the two output activations of the network can be interpreted as a posteriori probabilities of membership in class 1 or class 2, given the vector of characteristic traits. The two probabilities are stored for each of the candidate stenoses. This enables the operator to define the degree of reliability, on a scale of probability, required to retain or reject a candidate stenosis. The storage of the probabilities enables the user to try out several reliability levels without having to repeat the entire procedure for the detection and recognition of stenoses as described above. A graphic display method visualizes the stenoses retained in each of the individual images.
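The stored-probability scheme described above can be illustrated with a minimal sketch. The cited patent does not specify here which mathematical transformation reduces the activations to the interval (0,1); a two-way softmax is assumed purely for illustration, and the candidate names and activation values are hypothetical.

```python
import math

def softmax2(a1, a2):
    """Map two raw output activations to probabilities in (0, 1) that sum to 1.

    Assumption: softmax stands in for the unspecified transformation.
    """
    m = max(a1, a2)  # subtract the maximum for numerical stability
    e1, e2 = math.exp(a1 - m), math.exp(a2 - m)
    return e1 / (e1 + e2), e2 / (e1 + e2)

# Hypothetical raw activations (output 1: non-stenosis, output 2: stenosis).
activations = {"cand_a": (0.2, 2.1), "cand_b": (1.5, 1.4), "cand_c": (3.0, -1.0)}

# Probabilities are computed once and stored for every candidate.
stored = {name: softmax2(*acts) for name, acts in activations.items()}

def retained(stored_probs, reliability):
    """Return the candidates whose stenosis probability meets the reliability level."""
    return sorted(n for n, (_, p_sten) in stored_probs.items() if p_sten >= reliability)

# Several reliability levels can be tried without rerunning detection.
for level in (0.4, 0.6, 0.9):
    print(level, retained(stored, level))
```

Because only the stored probabilities are consulted, changing the reliability level amounts to a cheap filtering pass rather than a repeat of the detection procedure.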
An example of neural networks applied to character recognition can be found in U.S. Pat. No. 5,859,925 issued to Yaeger et al. on Jan. 12, 1999, which is also incorporated herein by reference. As explained by Yaeger et al., various classification algorithms are available, based on the different theories and methodologies used in the particular area. When a classifier is applied to a specific problem, any one classifier achieves varying degrees of success, so different techniques for combining classifiers have been studied to improve the accuracy of the classification results. Accordingly, combinations of multiple classifiers have been employed. Nevertheless, present combination techniques have difficulty achieving high classification accuracy within a reasonable amount of time, and none of the conventional approaches achieves the desired accuracy and efficiency in obtaining the combined classification result. An optimal integration of different types of information is therefore desired. The solution provided by Yaeger et al. is a classifying system having a single neural network in which multiple representations of a character are provided as input data. The classifying system analyzes the input representations through appropriate combination of their corresponding sets of data in the neural network architecture.
Another way to enhance classification performance is to use multi-expert neural classifiers [1]. This can increase computational complexity, so attempts have been made to use networks with two-neuron hidden layers. The prevailing view, however, is that networks with only two-neuron hidden layers lack the capacity to perform large-scale classification tasks and can be used only for exploratory data analysis [4] or data compression [2]. Therefore, there is a need for a classifier, based on networks with two-neuron hidden layers, that combines information provided by several classification tasks into a visually meaningful and explanatory display, and that can display a large database of cases or objects.
The present invention satisfies the foregoing needs by providing a neural classifier that combines information provided by several classification tasks into a visually meaningful and explanatory display. We refer to this as a “visual neural classifier.” Using the invention, a designer can identify difficult-to-classify input patterns that may then be applied to an additional classification stage.
A visual neural classifier according to the invention comprises two major elements: (a) a set of experts and (b) a visualization network. Visualization is accomplished by a funnel-shaped multilayer dimensionality reduction network [2]. The dimensionality reduction network is configured to learn one or more classification tasks. If a single dimensionality reduction network does not provide sufficiently accurate classification results, a group of these dimensionality reduction networks may be arranged in a modular architecture [1]. Among these dimensionality reduction networks, we refer to those receiving the input data as experts. We refer to the dimensionality reduction network that combines the decisions of the experts to form the final classification decision as the visualization network.
Each expert comprises a multilayer neural network that reduces the multidimensional feature space through successive layers to a two-neuron layer for visualization. Each dimensionality reduction network contains a two-neuron layer that displays the training data and the decision boundaries in a two-dimensional space. This architecture facilitates (a) interactive design of the decision function and (b) explanation of the relevance of various training data to the classification decisions. By combining the use of experts with the visualization network as described, the visual neural classifier of the present invention provides both excellent classification accuracy and good visual explanatory power.
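The funnel-shaped structure can be sketched as a forward pass that narrows to a two-neuron layer and then expands to one output neuron per class. This is a minimal illustration only: the layer sizes, the tanh activation, the number of classes, and the random untrained weights are all assumptions made for the sketch and are not taken from the specification.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases; a trained network would learn these."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(layer, x):
    """Apply one fully connected layer followed by a tanh activation (assumed)."""
    weights, biases = layer
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

rng = random.Random(0)
funnel_sizes = [8, 5, 2]   # hypothetical: 8 input features narrowing to 2 neurons
n_classes = 3              # hypothetical number of classes

# Successive layers reduce the feature space down to the two-neuron layer.
funnel = [make_layer(a, b, rng) for a, b in zip(funnel_sizes, funnel_sizes[1:])]
# An output stage expands from the two-neuron layer to one neuron per class.
head = make_layer(funnel_sizes[-1], n_classes, rng)

def classify(features):
    """Return the 2-D display coordinates and the index of the winning class."""
    h = features
    for layer in funnel:
        h = forward(layer, h)
    map_xy = tuple(h)           # activations of the two-neuron layer
    scores = forward(head, h)   # one score per class
    return map_xy, scores.index(max(scores))

features = [rng.uniform(-1, 1) for _ in range(funnel_sizes[0])]
map_xy, label = classify(features)
print(map_xy, label)
```

The two activations returned as `map_xy` are what would be plotted on a two-dimensional display; because the class decision is computed only from those two values, any decision boundary can be drawn in the same plane.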
For each classification task a distinct neural network is connected to the two-neuron layer. Each of these networks expands to a layer containing a number of neurons equal to the number of classes. The classifier can display a large database of cases or objects on a “relational map.” Each object can be represented on the relational map as a colored point, such as black, gray or red. The color discriminates the class or subclass to which the point belongs. The network can also produce one or more decision curves that partition the relational map into decision regions, each decision region associated with an assignment of points to a unique class. Furthermore, the classifier can be trained to produce a relational map in which identically colored points form clusters. Also, the location of the points on the relational map provides an indication of decision difficulty. Points that are close to a decision curve are difficult to classify, and those that are far from all decision curves are easy to classify.
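The relationship between map position and decision difficulty can be illustrated with a short sketch. It assumes, for simplicity, a single straight-line decision curve on the map; the decision curves produced by the network need not be linear. The line coefficients, point coordinates, object names, and margin are hypothetical.

```python
import math

# Hypothetical linear decision curve on the map: w[0]*x + w[1]*y + b = 0.
w, b = (1.0, -1.0), 0.0

def distance_to_curve(point):
    """Signed perpendicular distance from a map point to the line."""
    return (w[0] * point[0] + w[1] * point[1] + b) / math.hypot(*w)

# Hypothetical map coordinates of three objects.
points = {"obj_1": (0.9, -0.8), "obj_2": (0.10, 0.05), "obj_3": (-0.7, 0.6)}

def difficult(points_by_name, margin):
    """Return objects lying closer to the decision curve than the margin."""
    return sorted(n for n, p in points_by_name.items()
                  if abs(distance_to_curve(p)) < margin)

for name, p in points.items():
    d = distance_to_curve(p)
    # The sign selects the decision region; the magnitude indicates difficulty.
    print(name, "class", 1 if d >= 0 else 2, "distance", round(abs(d), 3))

print("difficult:", difficult(points, 0.2))
```

Objects flagged in this way are natural candidates for an additional classification stage, since their assignment would change under a small shift of the decision curve.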
The relational map allows a user to browse a large database of objects, and quickly retrieve similar objects. It also provides decision support to a user. A user can observe the decision associated with a particular object and observe the decision uncertainty associated with the distance of the object from the nearest decision curve on the relational map. The user can then integrate this information with information retrieved from related objects to produce an enhanced decision. In addition, the relational maps provide a powerful means for interactive design of the classifier.
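Retrieval of similar objects can be sketched as a nearest-neighbor lookup on the map coordinates. The specification does not state how similarity is measured; Euclidean distance on the two-dimensional map is assumed here, and the database entries and class labels are hypothetical.

```python
import math

# Hypothetical database: object name -> (map coordinates, class label).
database = {
    "case_01": ((0.10, 0.20), "class_a"),
    "case_02": ((0.15, 0.25), "class_a"),
    "case_03": ((0.80, 0.90), "class_b"),
    "case_04": ((0.12, 0.60), "class_b"),
}

def similar_objects(query_xy, k):
    """Return the k stored objects nearest to the query point on the map."""
    ranked = sorted(database,
                    key=lambda name: math.dist(query_xy, database[name][0]))
    return ranked[:k]

# Look up the neighbors of a new object and inspect their class labels.
neighbors = similar_objects((0.12, 0.22), 2)
print([(name, database[name][1]) for name in neighbors])
```

The labels of the retrieved neighbors give the user related cases to weigh alongside the classifier's own decision for the query object.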
Further objects and advantages of the invention will be brought out in the following portions of the specification, wherein the detailed description is for the purpose of fully disclosing preferred embodiments of the invention without placing limitations thereon.