A device exists in the prior art that measures a series of breast skin surface potentials for the purpose of detecting breast cancer (see U.S. Pat. Nos. 5,697,369; 5,678,547; 5,660,177; 5,560,357; 5,427,098; 5,320,101; 5,099,844; and 4,955,383, each of which is incorporated herein by reference). In addition to the device for collecting skin surface potential data, the prior art also teaches several techniques for using these skin surface potentials to predict the likelihood of breast cancer. In particular, U.S. Pat. No. 5,697,369 teaches using a neural network to process skin surface potential data to detect cancer in a suspect skin region. However, noise and confounding physiological signals in the skin surface potential data make training a neural network to predict breast cancer particularly challenging.
Various other forms of neural network architectures exist, such as those disclosed in Jacobs et al., "Adaptive Mixtures of Local Experts," Neural Computation, Vol. 3, pp. 79-87 (1991); Waterhouse et al., "Classification Using Hierarchical Mixtures of Experts," Proc. 1994 IEEE Workshop on Neural Networks for Signal Processing IV, pp. 177-186 (1994); and Jordan et al., "Hierarchical Mixtures of Experts and the EM Algorithm," Neural Computation, Vol. 6, pp. 181-214 (1994), which are hereby incorporated herein by reference. FIG. 1 depicts a functional block diagram of a two-level hierarchical mixture of experts for a neural network 100 in accordance with the prior art. This architecture uses a plurality of hierarchically arranged expert networks 102A-102D (experts) to classify input data x. Gating networks 104A and 104B process the output result from each expert network 102A-102D using a gating parameter g. The gated expert results are then summed (in combiners 106A and 106B) at a node of the neural network. The results are then gated by gating network 108 and coupled to the next summing node 110. In this manner, the data (represented as vector x) is used to control both the gates and the experts. Each of the gates applies weighting values to the expert outputs, where the weighting values depend upon the input vector x, such that the neural network 100 operates non-linearly. The use of weighted gating forms a network that uses "soft" partitioning of the input space, and the expert networks provide local processing within each of the partitions. The soft partitioning network can be trained using an Expectation-Maximization (EM) algorithm.
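By way of illustration only, the forward pass of the two-level architecture of FIG. 1 may be sketched as follows. The sketch assumes softmax gating functions and single-output logistic experts, which are common choices in the cited mixture-of-experts literature but are not specified in the description above; the function names and the parameter layout (expert_w, gate_w, top_gate_w) are hypothetical and chosen only for this example.

```python
import math

def dot(w, x):
    """Inner product of a weight vector and the input vector x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def softmax(scores):
    """Convert gate scores into non-negative weights that sum to one."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Logistic output of a single expert (an assumed expert form)."""
    return 1.0 / (1.0 + math.exp(-z))

def hme_forward(x, expert_w, gate_w, top_gate_w):
    """Forward pass of a two-level hierarchical mixture of experts.

    expert_w[i][j] : weights of expert j in branch i (cf. experts 102A-102D)
    gate_w[i][j]   : weights producing the lower gate score for that expert
                     (cf. gating networks 104A and 104B)
    top_gate_w[i]  : weights producing the top gate score for branch i
                     (cf. gating network 108)
    """
    # The top-level gate weights depend on the input x.
    top_g = softmax([dot(v, x) for v in top_gate_w])
    output = 0.0
    for i, g_i in enumerate(top_g):
        # The lower-level gate weights also depend on x.
        lower_g = softmax([dot(v, x) for v in gate_w[i]])
        # Combiner for branch i (cf. 106A, 106B): sum of gated expert outputs.
        branch = sum(g_ij * sigmoid(dot(w, x))
                     for g_ij, w in zip(lower_g, expert_w[i]))
        # Final summing node (cf. 110): sum of gated branch outputs.
        output += g_i * branch
    return output
```

Because every gate weight is a function of the same input vector x that drives the experts, the overall mapping is non-linear in x even when each expert is simple, and each expert contributes most strongly in the region of input space where its gate weights are largest. Training of the weights, for example with the EM algorithm mentioned above, is outside the scope of this sketch.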
Heretofore, a neural network containing a mixture of experts has not been applied to the complex data set of skin potential data and patient information to detect breast cancer. Therefore, there is a need in the art for an improved method and apparatus for training and operating a neural network to provide an accurate technique for breast cancer detection.