1. Field of the Invention
The present invention relates generally to n-tuple or RAM based neural network classification systems and, more particularly, to n-tuple or RAM based classification systems where the decision criteria applied to obtain the output scores and compare these output scores to obtain a classification are determined during a training process.
2. Description of the Prior Art
A known way of classifying objects or patterns represented by electric signals or binary codes and, more precisely, by vectors of signals applied to the inputs of neural network classification systems lies in the implementation of a so-called learning or training phase. This phase generally consists of configuring a classification network so that it performs the envisaged classification as efficiently as possible by using one or more sets of signals, called learning or training sets, where the membership of each of these signals in one of the classes in which it is desired to classify them is known. This method is known as supervised learning or learning with a teacher.
A subclass of classification networks using supervised learning is that of networks using memory-based learning. Here, one of the oldest memory-based networks is the “n-tuple network” proposed by Bledsoe and Browning (Bledsoe, W. W. and Browning, I., 1959, “Pattern recognition and reading by machine”, Proceedings of the Eastern Joint Computer Conference, pp. 225–232) and more recently described by Morciniec and Rohwer (Morciniec, M. and Rohwer, R., 1996, “A theoretical and experimental account of n-tuple classifier performance”, Neural Comp., pp. 629–642).
One of the benefits of such a memory-based system is a very fast computation time, both during the learning phase and during classification. For the known types of n-tuple networks, which are also known as “RAM networks” or “weightless neural networks”, learning may be accomplished by recording features of patterns in a random-access memory (RAM), which requires just one presentation of the training set(s) to the system.
The training procedure for a conventional RAM based neural network is described by Jørgensen (co-inventor of this invention) et al. in a contribution to a recent book on RAM based neural networks (T. M. Jørgensen, S. S. Christensen, and C. Liisberg, “Cross-validation and information measures for RAM based neural networks,” RAM-based neural networks, J. Austin, ed., World Scientific, London, pp. 78–88, 1998). The contribution describes how the RAM based neural network may be considered as comprising a number of Look Up Tables (LUTs). Each LUT may probe a subset of a binary input data vector. In the conventional scheme the bits to be used are selected at random. The sampled bit sequence is used to construct an address. This address corresponds to a specific entry (column) in the LUT. The number of rows in the LUT corresponds to the number of possible classes. For each class the output can take on the values 0 or 1. A value of 1 corresponds to a vote on that specific class. When performing a classification, an input vector is sampled, the output vectors from all LUTs are added, and subsequently a winner-takes-all decision is made to classify the input vector. In order to perform a simple training of the network, the output values may initially be set to 0. For each example in the training set, the following steps should then be carried out:
Present the input vector and the target class to the network, for all LUTs calculate their corresponding column entries, and set the output value of the target class to 1 in all the “active” columns.
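The conventional scheme described above may be illustrated by the following minimal Python sketch. It is offered only as an illustration under stated assumptions and is not taken from the cited reference: the class name RamNet, its parameter names, the fixed random seed, and the rule of resolving ties in favour of the lowest class index are all hypothetical choices made for the example.

```python
import random


class RamNet:
    """Illustrative n-tuple (RAM) network with binary LUT outputs."""

    def __init__(self, n_inputs, n_luts, n_bits, n_classes, seed=0):
        rng = random.Random(seed)
        self.n_classes = n_classes
        # Each LUT probes a randomly selected subset of input bit positions.
        self.connections = [
            rng.sample(range(n_inputs), n_bits) for _ in range(n_luts)
        ]
        # One table per LUT: 2**n_bits columns, one row (cell) per class,
        # with all output values initially set to 0.
        self.tables = [
            [[0] * n_classes for _ in range(2 ** n_bits)]
            for _ in range(n_luts)
        ]

    def _address(self, lut, x):
        # Build the column address from the sampled bit sequence.
        addr = 0
        for pos in self.connections[lut]:
            addr = (addr << 1) | x[pos]
        return addr

    def train(self, x, target):
        # Set the target-class output to 1 in every "active" column.
        for lut in range(len(self.tables)):
            self.tables[lut][self._address(lut, x)][target] = 1

    def scores(self, x):
        # Add the output vectors of the addressed columns of all LUTs.
        totals = [0] * self.n_classes
        for lut in range(len(self.tables)):
            column = self.tables[lut][self._address(lut, x)]
            for c, vote in enumerate(column):
                totals[c] += vote
        return totals

    def classify(self, x):
        # Winner-takes-all; ties go to the lowest class index (assumption).
        s = self.scores(x)
        return s.index(max(s))
```

In this sketch a trained pattern always receives one vote per LUT on its true class, which corresponds to the maximum possible score; an ambiguous decision arises when another class reaches the same score.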
By use of such a training strategy it may be guaranteed that each training pattern always obtains the maximum number of votes on the true class. As a result such a network makes no misclassification on the training set, but ambiguous decisions may occur. Here, the generalisation capability of the network is directly related to the number of input bits for each LUT. If a LUT samples all input bits then it will act as a pure memory device and no generalisation will be provided. As the number of input bits is reduced the generalisation is increased at the expense of an increasing number of ambiguous decisions. Furthermore, the classification and generalisation performances of a LUT are highly dependent on the actual subset of input bits probed. The purpose of an “intelligent” training procedure is thus to select the most appropriate subsets of input data.
Jørgensen et al. further describe what is named a “leave-one-out cross-validation test”, which suggests a method for selecting an optimal number of input connections to use per LUT in order to obtain a low classification error rate with a short overall computation time. In order to perform such a cross-validation test it is necessary to know the actual number of training examples that have visited or addressed the cell or element corresponding to the addressed column and class. It is therefore suggested that these numbers be stored in the LUTs. Jørgensen et al. also suggest how the LUTs in the network can be selected in a more optimal way by successively training new sets of LUTs and performing a cross-validation test on each LUT. Thus, it is known to have a RAM network in which the LUTs are selected by presenting the training set to the system several times.
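One way in which stored visit counts could support such a leave-one-out test is sketched below in Python. The sketch is an assumption-laden illustration rather than the procedure of the cited reference: the function names address, train_counts and leave_one_out_errors are hypothetical, the LUT connections are supplied by the caller, and an example is counted as an error whenever its true class is not the unique winner, which is one possible treatment of ambiguous decisions.

```python
def address(bits, x):
    # Build a column address from the input bits probed by one LUT.
    addr = 0
    for pos in bits:
        addr = (addr << 1) | x[pos]
    return addr


def train_counts(connections, n_classes, examples):
    # For each LUT, store per column and class the number of training
    # examples that have addressed that cell. Columns are created lazily.
    tables = [{} for _ in connections]
    for x, target in examples:
        for table, bits in zip(tables, connections):
            column = table.setdefault(address(bits, x), [0] * n_classes)
            column[target] += 1
    return tables


def leave_one_out_errors(connections, tables, n_classes, examples):
    # Classify each training example as if it had been left out of
    # training, by discounting its own visit to each addressed cell.
    errors = 0
    for x, target in examples:
        scores = [0] * n_classes
        for table, bits in zip(tables, connections):
            column = table[address(bits, x)]
            for c in range(n_classes):
                visits = column[c] - (1 if c == target else 0)
                if visits > 0:  # the cell still casts a vote
                    scores[c] += 1
        best = max(scores)
        # Assumption: ties and losses both count as errors.
        if scores[target] < best or scores.count(best) > 1:
            errors += 1
    return errors
```

Because the counts allow the contribution of a single example to be removed without retraining, the error estimate is obtained in one pass over the training set for a given choice of connections.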
The output vector from the RAM network contains a number of output scores, one for each possible class. As mentioned above a decision is normally made by classifying an example into the class having the largest output score. This simple winner-takes-all (WTA) scheme assures that the true class of a training example cannot lose to one of the other classes. One problem with the RAM net classification scheme is that it often behaves poorly when trained on a training set where the distribution of examples between the training classes is highly skewed. Accordingly there is a need for understanding the influence of the composition of the training material on the behaviour of the RAM classification system as well as a general understanding of the influence of specific parameters of the architecture on the performance. From such an understanding it could be possible to modify the classification scheme to improve its performance and competitiveness with other schemes. Such improvements of the RAM based classification systems are provided according to the present invention.