For several years a growing number of in-depth studies have been carried out in the field of data processing by means of neural networks. In contrast to computers, which function in a sequential and synchronous manner, a neural network, as interpreted in neurobiological terms, is deemed to function more or less as the human brain does, with information storage that is no longer limited to specific localized memories but is distributed throughout the network. The advantages of structures based on neural networks are then essentially parallel and asynchronous processing, the capacity to learn, a low sensitivity to noise, and a high resistance to breakdowns (because of the distribution of the information-carrying elements).
A neuron, the elementary cell of the network, in its simplest version as shown in FIG. 1, is a threshold device (and therefore non-linear) which delivers an output signal Y (also called activation signal or output potential) whose amplitude depends on the result of a comparison between a given threshold S and a sum of signals reaching the neuron from a series of elements placed upstream. These signals are in turn formed from input signals x_1, x_2, . . . , x_N weighted with respective coefficients w_1, w_2, . . . , w_N, called synaptic coefficients. According to this arrangement, which has by now become traditional, the neuron forms a weighted sum of the action potentials which it receives (i.e. of the numerical values representing the states of the neurons which have emitted these potentials) and is itself activated if this weighted sum exceeds a certain threshold. The neuron thus activated transmits a response in the form of a new action potential (a non-activated neuron does not transmit anything).
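The threshold behaviour described above can be illustrated with a minimal sketch. The function name `neuron_output`, the binary output values 1 and 0, and the numerical inputs are assumptions chosen for illustration only; the text specifies only that the neuron compares a weighted sum of its inputs with a threshold S.

```python
def neuron_output(inputs, weights, threshold):
    """Return 1 (activated) if the weighted sum of the inputs exceeds
    the threshold S, and 0 (not activated) otherwise."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighted sum of the input signals x_i with synaptic coefficients w_i
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0


# Hypothetical values, not taken from the text
weights = [0.5, 0.9, 0.4]
y_active = neuron_output([1.0, 0.0, 1.0], weights, threshold=0.8)  # sum 0.9 > 0.8
y_silent = neuron_output([0.0, 1.0, 0.0], weights, threshold=1.0)  # sum 0.9 <= 1.0
```

In this sketch a non-activated neuron is represented by the value 0, which corresponds to a neuron that does not transmit anything.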
Among the most common types of neural network, one can distinguish in particular the layered networks, in which the neurons are arranged in successive layers, each neuron being connected to all neurons of the next layer, while the information passes from the input layer through any intermediate layers (which are then called hidden layers) until it reaches the output layer. Such multilayer networks appear to be particularly suitable for solving classification problems. It is in fact possible to consider an entire series of examples to be classified, in which each example, defined by a set of data, can be described by a data vector in a hyperspace, the data associated with this example forming the coordinates of the said vector in this hyperspace. For a given input vector X, it can then be demonstrated that the output activations of a multilayer network provide a posteriori probability values, denoted P(C/X), i.e. the probabilities that the example belongs to each of the various possible classes C, given the input vector corresponding to that example.
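By way of illustration, the passage of information through a layered network may be sketched as follows. The sigmoid activation of the hidden layer, the softmax normalization of the output layer (used here so that the outputs can be read as probability estimates), the function names and all numerical values are assumptions of this sketch and are not specified in the text.

```python
import math


def layer(inputs, weights, biases):
    """Weighted sums for one fully connected layer.
    Each row of `weights` holds the synaptic coefficients of one neuron;
    each bias plays the role of a negated threshold (-S)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def sigmoid(values):
    """Smooth activation, assumed here in place of a hard threshold."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    """Normalize the output activations so that they sum to 1."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer -> output layer."""
    hidden = sigmoid(layer(x, hidden_w, hidden_b))
    return softmax(layer(hidden, out_w, out_b))


# Hypothetical network: 2 inputs, 3 hidden neurons, 2 output classes
probs = forward(
    [0.2, 0.7],
    hidden_w=[[1.0, -1.0], [0.5, 0.5], [-0.3, 0.8]],
    hidden_b=[0.0, 0.0, 0.0],
    out_w=[[1.0, 0.0, -1.0], [-1.0, 0.5, 1.0]],
    out_b=[0.0, 0.0],
)
```

Each entry of `probs` then stands for the estimate associated with one class for the input vector supplied.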
To carry out a given process, however, a neural network must first learn to carry it out in the course of a so-called learning stage: during a first period in which the neural network, whose characteristics have not yet been adapted to the envisaged task, will deliver erroneous results, an error of the obtained results is determined and then the parameters of the network (thresholds and synaptic coefficients) are modified on the basis of a correction criterion so as to enable this network to adapt itself progressively to the input information which it receives. This correction step is repeated for the number of examples (or input vectors) considered necessary for a satisfactory learning process of the network.
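The correction loop described above may be sketched for a single threshold neuron as follows. The correction criterion used here is the classical perceptron rule, which is an assumption of this sketch and is not the criterion prescribed by the text; the function names, the learning rate and the teaching examples (a logical AND) are likewise hypothetical.

```python
def respond(inputs, weights, threshold):
    """Output of a threshold neuron: 1 if the weighted sum exceeds S."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def train_threshold_neuron(examples, n_inputs, rate=0.5, epochs=20):
    """Repeat the correction step over the teaching examples.
    `examples` is a list of (input vector, desired output) pairs."""
    weights = [0.0] * n_inputs
    threshold = 0.0
    for _ in range(epochs):
        for inputs, desired in examples:
            # Error between the desired and the obtained result
            error = desired - respond(inputs, weights, threshold)
            # Modify the synaptic coefficients and the threshold
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            threshold -= rate * error
    return weights, threshold


# Hypothetical teaching base: logical AND of two binary inputs
examples = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 0), ([1.0, 1.0], 1)]
weights, threshold = train_threshold_neuron(examples, n_inputs=2)
predictions = [respond(x, weights, threshold) for x, _ in examples]
```

At the start the untrained neuron delivers erroneous results; after a few passes over the examples the parameters settle and the errors vanish for this simple, linearly separable task.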
This learning phase, which precedes the normal operation of the network, is called supervised when the error is evaluated through comparison between the results obtained and those which should have been obtained, the latter being known in advance in this case. The parameters of the neural network are then modified depending on the discrepancy between the obtained and the desired outputs, for example by error backpropagation. In the following description, however, the learning process discussed is called unsupervised, since the results to be obtained are not known in advance, either because such prior knowledge is not possible or because the cost of obtaining it is too high. An unsupervised learning process, therefore, means that the assignment of each of the vectors or teaching prototypes to a class must take place without any previous knowledge of the desired output values. Numerous documents have described the principle of unsupervised learning, for example the publications "Learning to recognize patterns without a teacher", IEEE-IT-13, no. 1, January 1967, pp. 57-64, and "Optimal unsupervised learning multicategory dependent hypotheses pattern recognition", IEEE-IT-14, no. 3, May 1968, pp. 468-470.
The operational phase after learning is then a phase of generalization, in which the parameters of the neural network are fixed, having been judged to be correct. In the course of this phase it is possible to classify test vectors other than those of the teaching base, since to each of these test vectors there corresponds a state of the outputs which permits its classification. In short, the essential task during such a phase is to subdivide a set of input vectors X into different classes j=1, 2, 3, . . . , K-1, K, for which purpose the neural network must learn to deliver, for each input vector (or test vector), an activation which for each output neuron j is an estimate of the probability that this vector belongs to the class j.
In the application of image segmentation chosen here, a certain number of image elements or "pixels" serve as test examples and render it possible to derive textural or other characteristics from images observed. The network will progressively learn to classify these test examples, and will then generalize so as to classify other pixels of the image.
A known and widely used learning process, the so-called "moving centers" process, is described in particular in the article "Cluster methodologies in exploratory data analysis", by R. Dubes and A. K. Jain, published in "Advances in Computing", vol. 19 (1980), pp. 113-228, and in "Discriminant analysis and clustering", published in "Statistical Science", vol. 4 (1989), pp. 34-69. Such a process consists in progressively dividing the space of the input vectors into several zones according to the proximity of each vector to different points, around which clouds of vectors corresponding to the different classes are progressively formed.
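A minimal sketch of such a process is given below, under the assumption that proximity is measured by the squared Euclidean distance and that each point is moved to the mean of the cloud of vectors currently assigned to it. The function names, the fixed number of iterations, the initial centers and the data are hypothetical and are not taken from the cited articles.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest(vector, centers):
    """Index of the center closest to the given vector."""
    return min(range(len(centers)),
               key=lambda j: squared_distance(vector, centers[j]))


def moving_centers(vectors, centers, iterations=10):
    """Alternate between assigning each vector to its nearest center
    and moving each center to the mean of its cloud of vectors."""
    for _ in range(iterations):
        clouds = [[] for _ in centers]
        for v in vectors:
            clouds[nearest(v, centers)].append(v)
        # A center with an empty cloud is left where it is
        centers = [
            [sum(coord) / len(cloud) for coord in zip(*cloud)] if cloud else old
            for cloud, old in zip(clouds, centers)
        ]
    labels = [nearest(v, centers) for v in vectors]
    return centers, labels


# Hypothetical data: two well separated clouds in a two-dimensional space
vectors = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
           [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
centers, labels = moving_centers(vectors, centers=[[0.0, 0.0], [10.0, 10.0]])
```

Each zone of the input space is then the set of vectors lying closer to one center than to any other, and `labels` gives the zone to which each vector has been assigned.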