Starting from a database formed from a set (or population) of n objects described by a set of m descriptors (or variables), automatic classification consists in structuring these objects into highly homogeneous classes (or groups). Homogeneity here means that two objects of the same class must be more similar to one another than two objects belonging to two different classes.
The formation of these classes allows groups of objects with similar profiles (for structured data) or similar themes (for unstructured data) to be easily detected.
The number of possible partitions of the objects grows combinatorially with n, so that this problem cannot in practice be solved by an exact method. For this reason, heuristic algorithms that are less costly in terms of processing time and machine resources have been developed in order to find approximate solutions to it.
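By way of illustration only, the size of the search space can be made concrete. If no constraint is placed on the number of classes, the number of distinct partitions of n objects is the Bell number B(n). The following minimal sketch computes it with the Bell triangle; the function name `bell_number` is chosen here for illustration and does not come from the cited references.

```python
def bell_number(n: int) -> int:
    """Return the number of ways to partition n objects into non-empty classes."""
    # Bell triangle: each new row starts with the last entry of the previous
    # row, and every following entry adds the entry located above-left.
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    # The first entry of the row reached after n steps is B(n).
    return row[0]


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        print(n, bell_number(n))
```

Ten objects already admit 115975 partitions, which indicates why exhaustive enumeration is not an option for populations of realistic size.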
Some of these heuristic algorithms offer solutions by arbitrarily fixing the number of classes, whereas others propose a hierarchy having partitions with a variable number of classes.
For example, the following heuristic algorithms may be mentioned:
- the methods of the “mobile centers” type, such as “k-means”, dynamic clustering, etc.;
- the methods of hierarchical classification (increasing or decreasing);
- the methods of the “first leader” type, etc.
Examples of various unsupervised methods of classification are given in the following references: 1) Saporta G. (1990), Probabilités, Analyse de données et Statistique, Technip; 2) Lebart et al. (1995), Multidimensional exploratory statistics, Dunod; 3) Hartigan, J. (1975), Clustering Algorithms, John Wiley and Sons, New York, N.Y., US.
The methods of the “mobile centers” and hierarchical classification type arbitrarily fix a number of classes. On the other hand, the methods of the “first leader” type require a similarity threshold to be fixed and depend on the order in which the objects are taken into account: the same objects arranged in a different order may lead to completely different results. Nevertheless, these methods do allow large quantities of data to be processed within reasonable times. However, in order to achieve this performance, they require the maximum number of classes to be fixed at a very small number with respect to the number of objects.
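The order dependence of the “first leader” type of method can be illustrated by the following minimal sketch. It is a generic single-pass leader scheme written for illustration, not the implementation of any cited reference: the names `first_leader` and `shared_fraction`, the choice of similarity (fraction of descriptors on which two objects agree) and the threshold value are all assumptions, and the cap on the maximum number of classes mentioned above is omitted for brevity.

```python
from typing import Callable, List, Sequence


def shared_fraction(a: Sequence, b: Sequence) -> float:
    """Assumed similarity: fraction of descriptors with identical values."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def first_leader(
    objects: Sequence[Sequence],
    threshold: float,
    similarity: Callable[[Sequence, Sequence], float] = shared_fraction,
) -> List[List[int]]:
    """Single pass: join the first leader that is similar enough, else found a class."""
    leaders: List[Sequence] = []
    classes: List[List[int]] = []
    for index, obj in enumerate(objects):
        for k, leader in enumerate(leaders):
            if similarity(obj, leader) >= threshold:
                classes[k].append(index)
                break
        else:
            # No existing leader is similar enough: obj becomes a new leader.
            leaders.append(obj)
            classes.append([index])
    return classes


if __name__ == "__main__":
    a, b, c = (0, 0, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)
    print(first_leader([a, b, c], threshold=0.5))  # a is seen first
    print(first_leader([b, a, c], threshold=0.5))  # b is seen first
```

With a seen first, c agrees with the leader a on no descriptor and founds a second class; with b seen first, both a and c agree with the leader b on half of the descriptors and a single class results. Indices in the returned classes refer to positions in the list as it was passed.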
Amongst the major problems encountered in dealing with the issue of automatic classification may be mentioned:
- the determination of the number of classes underlying the population in question,
- the performance in terms of processing times depending on the volume sizes to be processed and in terms of quality of the homogeneity of the classes obtained,
- the capability of interpretation of the results obtained: definition of statistical indicators for the measurement of the homogeneity of the classes together with the discriminating power of the descriptors participating in the formation of these classes.
The idea of the present invention rests notably on the theory of relational analysis. As a reminder, this theory, such as described in the following references: 1) P. Michaud and J F Marcotorchino, “Optimization models in relational data analysis”, Mathématiques et Sciences Humaines no 67, 1979, p 7-38; 2) J F Marcotorchino and P Michaud, “Aggregation of the similarities in automatic classification”, Revue de statistique appliquée, Vol 30, no 2, 1981, provides a solution to the problems associated with the fixing of the number of classes and with the interpretation of the result obtained. However, the underlying theoretical model is extremely costly in terms of machine resources whenever the number of objects exceeds 100. The invention uses a heuristic approach to this theory which allows the theoretical result to be very closely approximated on large databases.
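The present text does not spell out the criterion optimized in relational analysis. As an assumption made for illustration, the sketch below uses the pairwise form commonly associated with that theory (a Condorcet-type criterion): two objects placed in the same class contribute the number of descriptors on which they agree minus m/2, and objects placed in different classes contribute nothing. The function name `condorcet_score`, the use of categorical descriptors and the sample data are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def condorcet_score(objects: Sequence[Sequence], labels: Sequence[int]) -> float:
    """Score a partition: sum over same-class pairs of (agreements - m / 2)."""
    m = len(objects[0])  # number of descriptors
    score = 0.0
    for i, j in combinations(range(len(objects)), 2):
        if labels[i] == labels[j]:
            agreements = sum(x == y for x, y in zip(objects[i], objects[j]))
            # Positive when the pair agrees on more than half of the descriptors.
            score += agreements - m / 2
    return score


if __name__ == "__main__":
    data = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1), (1, 1, 1, 0)]
    print(condorcet_score(data, [0, 0, 0, 0]))  # one class
    print(condorcet_score(data, [0, 0, 1, 1]))  # two classes
    print(condorcet_score(data, [0, 1, 2, 3]))  # four singleton classes
```

On this sample the two-class partition obtains the highest score without the number of classes having been fixed beforehand, which is the property referred to above. Evaluating the score of one partition is inexpensive; the cost lies in searching over all partitions, since an exact formulation carries one binary unknown per pair of objects together with a transitivity condition for every triple of objects.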