Grouping methods (including, but not limited to, clustering, classification, recommendation, profiling, and detection) create groups of objects such that objects within a group are similar (minimum within-group distance) and objects in different groups are distinct (maximum between-group distance). Such methods are common in many areas of research, such as computer science (e.g., pattern recognition), bioinformatics (e.g., patterns of protein structure), marketing (e.g., market segmentation, user profiling, product-advertising recommendations), finance (e.g., fraud detection), manufacturing (e.g., fault and defect detection), and organization and psychology (e.g., candidate and employee profiling).
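The two distance criteria named above can be illustrated with a minimal sketch. The function name `mean_within_between`, the toy points, and the two label assignments are hypothetical and chosen only for illustration; Euclidean distance is assumed, although the text does not fix a particular distance measure.

```python
from itertools import combinations
from math import dist


def mean_within_between(points, labels):
    """Return (mean within-group distance, mean between-group distance)."""
    within, between = [], []
    for i, j in combinations(range(len(points)), 2):
        d = dist(points[i], points[j])  # Euclidean distance (an assumption)
        if labels[i] == labels[j]:
            within.append(d)
        else:
            between.append(d)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(within), mean(between)


# Hypothetical toy data: two well-separated clumps of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good = [0, 0, 0, 1, 1, 1]  # grouping that follows the clumps
poor = [0, 1, 0, 1, 0, 1]  # grouping that mixes the clumps

w_good, b_good = mean_within_between(points, good)
w_poor, b_poor = mean_within_between(points, poor)
print(f"good: within={w_good:.2f}, between={b_good:.2f}")
print(f"poor: within={w_poor:.2f}, between={b_poor:.2f}")
```

Under these assumptions, the grouping that follows the clumps has a smaller mean within-group distance and a larger mean between-group distance than the grouping that mixes them.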
However, studies thus far have applied a single algorithm at a time to determine the specific grouping of a phenomenon. Grouping problems are unsupervised in nature, different algorithms produce different groups on the same dataset, and several algorithms may yield different solutions under permutations of the input order of the data. As a result, researchers and practitioners tend to decide on the number of groups, and on the final assignment of borderline objects, without dedicated decision-support tools.
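The sensitivity to input order can be shown with a small sketch. It uses a simple sequential "leader"-style procedure, chosen here only because its order dependence is easy to see; it is not an algorithm named in the text, and the function name `leader_grouping`, the threshold, and the data values are hypothetical.

```python
def leader_grouping(values, threshold):
    """Assign each value to the first existing leader within `threshold`;
    otherwise the value becomes the leader of a new group."""
    leaders = []
    groups = []
    for v in values:
        for k, leader in enumerate(leaders):
            if abs(v - leader) <= threshold:
                groups[k].append(v)
                break
        else:
            # No leader was close enough, so start a new group.
            leaders.append(v)
            groups.append([v])
    # Return the partition in an order-independent form for comparison.
    return {frozenset(g) for g in groups}


# The same three objects, presented in two different orders.
first = leader_grouping([0.0, 1.5, 3.0], threshold=2.0)
second = leader_grouping([1.5, 0.0, 3.0], threshold=2.0)

print(len(first), "groups when 0.0 is seen first")
print(len(second), "group(s) when 1.5 is seen first")
```

In this sketch the borderline object 3.0 forms its own group in one ordering and joins the others in the second ordering, so the number of groups itself depends on the order of the input.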
The use of grouping methods helps to identify significant trends in the dataset. Each group comprises relatively homogeneous objects (also referred to as cases, observations, etc.). Objects in a group are similar to each other and dissimilar to objects outside the group, according to the logic or methodology applied by the given grouping method.
For example, a population of people can be divided into groups according to age, residential neighborhood, education, income, etc. Each grouping method (criterion) may yield different groups. In many cases, different grouping methods will result in similar groupings of the objects, thus identifying important trends about the objects. In the above example, it may well be that a given group of people is identified by all, or most, grouping criteria. An example may be the group of wealthy, older people with higher education who live in a certain neighborhood.
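The population example can be sketched as follows. The names, attribute values, and the helper `stable_groups` are hypothetical; the sketch covers only the strict case in which people are grouped together by all criteria, not the "most criteria" case.

```python
from collections import defaultdict

# Hypothetical population; each attribute acts as one grouping criterion.
people = {
    "Ann":  {"age": "older", "income": "high", "education": "higher", "neighborhood": "north"},
    "Bob":  {"age": "older", "income": "high", "education": "higher", "neighborhood": "north"},
    "Cara": {"age": "younger", "income": "high", "education": "higher", "neighborhood": "south"},
    "Dan":  {"age": "younger", "income": "low", "education": "basic", "neighborhood": "south"},
}


def stable_groups(population):
    """Return groups of people that every criterion places together."""
    by_profile = defaultdict(list)
    for name, attrs in population.items():
        # Two people share a profile only if they agree on every criterion.
        profile = tuple(sorted(attrs.items()))
        by_profile[profile].append(name)
    return [sorted(names) for names in by_profile.values() if len(names) > 1]


print(stable_groups(people))
```

Here Ann and Bob fall into the same group under every criterion, which corresponds to the wealthy, older, highly educated group living in one neighborhood described above.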
Variations of analysis with the same purpose (e.g., dividing the dataset into groups, identifying people in a picture, or identifying stock types in a stock exchange) produce a categorization of the dataset as interpreted by the grouping method. Since the grouping methods are different, similar but not identical interpretations can result.
Analyzing the results of different grouping methods is not a trivial task. It would be desirable to develop a method that can identify similarities between the groupings suggested by multiple grouping methods in order to produce an overall grouping recommendation. It would also be desirable to provide a two-dimensional visualization method for understanding the similarities between the different grouping methods' recommendations and the similarities of objects with respect to the desired grouping, and for identifying outliers and hard-to-group objects.
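One way to quantify the similarity between the groupings suggested by different methods is the Rand index, which counts the pairs of objects on which two groupings agree. The sketch below is an illustration under that assumption and is not the method of the present disclosure; the method names and label vectors are hypothetical.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of object pairs on which groupings `a` and `b` agree,
    i.e. both place the pair together or both place it apart."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Hypothetical group labels assigned to the same six objects by three methods.
groupings = {
    "method_A": [0, 0, 0, 1, 1, 1],
    "method_B": [1, 1, 1, 0, 0, 0],  # same partition as A, different label names
    "method_C": [0, 0, 1, 1, 1, 1],  # moves one object to the other group
}

similarity = {
    (m, n): rand_index(groupings[m], groupings[n])
    for m, n in combinations(groupings, 2)
}
for (m, n), score in similarity.items():
    print(f"{m} vs {n}: {score:.2f}")
```

Because the index compares pairs of objects, not label names, methods A and B are treated as identical even though their labels differ, while the single object that method C moves lowers its agreement with both.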
Semi-structured and unstructured decisions frequently arise in everyday applications such as hazard detection, marketing (recommendation, segmentation), finance (pricing), and medicine (diagnostics).
In order to reach a well-founded decision in each of the above situations, the situation needs to be analyzed using several models, that is to say, using several algorithms and parameters.
Currently, researchers must analyze each algorithm and parameter on an individual basis in order to establish preferences on the decision-making issues they face, because there is no supporting model or tool that enables comparison of the different results generated by these combinations of algorithms and parameters.
The current invention enables not only visualization of results produced by diverse algorithms, but also quantitative analysis of the various results.