1. Technical Field
The present invention relates generally to computer software and, more particularly, to a method and system for data entry for cluster analysis.
2. Description of Related Art
Cluster analysis is an exploratory data analysis tool for solving classification problems. Its object is to sort cases (people, things, events, etc.) into groups, or clusters, so that the degree of association is strong between members of the same group and weak between members of different groups. Each such cluster thus describes, in terms of the data collected, the class to which its members belong. This description may be abstracted through use from the particular to the general class or type.
Cluster analysis is thus a tool of discovery. It may reveal associations and structure in data which, though not previously evident, nevertheless are sensible and useful once found. The results of cluster analysis may contribute to the definition of a formal classification scheme, such as a taxonomy for related animals, insects or plants; or suggest statistical models with which to describe populations; or indicate rules for assigning new cases to classes for identification and diagnostic purposes; or provide measures of definition, size and change in what previously were only broad concepts; or find exemplars to represent classes.
One useful application of cluster analysis is to analyze data from card-sorting exercises. In a card-sorting procedure, representative users of a product or technology arrange cards representing data objects into groups on the basis of their perceived relatedness. Cluster analysis of the resulting groups can help researchers to understand users' perceptions of the degree of relatedness of items in data sets.
Several software packages are currently available that allow developers to use cluster analysis to translate user expectations into a meaningful organization of a web site. However, currently available cluster analysis software is prohibitively difficult to use for anyone who is not a professional statistician. These packages require the user to calculate and construct similarity or distance matrices from raw data, and they perform cluster analyses only after these matrices have been painstakingly constructed. Therefore, it would be advantageous to have a method and apparatus that provide a simpler user interface and method for users to enter raw data into a cluster analysis software package. Furthermore, a cluster analysis software package that does not require the user to perform numerous calculations or construct matrices would also be advantageous.
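To illustrate the kind of calculation that existing packages leave to the user, the following minimal sketch derives a similarity matrix and a distance matrix from raw card-sort groupings. It is offered only as an illustration under stated assumptions: the function names, the sample cards, and the choice of co-occurrence counts as the similarity measure are hypothetical and are not taken from any particular package or from the invention described herein.

```python
from itertools import combinations

def cooccurrence_matrix(items, sorts):
    """Count, for each pair of items, how many participants grouped them together.

    `sorts` holds one entry per participant; each entry is a list of groups,
    and each group is a list of item names. (Hypothetical input layout.)
    """
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    counts = [[0] * n for _ in range(n)]
    for groups in sorts:
        for group in groups:
            # Every unordered pair inside one group co-occurs once.
            for a, b in combinations(sorted(set(group)), 2):
                i, j = index[a], index[b]
                counts[i][j] += 1
                counts[j][i] += 1
    return counts

def distance_matrix(counts, n_participants):
    """Convert co-occurrence counts to distances in the range [0, 1].

    Assumed measure: 1 minus the fraction of participants who paired the items.
    """
    n = len(counts)
    return [
        [0.0 if i == j else 1.0 - counts[i][j] / n_participants for j in range(n)]
        for i in range(n)
    ]

# Hypothetical sample data: four cards sorted by two participants.
items = ["Home", "Contact", "Pricing", "Support"]
sorts = [
    [["Home", "Pricing"], ["Contact", "Support"]],
    [["Home"], ["Contact", "Support", "Pricing"]],
]
counts = cooccurrence_matrix(items, sorts)
dist = distance_matrix(counts, len(sorts))
```

In this sample, "Contact" and "Support" are grouped together by both participants, so their distance is zero, whereas "Home" and "Contact" are never grouped together and have the maximum distance of one. A matrix of this kind is the input that clustering routines typically expect.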
The present invention provides a graphical user interface for use in a data processing system for facilitating data entry for cluster analysis. In a preferred embodiment, the graphical user interface includes a source card list area, a participants area, a first sort area, and a second sort area. The source card list area allows entry and display of, and direct manipulation access to, all of a plurality of items to be sorted. The participants area allows entry, display and editing of participant names. The first sort area includes a plurality of first sections, each of which may contain a set of items dragged from the source card list area. Each of these first sections represents a first-level grouping of the items from the source card list area. The second sort area includes a plurality of second sections. Each of the plurality of second sections may contain items dragged from at least one of the first sections, and represents a second-level grouping of the items from the source card list area.
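By way of illustration only, the data captured by such an interface for one participant might be modeled as sketched below. The class name, method names, and validation rules are hypothetical assumptions made for this sketch and do not describe the actual implementation of the preferred embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ParticipantSort:
    """One participant's two-level sort (hypothetical model)."""
    participant: str
    source_cards: List[str]
    # Section name -> items placed in that section.
    first_sort: Dict[str, List[str]] = field(default_factory=dict)
    second_sort: Dict[str, List[str]] = field(default_factory=dict)

    def drag_to_first(self, section: str, item: str) -> None:
        """Place an item from the source card list into a first-level section."""
        if item not in self.source_cards:
            raise ValueError(f"{item!r} is not in the source card list")
        self.first_sort.setdefault(section, []).append(item)

    def drag_to_second(self, section: str, item: str) -> None:
        """Place an item from some first-level section into a second-level section."""
        # Assumption: an item must already appear in a first-level section.
        if not any(item in members for members in self.first_sort.values()):
            raise ValueError(f"{item!r} has not been placed in a first section")
        self.second_sort.setdefault(section, []).append(item)

# Hypothetical usage with sample cards.
sort = ParticipantSort("Participant 1", ["Home", "Contact", "Pricing", "Support"])
sort.drag_to_first("A", "Contact")
sort.drag_to_first("A", "Support")
sort.drag_to_first("B", "Pricing")
sort.drag_to_second("X", "Contact")
sort.drag_to_second("X", "Pricing")
```

Here the second-level section "X" draws items from two different first-level sections, which corresponds to a second section containing items dragged from at least one of the first sections.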