The present disclosure is directed toward a system and method for simultaneously clustering multiple data sets and verifying homogeneity in the generated clusters. The system finds application in educational recommendation systems, but there is no limitation made herein to the type of data sets applied to the disclosed algorithms.
In the past few years, school districts have begun to use educational recommendation methods and systems for the benefits they offer. These systems generally employ the various functionalities of multifunction devices (“MFDs”), such as copiers including scanning capabilities, to analyze the results of tests administered to students. After an answer sheet is scanned, the conventional system can automatically lift the student's answers from the sheet and, in certain approaches, use a stored rubric to evaluate and score the results. Such a system enables the teacher to devote to instruction the time that the teacher would otherwise spend manually grading the sheets. As educational recommendation systems have advanced in the past few years, they have also become able to use the results to customize the curriculum of students in need of specialized instruction and/or teacher assistance.
In the current educational assessment and/or recommendation system (hereinafter collectively referred to as “ERS”), cluster analysis is performed to create appropriate groupings of students for a specific purpose, such as balancing a classroom, identifying groups of students needing specialized intervention, or determining the range of abilities among students in a classroom. The current ERS automates this process so the teacher has more time to focus attention on matters of higher priority. Generally, current approaches for clustering use k-means and hierarchical clustering algorithms to find optimal partitions within a data set.
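By way of illustration only, the k-means approach mentioned above can be sketched for a single set of test scores. The sketch below is a minimal, hypothetical example and is not taken from any particular ERS: the function name `kmeans_1d`, the evenly spread initialization, and the sample scores are all assumptions made for this illustration.

```python
from statistics import mean


def kmeans_1d(scores, k, iterations=100):
    """Partition scalar scores into k clusters using Lloyd's k-means iteration.

    Hypothetical helper for illustration; assumes len(scores) >= k >= 1.
    """
    ordered = sorted(scores)
    # Assumed deterministic initialization: spread the starting centroids
    # evenly across the sorted scores.
    if k == 1:
        centroids = [mean(ordered)]
    else:
        centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]

    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each score joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for s in scores:
            nearest = min(range(k), key=lambda c: abs(s - centroids[c]))
            clusters[nearest].append(s)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            mean(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # Converged: assignments can no longer change.
        centroids = new_centroids
    return centroids, clusters


# Made-up scores for nine students, grouped into three ability bands.
centroids, clusters = kmeans_1d([52, 55, 58, 71, 74, 76, 90, 93, 95], k=3)
```

Such a sketch treats only one parameter per student. It does not address the case, discussed below, in which each student is linked to multiple sets of data.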
Current ERSs can also scale the groupings of students in the balanced classroom(s) to smaller sets. One exemplary goal of such a scaling operation is to create peer learning groups where stronger students are paired with weaker students for working together on an exercise. In different embodiments, students can instead be grouped by instructional level so the teacher can focus on personalized instruction.
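One simple way to form such peer learning groups can be sketched as follows. This is an assumed, illustrative scheme only (rank the students by score and pair from the two ends of the ranking), not a description of how any existing ERS forms its groups. The function name `pair_strong_with_weak` and the sample names and scores are hypothetical.

```python
def pair_strong_with_weak(scores_by_student):
    """Pair the strongest remaining student with the weakest remaining one.

    Hypothetical helper for illustration; takes a dict of student -> score.
    """
    # Rank students from highest score to lowest.
    ranked = sorted(scores_by_student, key=scores_by_student.get, reverse=True)
    groups = []
    lo, hi = 0, len(ranked) - 1
    while lo < hi:
        groups.append((ranked[lo], ranked[hi]))
        lo += 1
        hi -= 1
    if lo == hi:
        # Odd head count: the middle-ranked student joins the last group
        # (or forms a group alone if there is only one student).
        if groups:
            groups[-1] = groups[-1] + (ranked[lo],)
        else:
            groups.append((ranked[lo],))
    return groups


# Made-up scores for five students.
groups = pair_strong_with_weak(
    {"Ana": 95, "Ben": 60, "Cy": 82, "Dee": 71, "Eli": 55}
)
```

Grouping by instructional level would instead keep students with similar scores together, which is the opposite design choice from the pairing shown here.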
The algorithms required to create personalized clusters become more complex where each student is linked to a combination of parameters (“multiple sets of data”) representing, for example, ability, performance, and characteristics (such as age and gender). The challenge of creating homogeneous clusters increases when multiple parameters are considered for clustering the students. There is desired an approach for clustering students that can treat two sets of data simultaneously. More specifically, a clustering method is desired which generates homogeneous clusters. In addition to generating clusters, there is further desired an approach that can define characteristics of each cluster for addressing a goal of such a system.