Software applications that apply clustering techniques typically cluster static data sets. Many applications today also cluster a large data set at one point in time and later cluster a changed version of the same data set. For example, the data set may represent the email membership of a large online network, clustered at the beginning of each month of a calendar year. Because the email membership may change from month to month, cluster membership may shift from month to month as well. As a result, static clustering techniques that accurately identify the monthly clusters of email membership may not identify and track clusters across the year as accurately as techniques that model the email membership for the entire calendar year. Applied independently to each monthly data set, such static clustering algorithms may therefore produce a poor clustering sequence over time.
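The month-to-month shift in cluster membership can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the function names `two_means_1d` and `membership_shifts` are hypothetical, the per-member activity values are invented, and a plain one-dimensional two-cluster k-means stands in for whatever static clustering technique an application might use. Each monthly data set is clustered on its own, with no knowledge of the previous month.

```python
def two_means_1d(values, iterations=20):
    """Split 1-D values into two clusters (0 = low, 1 = high) with plain k-means.

    Centers start at the minimum and maximum value, so the result is
    deterministic and label 0 always denotes the lower cluster.
    """
    low_center, high_center = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearer center (ties go to the low cluster).
        labels = [
            0 if abs(v - low_center) <= abs(v - high_center) else 1
            for v in values
        ]
        low = [v for v, label in zip(values, labels) if label == 0]
        high = [v for v, label in zip(values, labels) if label == 1]
        # Recompute each center as the mean of its members, if it has any.
        if low:
            low_center = sum(low) / len(low)
        if high:
            high_center = sum(high) / len(high)
    return labels


def membership_shifts(labels_a, labels_b):
    """Count members whose cluster label differs between two snapshots."""
    return sum(1 for a, b in zip(labels_a, labels_b) if a != b)


# Hypothetical activity scores for the same six members in two months.
# Only the third member's score changes (3.0 -> 9.0).
january = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
february = [1.0, 2.0, 9.0, 10.0, 11.0, 12.0]

# Each month is clustered independently of the other.
january_labels = two_means_1d(january)
february_labels = two_means_1d(february)

shifted = membership_shifts(january_labels, february_labels)
print(january_labels, february_labels, shifted)
```

Because each month is clustered in isolation, nothing in this sketch weighs how well a month's clustering agrees with the clustering of the preceding months; the sequence of clusterings is simply whatever each independent run happens to produce.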
What is needed is a way to cluster a large data set consistently over time while still accurately clustering each data set collected at periodic intervals. Any such system and method should provide a generic framework that supports the use of various clustering methods.