The problem of mining (discovering) frequent itemsets in a set of transactions, wherein each transaction is a set of items, was first introduced in R. Agrawal et al., “Mining Association Rules Between Sets of Items in Large Databases,” VLDB, pp. 207–216, 1993, the disclosure of which is incorporated by reference herein. Agrawal et al. also led the pioneering work of mining sequential patterns, see R. Agrawal et al., “Mining Sequential Patterns,” ICDE, 1995, the disclosure of which is incorporated by reference herein. As is known, each item in a pattern is represented by a set of literals (e.g., a literal may be “beer” or “diaper,” as in a “market basket” scenario), and patterns on level k (patterns with a total of k literals) are generated by joining patterns on level k−1.
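By way of illustration only, the level-wise generation described above may be sketched as follows. This is a minimal sketch of an Apriori-style join-and-prune loop, not the implementation of the cited references; the function name `frequent_itemsets`, the parameter `min_support`, and the sample `baskets` data are assumptions introduced solely for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset whose count
    is at least min_support (hypothetical helper for illustration)."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count the transactions containing each candidate; keep frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    # Level 1: patterns consisting of a single literal.
    level = count({frozenset([i]) for t in transactions for i in t})
    result = dict(level)
    k = 2
    while level:
        # Join step: combine two level k-1 patterns that differ in one literal.
        candidates = {a | b for a, b in combinations(list(level), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
        level = count(candidates)
        result.update(level)
        k += 1
    return result


# Assumed sample data in the spirit of the "market basket" scenario.
baskets = [
    {"beer", "diaper", "milk"},
    {"beer", "diaper"},
    {"beer", "milk"},
    {"diaper", "milk"},
]
result = frequent_itemsets(baskets, min_support=2)
```

With this sample data, each pair of literals appears in two baskets and is retained on level 2, while the single level-3 candidate appears in only one basket and is discarded, at which point the loop terminates.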
Unfortunately, user preferences cannot be incorporated into the Agrawal et al. mining process. The methodology (unnecessarily) mines all frequent itemsets before the answers to a user's mining targets can be filtered out during the final step. Deferring user preferences to this final step slows down the mining process when the dataset is large and high-dimensional, and it also makes the mining results difficult to understand since the patterns are all mixed together.
Further, Srikant et al. (see R. Srikant et al., “Mining Generalized Association Rules,” VLDB, pp. 407–419, 1995, the disclosure of which is incorporated by reference herein) and Han et al. (see J. Han et al., “Discovery of Multiple-level Association Rules from Large Databases,” VLDB, 1995, the disclosure of which is incorporated by reference herein) consider multi-level association rules based on item taxonomy and hierarchy. These approaches are further extended to handle more general constraints, see R. Ng et al., “Exploratory Mining and Pruning Optimizations of Constrained Associations Rules,” SIGMOD, pp. 13–24, 1998, and R. Srikant et al., “Mining Association Rules with Item Constraints,” SIGKDD, pp. 67–93, 1997, the disclosures of which are incorporated by reference herein.
Unfortunately, with a pre-specified hierarchy or taxonomy, the mining space is severely restricted. The ability to discover patterns depends entirely on whether these patterns happen to fit into the given taxonomy or hierarchy.
Still further, Shen et al. developed meta-queries for Bayesian data clusters using templates expressed as second-order predicates, see W. Shen et al., “Meta-queries for Data Mining,” pp. 375–398, AAAI/MIT Press, 1996, the disclosure of which is incorporated by reference herein. Fu et al. (see Y. Fu et al., “Meta-rule-guided Mining of Association Rules in Relational Databases,” Proceedings 1st Int'l Workshop on Integration of KDOOD, pp. 39–46, 1995, the disclosure of which is incorporated by reference herein) and Kamber et al. (see M. Kamber et al., “Meta-rule-guided Mining of Multi-dimensional Association Rules Using Data Cubes,” SIGKDD, pp. 207–210, 1997, the disclosure of which is incorporated by reference herein) extend meta-queries to relational databases and multi-dimensional data cubes, respectively. Since meta-rules are viewed as rule templates expressed as a conjunction of predicates instantiated on a single record, they do not consider multi-attribute patterns formed from multiple records.
More recently, there has been work on different kinds of multi-attribute mining circumstances (see G. Grahne et al., “On Dual Mining: From Patterns to Circumstances, and Back,” ICDE, 2001, the disclosure of which is incorporated by reference herein) and dynamic groupings (see C.-S. Perng et al., “A Framework for Exploring Mining Spaces with Multiple Attributes,” ICDM, 2001, the disclosure of which is incorporated by reference herein), where arbitrary sets of attributes are used to group items into transactions.
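By way of illustration only, grouping items into transactions by an arbitrary set of attributes may be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of the cited references; the function `group_into_transactions`, the attribute names `customer`, `day`, and `product`, and the sample records are all hypothetical.

```python
from collections import defaultdict


def group_into_transactions(records, group_by, item_attr):
    """Group records into transactions keyed by the values of the
    attributes in group_by (hypothetical helper for illustration)."""
    groups = defaultdict(set)
    for record in records:
        # The grouping key is the tuple of the chosen attribute values.
        key = tuple(record[attr] for attr in group_by)
        groups[key].add(record[item_attr])
    return dict(groups)


# Assumed sample relational records.
records = [
    {"customer": "c1", "day": 1, "product": "beer"},
    {"customer": "c1", "day": 1, "product": "diaper"},
    {"customer": "c1", "day": 2, "product": "milk"},
    {"customer": "c2", "day": 1, "product": "beer"},
]

# The same records yield different transactions under different groupings.
by_customer = group_into_transactions(records, ["customer"], "product")
by_customer_day = group_into_transactions(records, ["customer", "day"], "product")
```

Grouping by `customer` alone places all three of the first customer's products in one transaction, whereas grouping by `customer` and `day` splits them into two transactions, so the set of discoverable patterns depends on the chosen grouping attributes.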
However, a need exists for attribute association discovery techniques that support relational-based data mining and that overcome the above and other deficiencies.