1. Knowledge Discovery and Data Mining (KDD)
Databases today range in size into the terabytes and will soon reach the petabyte range. Within these masses of data lies hidden information of strategic importance. Data mining is a powerful new technology, following OLAP tools, with great potential to help companies focus on the most important information in their databases and data warehouses.
Innovative organizations are already using data mining to locate and appeal to higher-value customers, reconfigure their product offerings to increase sales, and minimize losses due to error or fraud.
GST-DSS (General System Theory Based Decision Support System) includes an automatic Data Mining and Knowledge Discovery (KDD) tool based on a rule induction mechanism driven by an extended Prolog (ext-Prolog) engine. The KDD tool can be applied directly to major operational databases and data warehouses through a built-in DBMS interface. Users can understand the data through the resulting rule database and find predictive information that even experts may miss.
2. Automatic Discovery System
GST-DSS has a built-in KDD component based on rule induction by the ext-Prolog engine. Logical rules in a database are usually explored and expressed as conditional or affinity relationships.
A logical rule has the following form:
IF Sex = Male AND Item = Diaper THEN Item = Beer (Confidence = 80%) (Support = 25%)
Here the logical conditions (IF Sex = Male AND Item = Diaper) and the association (THEN Item = Beer) are combined into one rule, which holds with a confidence factor of 80% (Confidence) and covers 25% of all records (Support). This hybrid form shares its notation with Prolog logic rules.
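The two figures attached to a rule can be illustrated with a minimal Python sketch. It is not the GST-DSS implementation: the record layout, the `rule_stats` helper, and the toy baskets are all hypothetical. It also assumes the common association-rule definitions, where support is the share of all records that satisfy both the IF part and the THEN part, and confidence is the share of records matching the IF part that also satisfy the THEN part. The text's "coverage" could be read differently.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical record layout: one row per purchase basket.
Record = Dict[str, Any]
Predicate = Callable[[Record], bool]


def rule_stats(records: List[Record],
               condition: Predicate,
               conclusion: Predicate) -> Tuple[float, float]:
    """Return (confidence, support) for IF condition THEN conclusion."""
    matched = [r for r in records if condition(r)]   # records meeting the IF part
    both = [r for r in matched if conclusion(r)]     # ...that also meet the THEN part
    confidence = len(both) / len(matched) if matched else 0.0
    support = len(both) / len(records) if records else 0.0
    return confidence, support


# Made-up toy data, for illustration only.
BASKETS: List[Record] = [
    {"Sex": "Male", "Items": {"Diaper", "Beer"}},
    {"Sex": "Male", "Items": {"Diaper", "Beer"}},
    {"Sex": "Male", "Items": {"Diaper", "Milk"}},
    {"Sex": "Female", "Items": {"Diaper"}},
    {"Sex": "Male", "Items": {"Beer"}},
    {"Sex": "Female", "Items": {"Milk"}},
    {"Sex": "Male", "Items": {"Diaper", "Beer", "Milk"}},
    {"Sex": "Female", "Items": {"Beer", "Diaper"}},
]


def is_male_diaper_buyer(r: Record) -> bool:
    # IF Sex = Male AND Item = Diaper
    return r["Sex"] == "Male" and "Diaper" in r["Items"]


def buys_beer(r: Record) -> bool:
    # THEN Item = Beer
    return "Beer" in r["Items"]


confidence, support = rule_stats(BASKETS, is_male_diaper_buyer, buys_beer)
print(f"Confidence = {confidence:.0%}, Support = {support:.1%}")
```

On this toy data, four baskets match the IF part and three of those contain beer, so the sketch reports a confidence of 75% and a support of 37.5% (3 of 8 records).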
The rules have the advantage of being able to deal with numeric and character data in a uniform manner. When dealing with numeric data, prior approaches have to break numeric fields into “codes” or specific category values. Rules can also easily go beyond attribute-value representations, as in “Import_Country = Export_Country”. Here the values of two columns are compared without explicitly naming any values. This relationship cannot be stated by decision-tree or cross-tab approaches. Rule induction can thus discover general rules that business users can easily understand.
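One way to picture a rule condition that treats numeric data, character data, and column-to-column comparisons uniformly is the following Python sketch. The `Column` and `Condition` classes, the operator table, and the `Amount` field are hypothetical names introduced for illustration. They do not describe how the ext-Prolog engine represents rules internally.

```python
import operator
from dataclasses import dataclass
from typing import Any, Dict

# Comparison operators a condition may use (an assumed, minimal set).
OPS = {
    "=": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


@dataclass(frozen=True)
class Column:
    """Marks an operand as a reference to another column, not a literal."""
    name: str


@dataclass(frozen=True)
class Condition:
    column: str   # left-hand column name
    op: str       # key into OPS
    operand: Any  # a literal value or a Column reference

    def holds(self, record: Dict[str, Any]) -> bool:
        # Resolve the right-hand side: another column's value or the literal.
        if isinstance(self.operand, Column):
            right = record[self.operand.name]
        else:
            right = self.operand
        return OPS[self.op](record[self.column], right)


# Column-to-column comparison: no specific country value is named.
same_country = Condition("Import_Country", "=", Column("Export_Country"))
# Numeric and character conditions use the same representation.
large_amount = Condition("Amount", ">", 1000)
from_japan = Condition("Import_Country", "=", "JP")

row = {"Import_Country": "JP", "Export_Country": "JP", "Amount": 2500}
print(same_country.holds(row), large_amount.holds(row), from_japan.holds(row))
```

Because the numeric threshold is compared directly, the field does not have to be split into category codes first, and the column reference expresses a relationship that a fixed list of attribute values could not.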