Enterprises are increasingly using rule-based systems to manage many aspects of their businesses. Rule-based systems have been used in a multitude of applications, including credit card fraud detection, data quality, lending and credit approval, insurance, the securities and capital markets trading cycle, manufacturing operations, telecommunications, logistics, transportation and travel, government, and retail.
Typically, in the prior art, the rules that are used in these systems are created manually by a group of (human) domain experts through a defined cycle of analysis, construction and approval. Such manual rule generation, testing and maintenance can be a challenging proposition. Such rules: require deep (human) domain expertise in addition to specialised skills in data processing; require long lead times to set up; are difficult to maintain in a consistent and clean manner; are inherently focused on prescribing behaviour by replicating past trends as observed in specific instances, making them less able to capture new trends and requiring constant maintenance; and are generated over a period of time with input from different human experts, creating inconsistencies and reducing accuracy.
Rule-based systems are widely used in the domain of data quality, including in the identification of anomalies in data. It is becoming increasingly important to monitor and improve data quality. Many corporations and other organisations have invested large sums in building numerous data warehouses to support their information needs. Availability of information and efficiency of reporting have been the key drivers in their implementation. However, in order to derive more value, more attention must be paid to the quality of the data that these warehouses contain. In addition, regulatory requirements such as those of Basel II and Sarbanes-Oxley are demanding improvements in data quality. For instance, Basel II requires the collection and maintenance of over 180 fields from multiple source systems. In order to comply, it will be obligatory to follow the principles enforced by controlled definition and measurement. Furthermore, the risk inherent in data quality will have to be continually measured, controlled and aligned with business value.
Typical data quality systems require a library of business and data compliance rules, which are used to measure and monitor the quality of data. The rules in these systems are created and maintained by human analysts, often requiring the assistance of expensive consultants. Because of the underlying complexity of the problem being addressed, the human-created rules suffer from inaccuracy, in that they do not identify all data quality issues with complete accuracy. In addition, they quickly become out of date as the underlying business evolves, because humans cannot manually track the content and form of the underlying data, just as they cannot manually record and store the volumes of data held in large-scale modern databases.
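By way of illustration only, a manually maintained rule library of the kind described above might be sketched as follows. This is a minimal sketch, not taken from any particular prior-art system: the `Rule` class, the `measure_quality` function, the rule names and the record fields (`customer_id`, `balance`) are all hypothetical and chosen purely for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A record is modelled as a plain dictionary of field name to value (assumption).
Record = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    """A single hand-written data quality rule (hypothetical structure)."""
    name: str
    check: Callable[[Record], bool]


def _balance_non_negative(record: Record) -> bool:
    # A missing or non-numeric balance counts as a failure of this rule.
    balance = record.get("balance")
    return isinstance(balance, (int, float)) and balance >= 0


# The "library": each entry was written and approved by a human analyst.
RULES: List[Rule] = [
    Rule("customer_id_present", lambda r: bool(r.get("customer_id"))),
    Rule("balance_non_negative", _balance_non_negative),
]


def measure_quality(records: List[Record], rules: List[Rule]) -> Dict[str, float]:
    """Return, for each rule, the fraction of records that satisfy it."""
    if not records:
        # With no data there is nothing to fail; report full compliance.
        return {rule.name: 1.0 for rule in rules}
    total = len(records)
    return {
        rule.name: sum(1 for r in records if rule.check(r)) / total
        for rule in rules
    }
```

Even in this small sketch, every new field or change in the form of the data requires an analyst to add or revise an entry in `RULES` by hand, which is the maintenance burden described above.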
It is becoming increasingly necessary, therefore, that the rules used in data quality systems are created and maintained automatically by a technical solution.