Enterprises are increasingly using rule-based systems to manage many different aspects of their businesses. Rule-based systems have been used in a multitude of applications, including credit card fraud detection, data quality, lending and credit approval, insurance, the securities and capital markets trading cycle, manufacturing operations, telecommunications, logistics, transportation and travel, government, and retail.
Typically, in the prior art, the rules that are used in these systems are created manually by a group of (human) domain experts through a defined cycle of analysis, construction and approval. Such manual rule generation, testing and maintenance can be a challenging proposition. Such rules: require deep (human) domain expertise in addition to specialised skills in data processing; require long lead times to set up; are difficult to maintain in a consistent and clean manner; are inherently focused on prescribing behaviour by replicating past trends as observed in specific instances, which makes them less able to capture new trends and means that they require constant maintenance; and are generated over a period of time with input from different human experts, which creates inconsistencies and reduces accuracy.
By way of example, financial fraud takes many forms, including, for example, transactional fraud such as credit card or debit card fraud, application fraud, and cash card/ATM fraud. A particular example discussed herein is that of credit card fraud. However, the basic principles on which these different types of fraud rely are generally the same or similar for each of the types mentioned above. Consequently, the same basic principles of preferred embodiments of the present invention can be used in the detection of these different types of fraud. In general, preferred embodiments of the present invention can be applied to the analysis of any type of human behaviour and/or attributes recorded in a computer system that may relate to a fraud event.
At present, financial institutions such as banks typically use a combination of transaction scoring and rule-based techniques to filter transactions and either accept, refer or decline them. Typically, the scores are calculated using neural networks, and the rules are entered directly by (human) domain experts who review and create them manually. Transactions are tagged with a numeric score reflecting the likelihood of their being fraudulent. This approach has had some impact, typically discovering 60% to 70% of frauds. However, most organisations also incur a high false positive rate, typically more than 15:1 and often as high as 20:1, which generates significant cost and customer dissatisfaction. The pressure to maintain low false positive rates means that even the best manual systems can suffer from high levels of undetected fraud. At the same time, there are increasing demands to handle more transactions, and criminals are becoming increasingly adept at identifying and exploiting new gaps in the processes. These issues all drive the need for a technical solution to the problem of creating and maintaining a set of rules to identify credit card fraud with the required high levels of detection and low false positive rates.
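By way of illustration only, the conventional combination of scoring and manually entered rules described above might be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular institution's system: the `Transaction` fields, the two hand-written rules, the thresholds `refer_at` and `decline_at`, and the reading of the false positive figure as the number of false alerts per genuine fraud caught are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Transaction:
    amount: float
    country: str
    score: float    # hypothetical model score in [0, 1]; higher means more suspicious
    is_fraud: bool  # ground truth, known only after investigation


# A rule is a predicate over a transaction. These two are invented examples
# of the kind of rule a domain expert might enter by hand.
Rule = Callable[[Transaction], bool]

RULES: List[Rule] = [
    lambda t: t.amount > 5000 and t.country != "GB",
    lambda t: t.amount > 10000,
]


def decide(t: Transaction, refer_at: float = 0.5, decline_at: float = 0.9) -> str:
    """Return 'accept', 'refer' or 'decline' (thresholds are assumptions)."""
    if t.score >= decline_at:
        return "decline"
    if t.score >= refer_at or any(rule(t) for rule in RULES):
        return "refer"
    return "accept"


def evaluate(transactions: List[Transaction]) -> Tuple[float, float]:
    """Return (detection rate, false alerts per genuine fraud caught)."""
    flagged = [t for t in transactions if decide(t) != "accept"]
    true_pos = sum(t.is_fraud for t in flagged)
    false_pos = len(flagged) - true_pos
    total_fraud = sum(t.is_fraud for t in transactions)
    detection_rate = true_pos / total_fraud if total_fraud else 0.0
    fp_ratio = false_pos / true_pos if true_pos else float("inf")
    return detection_rate, fp_ratio
```

In this sketch, every rule and threshold has to be chosen and revised by hand, which is the maintenance burden discussed above: tightening the thresholds lowers the number of false alerts but lets more fraud through, and vice versa.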
Rule-based systems are also widely used in the domain of data quality, including the identification of anomalies in data. Data quality is becoming increasingly important to monitor and improve. Many corporations and other organisations have invested large sums in building numerous data warehouses to support their information needs. Availability of information and reporting efficiency have been the key drivers in their implementation. However, in order to derive more value, it is essential that more attention is paid to the quality of the data that they contain. In addition, regulatory requirements such as those of Basel II and Sarbanes-Oxley are demanding improvements in data quality. For instance, Basel II requires the collection and maintenance of 180+ fields from multiple source systems. In order to comply, it will be obligatory to follow the principles enforced by controlled definition and measurement. Furthermore, risk data quality improvements will have to be continually measured, controlled and aligned with business value.
Typical data quality systems require a library of business and data compliance rules which are used to measure and monitor the quality of data. The rules in these systems are created and maintained by human analysts, often requiring the assistance of expensive consultants. Because of the underlying complexity of the problem being addressed, the human-created rules suffer from inaccuracy in that they do not identify all data quality issues with complete accuracy. In addition, they quickly become out of date as the underlying business evolves, because humans cannot manually track the content and form of the underlying data, just as they cannot manually record and store the volumes of data that are held in large-scale modern databases.
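Purely as an illustration of the kind of rule library referred to above, a hand-maintained set of data compliance rules and a simple quality measurement might be sketched as follows. The field names (`customer_id`, `balance`, `currency`), the three rules and the function `measure_quality` are hypothetical and are assumed here only for the purposes of the example.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

# Hypothetical rule library: each named rule returns True when a record complies.
DATA_QUALITY_RULES: Dict[str, Callable[[Record], bool]] = {
    "customer_id_present": lambda r: bool(r.get("customer_id")),
    "balance_non_negative": lambda r: (
        isinstance(r.get("balance"), (int, float)) and r["balance"] >= 0
    ),
    "currency_is_three_upper_letters": lambda r: (
        isinstance(r.get("currency"), str)
        and len(r["currency"]) == 3
        and r["currency"].isalpha()
        and r["currency"].isupper()
    ),
}


def measure_quality(records: List[Record]) -> Dict[str, float]:
    """Return, for each rule, the fraction of records that comply with it."""
    total = len(records)
    if total == 0:
        return {}
    return {
        name: sum(rule(r) for r in records) / total
        for name, rule in DATA_QUALITY_RULES.items()
    }
```

Each entry in such a library encodes an analyst's assumption about what valid data looks like at the time of writing; when the form of the underlying data changes, every affected rule has to be found and revised by hand.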
It is becoming increasingly necessary, therefore, that the rules in data quality systems are created and maintained automatically by a technical solution.