Businesses generally log various kinds of transaction data. For example, systems related to the management of travel and entertainment expenses may process a large number of transactions involving expenses and exceptions. A large enterprise might be expected to process hundreds of thousands of expense claims with each claim containing many individual expense items in various categories. An enterprise may desire to monitor and manage these business processes by performing audits to investigate cases that are potentially in violation of policies and also to determine when business controls are not being exercised appropriately. Effective use of audit resources generally requires that investigations identify cases that merit further action or investigation. Having a system and method to prioritize the entities for further investigation and audit can make more effective use of critical resources provided that the false positive rates are relatively low for the prioritization used.
Commonly-assigned U.S. Pat. No. 6,029,144 to Barrett et al. describes a system and method for checking expense entries for compliance with policy rules and detecting the possibility of fraud. This patent also describes a prioritization for ranking detected policy violations. The prioritization is done by clustering analysis using self-organizing map (SOM) neural networks.
U.S. Pat. No. 6,643,625 to Acosta et al. describes a method for selecting a set of loans to audit. Among the criteria used in the described selection process is the notion of an exception rate, which is the ratio of the number of exceptions to the number of opportunities for the exception. This simple scoring for exceptions might be considered faulty in that it would rank two entities with the same rate as being equal even though one might have higher counts for both numbers used to compute the rate. If one entity did have higher counts, there would correspondingly be more evidence that might otherwise suggest an audit be performed.
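The limitation of a plain exception rate can be illustrated with a minimal sketch. The entity names and counts below are hypothetical and chosen only for illustration; the function is not taken from the cited patent.

```python
from fractions import Fraction


def exception_rate(exceptions: int, opportunities: int) -> Fraction:
    """Return exceptions / opportunities as an exact fraction."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return Fraction(exceptions, opportunities)


# Hypothetical entities, each given as (exceptions, opportunities).
entities = {
    "entity_a": (2, 20),
    "entity_b": (200, 2000),
}

# Score every entity by its exception rate alone.
rates = {name: exception_rate(e, n) for name, (e, n) in entities.items()}

# Both entities reduce to the same rate, so a rate-only ranking ties them
# even though entity_b's rate rests on one hundred times as many observations.
tied = rates["entity_a"] == rates["entity_b"]
```

Under these assumed counts, both entities score one in ten, so a ranking based only on the rate cannot distinguish them.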
Using the coded experience of previously-investigated claims, one can apply supervised learning algorithms to generate models that can be used to predict the likelihood of fraud based on various claim attributes. New claims that are predicted as the most likely to be fraudulent would be candidates for audit investigation. Reference in this regard may be made to “Strategies for detecting fraudulent claims in the automobile insurance industry,” Viaene et al., European Journal of Operational Research, August 2005. In this reference, Viaene et al. apply supervised learning methods to score and rank claims using historical data that includes outcomes of past investigations. However, in many domains and business processes, information on past investigations may be unavailable, in which case this approach cannot be used.
Commonly-assigned U.S. patent application Ser. No. 11/557,520, entitled “Apparatus, System, Method and Computer Program Product For Analysis of Fraud in Transaction Data,” describes techniques for determining if a particular claim is fraudulent by generating a score representing the probability of fraud. These techniques leverage the use of proxies. In some cases, proxies may not be available.
Commonly-assigned U.S. patent application Ser. No. 10/749,518, entitled “Resource-Light Method and Apparatus For Outlier Detection” (U.S. Patent Application Publication No. 2005/0160340), describes a method for outlier detection that can denote each instance in an n-dimensional feature space as a potential outlier.