With the increase in communications and electronic transactions, incidents of fraud surrounding these activities have increased. For example, “cloning” a cellular telephone is a type of telecommunications fraud in which an identifier for a cellular telephone, such as a serial number, is snooped, or read, as calls are transmitted. The captured identifier is then used to identify calls transmitted by other cellular telephones. When the other cellular telephones transmit calls, the calls may be fraudulently charged to the account holder for the original cellular telephone.
Another fraudulent activity is stealing credit card numbers. Some workers carry small readers for reading the vital information from a credit card. A person may get a job as a waiter or cashier in a restaurant, and when a customer provides a credit card, the card may be swiped once as part of payment and swiped again using the small reader. The credit information is captured, and the person misappropriating it will then use the information to make unauthorized purchases, or sell it to others who will make unauthorized purchases. In other schemes, a group of bad actors sets up bogus automated teller machines (ATMs). In one instance, a convenience store owner was given $100 to allow a bogus machine to be placed in the store. The bogus ATM included only a reader, so prospective customers would use the machine and then complain that it did not dispense money. The bad actor would pick up the machine after several days, take it for “repair,” and never return. The misappropriated credit card numbers would then be either sold or used to make various purchases.
In short, various fraudulent schemes result in large losses to various institutions. Generally, the losses amount to billions of dollars per year. Therefore, there is a large demand for systems and methods to detect fraudulent transactions. Some current systems and methods attempt to detect fraudulent transactions by constructing a model based on historical observations or transactions. By observing a large number of transactions, characteristics of fraud may be derived from the data. These characteristics can then be used to determine whether a particular transaction is likely to be fraudulent.
For example, characteristics of 100,000 transactions, such as phone calls or point of sale transactions, can be captured and later characterized as fraudulent or legitimate. The fraudulent calls among 100,000 calls may share similar characteristics and transaction patterns. Similarly, the fraudulent credit card transactions among 100,000 transactions may share a different set of similar characteristics and transaction patterns. The similar characteristics, in either case, are used to build a static model that indicates the probability of fraud for an incoming transaction, such as a transaction associated with a phone call, a point of sale transaction, an internet sales transaction, and the like. In certain systems, these static, historical models can be used in a production, or real-time, environment to evaluate a probability of fraud for incoming transactions. However, the historical model may be difficult to create and deploy.
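As a minimal sketch of such a static, historical model, the following estimates a fraud rate for each value of a single characteristic from labeled past transactions and uses that rate to score an incoming transaction. The field names (`channel`, `fraud`), the function names, and the four-record history are hypothetical and chosen only for illustration. Production models would typically combine many characteristics.

```python
from collections import defaultdict

def build_static_model(history, feature):
    """Estimate the fraud rate for each observed value of one characteristic."""
    counts = defaultdict(lambda: [0, 0])  # value -> [fraud count, total count]
    for txn in history:
        entry = counts[txn[feature]]
        if txn["fraud"]:
            entry[0] += 1
        entry[1] += 1
    return {value: fraud / total for value, (fraud, total) in counts.items()}

def score(model, txn, feature, default=0.0):
    """Look up the historical fraud rate for an incoming transaction.

    Values never seen in the history fall back to `default` (an assumption).
    """
    return model.get(txn[feature], default)

# Hypothetical labeled history: each record was later marked fraudulent or not.
history = [
    {"channel": "phone", "fraud": True},
    {"channel": "phone", "fraud": False},
    {"channel": "pos", "fraud": False},
    {"channel": "pos", "fraud": False},
]

model = build_static_model(history, "channel")
incoming_score = score(model, {"channel": "phone"}, "channel")
```

The model is static in the sense that it is fixed once the history has been processed. Scoring an incoming transaction is a lookup and does not update the estimates.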
The models formed for production generally include an indication of fraudulent or non-fraudulent activity. Historical transaction data is reviewed for variables that give indications of fraudulent activity. The variables are generally associated with one another in building a model on which future predictions of fraud can be based. One of the fundamental challenges in modelling is variable selection, that is, finding the group of variables that will best predict fraud.
Theoretically, every possible variable group drawn from the available variables may be evaluated by some objective criterion, such as accuracy of model prediction on a test data set, and the best variable group may be chosen to build the final model. Such an exhaustive search becomes non-scalable and impractical for a large number of variables, because n variables yield 2^n - 1 non-empty groups. The search therefore takes up large amounts of computing time and requires large amounts of computing resources, such as several parallel processors.
One common and almost universally practiced class of methods for variable selection is termed greedy methods. In these methods, the top variables are incrementally combined in the model. In the forward selection method, a particular greedy method, the top variable is first selected as a starting variable. The remaining variables are then re-evaluated, and the variable having the highest value, given all the variables selected so far, is selected as the next variable in the model. Although fast and practical, this method has the obvious disadvantage of potentially yielding a local maximum that depends on the first variable selected.
Another fundamental problem with trying to find a single variable group to detect fraud is the assumption that there is only one reason for the fraud, so that one group of variables is sufficient to predict it. This, again, is a fallacy, because there can be several possible reasons for fraud, and different variable groups might indicate different reasons. Thus, there is a need for another method of selecting variables for model building, one that avoids the pitfall of settling on a local maximum by being less greedy than the forward selection method and that, at the same time, allows for finding multiple variable groups, each capturing a different form of fraud.
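The forward selection method and its local-maximum pitfall can be illustrated with a short sketch. The variable names `a`, `b`, `c` and the scores in `TOY_SCORES` are invented for illustration. In practice the score would be an objective criterion such as prediction accuracy on a test data set. The toy scores are arranged so that the best single variable does not belong to the best pair.

```python
from itertools import combinations

def forward_selection(variables, score, k):
    """Greedy forward selection: repeatedly add the variable that gives the
    highest score in combination with the variables selected so far."""
    selected = []
    remaining = list(variables)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda v: score(frozenset(selected + [v])))
        selected.append(best)
        remaining.remove(best)
    return selected

def exhaustive_search(variables, score, k):
    """Evaluate every group of size k and return the highest-scoring one."""
    groups = (frozenset(c) for c in combinations(variables, k))
    return max(groups, key=score)

# Hypothetical objective values for each variable group (illustration only).
TOY_SCORES = {
    frozenset("a"): 0.60,
    frozenset("b"): 0.50,
    frozenset("c"): 0.50,
    frozenset("ab"): 0.65,
    frozenset("ac"): 0.65,
    frozenset("bc"): 0.90,
}

def toy_score(group):
    return TOY_SCORES[group]

greedy_group = forward_selection(["a", "b", "c"], toy_score, k=2)
best_group = exhaustive_search(["a", "b", "c"], toy_score, k=2)
```

With these toy scores, forward selection commits to `a` because it is the best single variable and can then reach at most 0.65, whereas the exhaustive search finds the pair of `b` and `c` at 0.90. The exhaustive search is feasible here only because there are three variables.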