Predictive models for characterizing whether a certain data transaction, such as an authorization for a credit or debit card payment, is indicative of fraud typically base such decisions on a plurality of inputs. These inputs can, for example, comprise continuous (e.g., any value within a range), binary (e.g., true/false), or categorical variables (e.g., merchant code, employee number, etc.).
Conventional predictive models have difficulty characterizing ‘cross’ interactions between categorical variables and other continuous or binary variables. A cross interaction, in this context, means that the risk function conditioned on the other variables differs significantly from one value of the categorical variable to another. For instance, transactions for in-home domestic services, such as carpet cleaning, could conceivably have a substantial probability of being fraudulent if they occur in foreign countries far from the cardholder's home, since most cardholders purchase such services for their own homes. However, transactions for tourist-oriented travel and entertainment services may well be legitimate if they take place overseas. Similarly, the transaction amounts that indicate risk will depend on the type of merchant.
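The notion of a cross interaction can be illustrated with a minimal sketch. The records, category names, and helper functions below are hypothetical and invented purely for illustration; they are not drawn from any real data or from any particular model. The sketch computes an empirical fraud rate for each combination of merchant category and a binary foreign-transaction flag, then compares how much the flag shifts the rate within each category.

```python
from collections import defaultdict

# Hypothetical toy records: (merchant_category, is_foreign, is_fraud).
# All values are invented for illustration only.
TRANSACTIONS = [
    ("carpet_cleaning", False, False),
    ("carpet_cleaning", False, False),
    ("carpet_cleaning", False, False),
    ("carpet_cleaning", False, True),
    ("carpet_cleaning", True, True),
    ("carpet_cleaning", True, True),
    ("carpet_cleaning", True, True),
    ("carpet_cleaning", True, False),
    ("travel", False, False),
    ("travel", False, False),
    ("travel", False, False),
    ("travel", False, True),
    ("travel", True, False),
    ("travel", True, False),
    ("travel", True, False),
    ("travel", True, True),
]


def fraud_rate_by_group(records):
    """Return {(category, is_foreign): empirical fraud rate}."""
    counts = defaultdict(lambda: [0, 0])  # [fraud count, total count]
    for category, is_foreign, is_fraud in records:
        counts[(category, is_foreign)][0] += int(is_fraud)
        counts[(category, is_foreign)][1] += 1
    return {key: fraud / total for key, (fraud, total) in counts.items()}


def foreign_effect(rates, category):
    """Change in fraud rate when a transaction in `category` is foreign."""
    return rates[(category, True)] - rates[(category, False)]


rates = fraud_rate_by_group(TRANSACTIONS)
for category in ("carpet_cleaning", "travel"):
    print(category, foreign_effect(rates, category))
```

In this toy data the foreign flag raises the fraud rate sharply for the in-home service category and leaves it unchanged for the travel category. A model that assigns the foreign flag a single weight shared across all categories cannot represent both behaviors at once, which is the difficulty described above.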