1. Field of the Invention
The present invention relates to artificial intelligence, and more particularly to business insider threats detectable by automated system administrator behavior analysis.
2. Description of Related Art
Big business is now routinely controlled by large computer systems and networks that operate on a scale so vast and quick as to be incomprehensible to average people. These computer systems and networks are, in turn, steered, monitored, and cared for by system administrators who have super access to all parts.
If the access privileges that system “admins” hold are abused, major economic and security damage can be caused all too silently to a company.
It has not been lost on investigators and analysts in general that evildoers and other perpetrators will behave in unusual ways, especially during the moments leading up to the commission of a crime. With computer systems, bad people often get away with impersonating real authorized users, but their odd behaviors will give them away.
So too with business insiders who have authorized access, but then abuse their privileges. But a special case is presented with system administrators because their behaviors are normally very chaotic, and patterns of normal behavior are absent even when their privileges are not being abused.
Twenty-three years ago Krishna Gopinathan, et al., proposed an automated system for fraud detection using predictive modeling. See, U.S. Pat. No. 5,819,226, filed Sep. 8, 1992. Neural networks were trained with historical transactional data, and used thereafter during operation to identify suspicious transactions based on learned relationships among the known variables. Their system periodically monitored a compliance metric of its fraud detection rate and its false positive rate. When their compliance metric fell below a minimum value, the system would automatically redevelop and adapt the fraud model.
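The monitoring loop described above can be sketched as follows. This is a minimal illustration only, not the method of the cited patent: the function names are hypothetical, and combining the two rates as detection rate minus false positive rate is an assumption made here for simplicity, since the text above does not say how the compliance metric is computed.

```python
from typing import List, Tuple

# Each outcome is (flagged_by_model, actually_fraud). Hypothetical data layout.
Outcome = Tuple[bool, bool]


def compliance_metric(outcomes: List[Outcome]) -> float:
    """Combine detection rate and false positive rate into one number.

    ASSUMPTION: metric = detection_rate - false_positive_rate. The cited
    system's actual formula is not given in the text above.
    """
    tp = sum(1 for flagged, fraud in outcomes if flagged and fraud)
    fn = sum(1 for flagged, fraud in outcomes if not flagged and fraud)
    fp = sum(1 for flagged, fraud in outcomes if flagged and not fraud)
    tn = sum(1 for flagged, fraud in outcomes if not flagged and not fraud)
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate - false_positive_rate


def needs_redevelopment(outcomes: List[Outcome], minimum: float) -> bool:
    """True when the metric has fallen below the minimum acceptable value."""
    return compliance_metric(outcomes) < minimum
```

A periodic job would call `needs_redevelopment` on recently labeled outcomes and trigger retraining of the single shared model when it returns true.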
What's not clearly disclosed is that only one model is ever developed and redeveloped from all the past transactional data to fit all the cardholders. Models are not individually created and assigned to track individual cardholders. That would work all right if the cardholders were fungible, but they're not, and each individual behaves independently and sometimes unpredictably.
A “profile record” is created for cardholders by Krishna Gopinathan, et al., using the previous month's authorizations and cardholder data. Updates of individual cardholder activity use previous profile-record values and the previous month's authorizations and cardholder data. A “cascaded operation” adds a second neural network model trained only with transactions that achieved fraud scores from the first neural network model. Evidently cascades of three or four levels are possible.
Krishna Gopinathan, et al., provide a flowchart (FIG. 16) of a real-time system using the profile database. Upon receiving a merchant's request for authorization on a transaction 1602, the system obtains data for the current transaction 1603, as well as profile data summarizing transactional patterns for the customer 1604. It then applies this data to the stored neural network model 1605. A fraud score (representing the likelihood of fraud for the transaction) is obtained 1606 and compared to a threshold value 1607. Steps 1601 through 1607 occur before a transaction is authorized, so that the fraud score can be sent to an authorization system 1608 and the transaction blocked by the authorization system if the threshold has been exceeded. If the threshold is not exceeded, the low fraud score is sent to the authorization system 1609. The system then updates a customer profile database 806 with the new transaction data 1610. Thus, in this system, profile database 806 is always up to date (unlike the batch and semi-real-time systems, in which profile database 806 is updated only periodically).
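The real-time flow just described can be sketched in a few lines. This is an illustrative sketch only: `Profile`, `authorize`, and `toy_score` are hypothetical names, the profile holds just a count and a running mean, and `toy_score` is a simple stand-in for the stored neural network model, which is not reproduced here. The step numbers in the comments refer to the flowchart steps quoted above.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Profile:
    """Hypothetical stand-in for one record in the profile database."""
    transaction_count: int = 0
    average_amount: float = 0.0


def toy_score(amount: float, profile: Profile) -> float:
    """ASSUMPTION: a toy scorer, not a neural network. Scores in [0, 1]
    grow as the amount exceeds the customer's running average."""
    if profile.transaction_count == 0 or profile.average_amount <= 0:
        return 0.0
    return min(1.0, (amount / profile.average_amount) / 10.0)


def authorize(customer_id: str, amount: float, profiles: Dict[str, Profile],
              score_fn: Callable[[float, Profile], float],
              threshold: float) -> dict:
    profile = profiles.setdefault(customer_id, Profile())  # profile data (1604)
    fraud_score = score_fn(amount, profile)                # apply model (1605-1606)
    blocked = fraud_score > threshold                      # compare to threshold (1607)
    # Update the profile with the new transaction (1610), so it is always current.
    profile.transaction_count += 1
    profile.average_amount += (amount - profile.average_amount) / profile.transaction_count
    # The score is reported either way (1608 when blocked, 1609 otherwise).
    return {"fraud_score": fraud_score, "blocked": blocked}
```

Note that the same scoring function is applied to every customer; only the profile passed to it differs from one cardholder to the next.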
The customer data from database 806 typically includes general information on the customer; data on all approved or declined transactions in the previous seven days; and, a profile record of data describing the customer's transactional pattern over the last six months. The general information on the customer typically includes customer zipcode; account open date; and card expiration date. Each profile record in the profile database summarizes the customer's transactional patterns as moving averages. The profile records are updated periodically, e.g., monthly, with all the customer transactions from the period.
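A periodic profile update of this kind might look like the sketch below. The text above says only that the profile records hold moving averages; the choice of an exponential moving average, the smoothing factor `alpha`, and both function names are assumptions made here for illustration.

```python
from typing import Iterable


def update_moving_average(previous: float, observation: float,
                          alpha: float = 0.2) -> float:
    """Blend one new observation into a running average.

    ASSUMPTION: exponential weighting. alpha near 1 favors recent
    activity; alpha near 0 favors history.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    return previous + alpha * (observation - previous)


def update_profile(previous: float, period_amounts: Iterable[float],
                   alpha: float = 0.2) -> float:
    """Fold all transaction amounts from one period (e.g., a month)
    into the stored average, oldest first."""
    average = previous
    for amount in period_amounts:
        average = update_moving_average(average, amount, alpha)
    return average
```

Because the record is refreshed only at the end of each period, any behavior that changes mid-period is invisible to the profile until the next update.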
Periodic redevelopment of the models makes it sound like the system can self-adapt. But their system constantly needs ever-improving training data that may not exist. The only diversity amongst the cardholders is in their respective transactions, not the fraud models being applied to them. Compliance only initially reaches optimum, and falls off immediately. Worse, each subsequent model redevelopment costs time offline. New kinds of fraud that evolve will disrupt such models because they're not equipped to evolve in tandem.
The shortcoming of these neural network models is that the desired output for each input must be known before any training begins. That can be very limiting. During training, if any of the desired outputs are left unknown for some input patterns, new incidences of fraud and abuse will go undetected in real-time. Detection that lags infection will exact a cost.
Neural networks, statistical modeling and profiling have been applied to fraud and abuse detection. But for them to be effective, they need a large database of cases in which fraud and abuse were detected. And for this to keep working later, the methods of fraud and abuse must not have changed much. Such tools are impotent when the fraud and abuse either too closely resembles normal activity, or constantly shifts as the fraudsters adapt to changing surveillance strategies and technologies.
Conventional analytic solutions, even those that claim to be non-hypothesis based, still operate within very rigid boundaries. They are either designed or tuned to look at various scenarios in such a way that they will only catch a limited range of the leakage problem. When something truly surprising happens, or a variation occurs that was not anticipated, systems based on such models fail to complete.
Modern systems need to be sophisticated, unsupervised, and learn as they go. New behaviors of fraud and abuse arise daily.
Conventional solutions to fraud have obtained only mediocre results. They lack scalability and always require high manual effort. We can do better.