The invention generally relates to monitoring the behavior of rules that respond to input data received within a production environment. More specifically, it relates to detecting when previously defined rules behave anomalously in a production setting. Typically, but not always, the rule sets are created through machine learning on training data.
In many environments, a large amount of data that records experience over time within the environment can be, or has been, collected. For example, a healthcare environment may record clinical data, diagnoses, and treatment regimens for a large number of patients, as well as outcomes. A business environment may record customer information such as who the customers are, what they do, and their browsing and purchasing histories. A computer security environment may record a large number of software code examples that have been found to be malicious. A financial asset trading environment may record historical price trends and related statistics about numerous financial assets (e.g., securities, indices, currencies) over a long period of time. Despite the large quantities of such data, or perhaps because of them, deriving useful knowledge from such data stores can be a daunting task.
Existing anomalous behavior detection techniques require manually created rules based on human observation of desirable or undesirable human behavior. There are at least two different kinds of behavior to address: independent and contextual. Independent behaviors can be evaluated for normalcy independently of any other instances of behavior. In other words, no historical pattern is needed to determine whether the behavior is expected or unexpected, normal or abnormal. For example, detecting security breaches in a computer network may include looking for multiple login attempts within a short period of time. Such behavior can be considered suspect without considering whether the user often mistypes their password. Another example is a credit card fraud detection system that may look for purchases of expensive jewelry on a credit card without considering the frequency with which the account holder buys jewelry.
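By way of illustration only, the login example above can be sketched as a rule that inspects a single burst of activity and consults no per-user history. The function name `too_many_login_attempts` and the thresholds (five attempts, sixty seconds) are hypothetical values chosen for the sketch and do not appear in any particular system.

```python
from collections import deque


def too_many_login_attempts(timestamps, max_attempts=5, window_seconds=60.0):
    """Return True if more than max_attempts occur within any window_seconds span.

    timestamps: login attempt times in seconds. No user history is consulted,
    which is what makes this an "independent" check.
    The default thresholds are illustrative assumptions.
    """
    window = deque()
    for t in sorted(timestamps):
        window.append(t)
        # Discard attempts that fall outside the window ending at time t.
        while t - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_attempts:
            return True
    return False


if __name__ == "__main__":
    burst = [0, 5, 10, 15, 20, 25]           # six attempts in 25 seconds
    spread = [0, 100, 200, 300, 400, 500]    # six attempts over 500 seconds
    print(too_many_login_attempts(burst))    # flagged
    print(too_many_login_attempts(spread))   # not flagged
```

The check returns the same answer for a user who habitually mistypes a password as for any other user, which is the defining property of an independent rule.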
Contextual behavior is evaluated for normalcy based on conformity to an established pattern of behavior. Taken in isolation, an individual action is neither good (normal) nor bad (abnormal). Instead, normalcy is defined in terms of deviation from an established pattern. For example, a credit card fraud system may use information about a card holder and the card holder's purchase history to determine whether a purchase is out of the ordinary for that card holder.
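A minimal sketch of such a contextual check follows, assuming for illustration that the card holder's established pattern is summarized by the mean and standard deviation of past purchase amounts. The z-score test, the function name `is_unusual_purchase`, and the threshold of 3.0 are assumptions made for the sketch, not a description of any particular fraud system.

```python
from statistics import mean, stdev


def is_unusual_purchase(amount, history, z_threshold=3.0):
    """Return True if amount deviates strongly from this card holder's history.

    history: past purchase amounts for the same card holder.
    The z-score test and the threshold are illustrative assumptions.
    """
    if len(history) < 2:
        return False  # not enough history to establish a pattern
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past purchases were identical; any different amount deviates.
        return amount != mu
    return abs(amount - mu) / sigma > z_threshold


if __name__ == "__main__":
    modest_spender = [20, 25, 30, 22, 28]
    large_spender = [450, 520, 480, 510, 490]
    # The same 500 purchase is judged differently depending on the context.
    print(is_unusual_purchase(500, modest_spender))
    print(is_unusual_purchase(500, large_spender))
```

The same purchase amount is flagged for one card holder and accepted for another, because the verdict depends on the established pattern rather than on the action alone.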
Existing systems that perform anomalous behavior detection evaluate the normalcy of human behavior. Usually, human behavior has intent and motive that can be anticipated. For example, a user in a computer network might conduct a denial-of-service attack, and a network security system would have hand-crafted rules that monitor network traffic patterns to detect such an attack. Attempting to detect and prevent undesirable behaviors usually involves manually crafted rules that encode conditions observed in the past to correlate with the undesirable human behavior. One example is a rule that flags a potentially fraudulent purchase when a point-of-sale (POS) purchase is made at a location more than a certain number of miles away from the account holder's home address. A rule may also flag anomalous behavior when conditions are inconsistent, such as when two POS transactions occur within an amount of time that is less than the time it would take to travel from the first POS location to the second. Humans observe the correlation between certain data values in a credit card transaction and the likelihood of fraud, and humans encode these observations into rules for monitoring human behavior.
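The two hand-crafted POS rules described above might be sketched as follows. This is an illustration under stated assumptions: locations are given as (latitude, longitude) pairs in degrees, distance is approximated with the haversine great-circle formula, and the 500-mile and 600-mph thresholds, like the function names, are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def far_from_home(home, pos, max_miles=500.0):
    """Flag a purchase made more than max_miles from the home address."""
    return haversine_miles(*home, *pos) > max_miles


def impossible_travel(pos_a, hour_a, pos_b, hour_b, max_speed_mph=600.0):
    """Flag two transactions that are closer in time than travel would allow."""
    distance = haversine_miles(*pos_a, *pos_b)
    elapsed_hours = abs(hour_b - hour_a)
    min_travel_hours = distance / max_speed_mph
    return elapsed_hours < min_travel_hours


if __name__ == "__main__":
    new_york = (40.7128, -74.0060)
    los_angeles = (34.0522, -118.2437)
    print(far_from_home(new_york, los_angeles))
    print(impossible_travel(new_york, 0.0, los_angeles, 1.0))   # one hour apart
    print(impossible_travel(new_york, 0.0, los_angeles, 10.0))  # ten hours apart
```

Every threshold in the sketch is a human judgment encoded directly in the rule, which reflects the manual character of the existing approach described above.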