1. Technical Field
The present invention relates to identifying at least one property of data. More particularly, the invention concerns an intelligently interactive system for identifying at least one property of data.
2. Description of Related Art
It is frequently useful to profile data. For example, data may be profiled to determine the expected risk of fraud in a credit card transaction, or the risk of terrorism that a freight shipment poses, or the risk that a patient has a serious medical condition. Data profiling can also be applied to ascertain the chances that a viewer will enjoy a movie, the chances that a person will be compatible with another person in a dating service database, or the chances that a stock will go up or down as the result of a set of economic conditions.
Known methods of profiling data may involve applying behavioral rules prescribed by human experts to the data. As an example, a behavioral rule could assign a high level of risk of fraud to a credit card transaction, if the credit card used for the transaction has been reported lost. As another example, a behavioral rule could assign a high level of terrorism risk to a shipping container, if a high level of radioactivity is measured outside the container.
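By way of illustration only, the rule-based profiling described above may be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular known system: the names BehavioralRule, profile, and RULES, the record field names, the 0-to-1 risk scale, and the radioactivity threshold are all hypothetical and are introduced solely for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]  # one item of data to be profiled (hypothetical layout)


@dataclass
class BehavioralRule:
    """An expert-prescribed rule: if the condition holds, assign this risk."""
    name: str
    condition: Callable[[Record], bool]
    risk: float  # hypothetical scale: 0.0 (no risk) to 1.0 (highest risk)


def profile(record: Record, rules: List[BehavioralRule]) -> float:
    """Return the highest risk assigned by any matching rule, or 0.0 if none match."""
    return max((r.risk for r in rules if r.condition(record)), default=0.0)


# The two example rules from the text; the numeric values are assumptions.
RULES = [
    BehavioralRule(
        name="card_reported_lost",
        condition=lambda r: r.get("card_reported_lost") is True,
        risk=0.9,
    ),
    BehavioralRule(
        name="high_external_radioactivity",
        # Assumed threshold, in microsieverts per hour, chosen only for illustration.
        condition=lambda r: float(r.get("radioactivity_usv_h", 0.0)) > 1.0,
        risk=0.9,
    ),
]

if __name__ == "__main__":
    print(profile({"card_reported_lost": True}, RULES))
    print(profile({"radioactivity_usv_h": 0.1}, RULES))
```

In this sketch every rule is evaluated for every record, so the processing time grows with the number of rules, and a record that matches no rule receives no risk at all regardless of its other characteristics.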
One shortcoming of using only behavioral rules prescribed by human experts for profiling data is that the experts may have insufficient knowledge to prescribe rules. Other shortcomings of using only rules prescribed by human experts are that the experts may prescribe incorrect rules, or may prescribe conflicting rules. Another shortcoming is that humans typically cannot develop rules quickly, and are slow to react when rules need to change. Yet another shortcoming is that over time, the number of rules prescribed by human experts may grow very large and may require a long time to process, which could make the profiling too slow for many applications. For example, a method for profiling data to determine the risk of fraud for a credit card transaction must complete within several seconds in order to be useful for many applications. Another shortcoming of using only rules prescribed by human experts is that some of the rules may be difficult or impossible to implement. Existing methods for profiling data have additional shortcomings, such as lacking an automatic feedback loop for improving the rules prescribed by human experts, and not being reactive or proactive to the user.
Existing methods for profiling data that utilize machine learning in the form of neural networks merely function as black boxes that produce an output, and also are not reactive or proactive to the user. The lack of user feedback in these methods limits the accuracy of the results, and limits the capability of these methods to adapt to changed circumstances or to correct errors. Further, existing methods that utilize machine learning rely excessively on supervised learning, which may limit the accuracy and usefulness of the results in cases where feedback is limited or nonexistent.
In summary, existing methods for profiling data are inadequate for many applications.