A number of applications today need to manage data that are imprecise. For example, imprecisions arise in fuzzy object matching across multiple databases, in data extracted automatically from unstructured text, in automatic schema alignments, in sensor data, and in activity recognition data, and many other types of imprecise data exist as well. In some cases, it is possible to eliminate the imprecisions completely, but doing so is usually very costly, as with the manual removal of ambiguous matches in data cleaning. In other cases, complete removal of imprecision is not even possible, e.g., in human activity recognition.
Modern enterprise applications are forced to deal with unreliable and imprecise information, but they can often tolerate such imprecisions, especially in areas like search or business intelligence. However, a system that tolerates imprecisions needs to be able to rank query results based on the degree of their uncertainty. It would therefore be desirable to develop techniques to automatically manage imprecisions in data and to rank query answers according to the probability that the answers are correct. A technique for efficiently accomplishing such a task is not currently available.