The present invention relates to discovering unusual, unexpected, or anomalous information and trends in high-throughput data streams and databases, and more specifically to using probabilistic surprisal context filters to discover such information and trends.
Discovering unexpected information and trends in high-throughput data streams and ultra-large data structures is very difficult, and it is especially problematic to do so in a manner that approximates real time. Such unexpected information and trends are especially useful to decision makers, yet they cannot be found through data mining, classic queries, or big data stream processing. Here, big data is defined as data that exceeds the processing capacity of conventional database systems: the data is too big, moves too fast, or does not fit the structures of common database architectures.