Information can be culled from events that occur in relation to large, streamed data sets. However, it can be impracticable or impossible to store a large, streamed data set or its associated event data in full, and similarly impracticable or impossible to determine exactly the k most frequently occurring elements (also referred to as the top-k elements) within the large, streamed data set or its associated event data.
Conventional methods and systems employ probabilistic solutions to store selected event data associated with large, streamed data sets. However, such conventional methods and systems do not support queries about arbitrary, selectable intervals (e.g., time intervals) associated with the event data. For example, some conventional methods and systems are capable of providing top-k information about the event data for infinite time intervals only.
Such conventional methods and systems have generally been considered satisfactory for their intended purpose. However, there is still a need in the art for handling top-k queries that request top-k information about event data associated with a large, streamed data set for an arbitrary, selectable interval (e.g., a time interval), including when the event data is stored using a probabilistic solution. The present disclosure provides a solution for these problems.