Analysis techniques for large data sets have been loosely categorized into batched and real-time systems and methods. Both batched and real-time systems collect data as the data appears, with varying amounts of latency. However, batched and real-time systems subsequently query the data in different ways. Each type of system has advantages and disadvantages that make it suited to particular types of problems. For example, an advantage of real-time systems is that the results of a query are updated immediately and are available even as new data is flowing into the system. However, real-time systems may need the exact set of statistics being updated to be determined beforehand. A batched system has the advantage of allowing any ad-hoc query to be made, but the time taken to return the results of that query depends on the computational resources available to the system. Simpler queries will take less time to return results than more complex queries, and a complex query can take an indeterminate amount of time to return results. Another problem is that a batched system query is run on a view of the data at a particular time and is not updated by new data coming into the batched system unless the query is re-run.
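The contrast described above can be illustrated with a minimal sketch. The class names RealTimeCounter and BatchStore, the event fields "type" and "ms", and the sample events are hypothetical and are introduced only for illustration; they do not describe any particular system.

```python
from collections import Counter


class RealTimeCounter:
    """Real-time style: the statistic (event count per type) is fixed beforehand."""

    def __init__(self):
        self.counts = Counter()

    def ingest(self, event):
        # The pre-declared statistic is updated as each event arrives.
        self.counts[event["type"]] += 1

    def result(self):
        # Always current, but only answers the question chosen in advance.
        return dict(self.counts)


class BatchStore:
    """Batched style: raw events are kept so that any ad-hoc query can be run later."""

    def __init__(self):
        self.events = []

    def ingest(self, event):
        self.events.append(event)

    def query(self, predicate):
        # The query scans a snapshot of the data; its cost grows with the data
        # size, and the result is not updated by events that arrive afterwards.
        snapshot = list(self.events)
        return [e for e in snapshot if predicate(e)]


if __name__ == "__main__":
    rt, bs = RealTimeCounter(), BatchStore()
    for ev in [{"type": "click", "ms": 120}, {"type": "view", "ms": 40}]:
        rt.ingest(ev)
        bs.ingest(ev)
    print(rt.result())                        # counts per type, always up to date
    print(bs.query(lambda e: e["ms"] > 100))  # ad-hoc query over the snapshot
```

In this sketch the real-time counter cannot answer a question about the "ms" field because that statistic was not chosen beforehand, while the batched store can answer it but must be queried again to reflect newly ingested events.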
What is needed is an improved hybrid system and method that includes features of both batched and real-time systems and methods.