Data analytics, such as in the field of complex event processing (CEP), may require real-time performance in order for responses generated from received data to remain timely and relevant. For example, in the field of retail, a system operated by a vendor can generate shopping information (e.g., an advertisement for a product, the location of a product in a shop, etc.) for a potential customer based on location information of that customer. Often, the system must provide the shopping information to the customer in a timely manner (e.g., while the customer remains at the retail location) to avoid the shopping information becoming stale and/or irrelevant.
Also, real-time processing of data from multiple, different sources may be needed to generate a response. Using the retail example above, the system may receive identification information, preference information, and location information of the potential customer, as well as product advertisement information, etc., from different data sources, and process the information in real-time to generate the aforementioned shopping information. Each of the data sources typically stores multiple records, with each record representing an association among at least some of the information. For example, one of the data sources may store records that associate customers with their location information, while another data source may store records that associate location information with product information. To generate additional information (e.g., the response containing the shopping information described above), the system may need to acquire these records from the different data sources in real-time.
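By way of illustration only, the combination of records from two data sources may be sketched as follows. The record layouts, the sample values, and the function name `build_shopping_info` are hypothetical and are not taken from any particular system; the sketch merely assumes that one source associates customers with locations and another associates locations with products.

```python
from collections import defaultdict

# Hypothetical records from a first data source: (customer, location).
customer_locations = [("alice", "aisle-3"), ("bob", "aisle-7")]

# Hypothetical records from a second data source: (location, product).
location_products = [
    ("aisle-3", "coffee"),
    ("aisle-3", "tea"),
    ("aisle-7", "batteries"),
]


def build_shopping_info(customer_locations, location_products):
    """Associate each customer with the products at the customer's location."""
    # Index the second source by location so each lookup is a dictionary access.
    products_by_location = defaultdict(list)
    for location, product in location_products:
        products_by_location[location].append(product)

    # Join the two sources on the shared location field.
    return {
        customer: list(products_by_location.get(location, []))
        for customer, location in customer_locations
    }


shopping_info = build_shopping_info(customer_locations, location_products)
```

In this sketch the shared location field serves as the key that links a record from one source to records from the other.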
A conventional system typically accumulates records before generating additional information. For example, in the retail environment above, a conventional system may accumulate records for all of the potential customers at a specific location before passing the accumulated records on for generation of the shopping information.
The inventors here have recognized several technical problems with such conventional systems. First, delaying generation of the additional information until the records are accumulated increases response time. Further, the length of the delay can depend on the number of records to be accumulated, adding unpredictability to response times. Moreover, such an arrangement fails to leverage multi-threaded computing resources efficiently, because thread execution (e.g., for transmitting the records or for generating the additional information) is put on hold until the records are accumulated, even when other threads are available to transmit the records and/or generate the additional information.
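The dependence of the delay on the number of accumulated records may be illustrated with the following simplified model, offered only as a sketch. It assumes, hypothetically, that records arrive at known, non-decreasing times, that a batch is released when its last record arrives, and that processing itself takes no time; the function name `batch_latencies` is likewise an illustrative assumption.

```python
def batch_latencies(arrival_times, batch_size):
    """Return how long each record waits for its batch to be released.

    Assumes arrival_times is sorted in non-decreasing order and that a
    batch (including a final partial batch) is released at the arrival
    time of its last record.
    """
    latencies = []
    for start in range(0, len(arrival_times), batch_size):
        batch = arrival_times[start:start + batch_size]
        release_time = batch[-1]  # the batch is held until its last record arrives
        latencies.extend(release_time - t for t in batch)
    return latencies


# Six records arriving one time unit apart (hypothetical values).
arrivals = [0, 1, 2, 3, 4, 5]

per_record = batch_latencies(arrivals, 1)   # no accumulation
small_batch = batch_latencies(arrivals, 3)  # accumulate three records
full_batch = batch_latencies(arrivals, 6)   # accumulate all six records
```

Under these assumptions, the earliest record in each batch waits the longest, and the worst-case wait grows with the batch size, which is consistent with the unpredictability noted above.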