There is a growing demand for analyzing, in real time, information that occurs continuously at a high rate, such as in the automation of stock trading, advanced traffic information processing, and the analysis of click streams, and for promptly taking action when important events occur. Against this background, stream data processing, which realizes real-time processing of high-rate data, has attracted attention. Since stream data processing is a general-purpose middleware technique applicable to various kinds of data processing, it makes it possible to reflect real-world data in business operations in real time while responding to rapid changes in the business environment, which cannot be supported when a system is constructed separately for each case.
A stream targeted by stream data processing is time-series data in which tuples, that is, data items with time stamps, arrive continuously. When a user of the stream data processing system defines a monitoring rule for this stream as a query, the query definition is converted into a query graph. The query graph is a directed graph in which processing units called operators are the nodes and tuple queues between the operators are the edges. Processing advances in a data-flow manner by causing the individual tuples that constitute an input stream to pass through the query graph. Because the processing is of the data-flow type, the throughput can be improved by dividing the query graph into multiple parts and processing them in parallel in a pipeline manner using multiple computation resources.
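As a minimal sketch of the data-flow model described above, the following hypothetical Python example connects operators (nodes) with tuple queues (edges) and runs each operator on its own thread, so that successive stages overlap in a pipeline manner. The operator functions, field names, and end-of-stream convention are illustrative assumptions, not part of any particular system.

```python
import queue
import threading

class Operator(threading.Thread):
    """One node of a query graph: reads tuples from its input queue,
    applies a function that maps one tuple to zero or more output
    tuples, and writes the results to its output queue."""

    def __init__(self, fn, in_q, out_q):
        super().__init__(daemon=True)
        self.fn, self.in_q, self.out_q = fn, in_q, out_q

    def run(self):
        while True:
            t = self.in_q.get()
            if t is None:              # end-of-stream marker (assumed convention)
                self.out_q.put(None)   # propagate downstream
                return
            for out in self.fn(t):
                self.out_q.put(out)

# Build a two-operator pipeline: a filter stage followed by a projection stage.
q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
stages = [
    Operator(lambda t: [t] if t["price"] > 100 else [], q_in, q_mid),
    Operator(lambda t: [(t["ts"], t["sym"])], q_mid, q_out),
]
for s in stages:
    s.start()

# Feed an input stream of time-stamped tuples, then the end marker.
for tup in [{"ts": 1, "sym": "A", "price": 90},
            {"ts": 2, "sym": "B", "price": 150}]:
    q_in.put(tup)
q_in.put(None)

results = []
while (t := q_out.get()) is not None:
    results.append(t)
print(results)  # [(2, 'B')]
```

Because each stage is a separate thread connected only by queues, tuple *n* can be projected while tuple *n*+1 is still being filtered, which is the pipeline parallelism that lets throughput scale with the number of computation resources.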
Meanwhile, a very strict requirement, on the order of milliseconds to microseconds, is imposed on the time from the occurrence of an event to the generation of an action, that is, on the latency. Therefore, making latency performance and throughput performance compatible is an important technical issue in stream data processing.
As background art in this technical field, there is JP 2010-204880 A (PTL 1). This publication discloses "Throughput related to query processing of stream data in a stream data processing system is improved. When a data delay occurs with respect to a query group that groups queries, a scheduler of a server device calculates a query load evaluation value for each query constituting the query group on the basis of at least one of input flow rate information and latency information, divides the queries constituting the query group into multiple query groups such that the sums of the query load evaluation values are substantially equal to each other, and reassigns the divided query groups to respective processors" (see the abstract of PTL 1).
Moreover, there is JP 2008-146503 A (PTL 2). This publication discloses "There is provided a multiprocessor system including a processor unit (PU) for control, multiple sub-processor units (SPUs) for operation, each of which has a local memory, and a main memory. In a multi-task environment in which multiple tasks are executed in parallel by time-dividing the computation resources of each SPU and assigning them to the tasks, an operating system that operates on the multiple SPUs includes: a function of constructing a pipeline processing system that executes specific processing consisting of multiple tasks of different loads by passing the execution result of one task to other tasks, and of operating the pipeline processing system multiple times; and a function of loading a task whose context has been saved to the main memory and which is in a ready state into the local memory of an SPU on which no task is being executed, and executing that task" (see the abstract of PTL 2).
Moreover, there is JP 2010-108152 A (PTL 3). This publication discloses "There are provided a stream data processing method and system that can realize general data processing, including recursive processing, at a low latency. The stream data processing system constructs a single operator graph from the execution trees of multiple queries, decides the operator execution order such that stream computation advances in one direction from input to output, and monitors the ignition times of an external ignition operator that inputs external system data and of an internal ignition operator that generates data in a time-limited manner; an operator execution control unit takes the operator with the earliest ignition time as a base point and, according to the decided operator execution order, repeats processing that completes the processing of that time within the operator graph" (see the abstract of PTL 3).