This invention relates to a method of setting a distributed stream data processing system into which flow control is incorporated.
In recent years, stream data processing systems capable of summing up and analyzing data in real time have been attracting attention. The term “stream data” used herein refers to a sequence of pieces of data that arrive continuously. In the IoT era, the distributed stream data processing system is expected to be used to analyze data obtained from various apparatuses for the purpose of system improvement or the like.
In the distributed stream data processing system, a plurality of queries that form analysis processing or the like are arranged in a plurality of computers, and the queries are executed by each of the computers. A generation order of pieces of data and a reception order of the pieces of data may fail to match each other depending on the coupling relationship in a network, the arrangement of the computers, or the like.
The plurality of queries to be executed include a query that is required to process pieces of data in time-series order. In a case where the generation order of the pieces of data and the reception order of the pieces of data differ from each other, such a query outputs an incorrect processing result. Therefore, in order to guarantee the consistency of processing results, a mechanism is required that makes the reception order of the pieces of data consistent with their generation order.
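The effect of an order mismatch can be illustrated with a minimal sketch. The tuple layout `(generation_time, value)`, the `deltas` query, and the sample values are assumptions chosen only for illustration and do not come from any particular system.

```python
# Hypothetical tuple layout: (generation_time, value).

def deltas(stream):
    """Order-sensitive query: difference between each value and the one before it."""
    return [cur[1] - prev[1] for prev, cur in zip(stream, stream[1:])]

# Tuples in the order in which they were generated.
generated = [(1, 10.0), (2, 12.0), (3, 11.0)]
# The same tuples in the order in which a downstream computer received them.
received = [(1, 10.0), (3, 11.0), (2, 12.0)]

correct = deltas(generated)
wrong = deltas(received)

# Reordering by generation time before running the query restores the result.
restored = deltas(sorted(received, key=lambda t: t[0]))

print(correct, wrong, restored)
```

The query over the received order differs from the query over the generated order, while sorting by generation time reproduces the intended result. Sorting a finite list is a simplification; a real stream would need buffering with some bound on lateness.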
As a means of achieving such consistency, there is known, for example, the technology described in US 2011/0093491 A1. According to US 2011/0093491 A1, for summation processing that can be partitioned in units of groups, an execution module partitions the summation processing based on tuple times, and the partitioned times are used as the summation processing times by a computer in the subsequent stage.
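The general idea of partitioning a summation by tuple time can be sketched loosely as follows. This is not the implementation of the cited publication; the window width `WINDOW`, the function names `partial_sums` and `merge`, and the sample data are all hypothetical.

```python
from collections import defaultdict

WINDOW = 10  # hypothetical partition width, in the same units as the tuple time

def partial_sums(tuples, window=WINDOW):
    """Upstream stage: sum values per partition, keyed by the partition start time."""
    sums = defaultdict(float)
    for tuple_time, value in tuples:
        sums[(tuple_time // window) * window] += value
    return dict(sums)

def merge(partials):
    """Subsequent stage: combine partial sums using the partition time as the key."""
    total = defaultdict(float)
    for partial in partials:
        for start, s in partial.items():
            total[start] += s
    return dict(sorted(total.items()))

# Tuples as (tuple_time, value); arrival order within each node is arbitrary.
node_a = [(12, 1.0), (3, 2.0), (8, 3.0)]
node_b = [(5, 4.0), (17, 5.0)]

result = merge([partial_sums(node_a), partial_sums(node_b)])
print(result)
```

Because every partial sum carries the time of its partition, the subsequent stage obtains the same totals regardless of the order in which the tuples or the partial results arrive.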