The present invention relates to stream data generating methods, stream data generating devices, and recording media storing stream data generating programs, and more particularly to a stream data generating method, a stream data generating device, and a recording medium storing a stream data generating program for generating stream data in a stream data processing system.
In recent years, demand has been increasing for stream data processing systems, which receive a large quantity of continuously incoming data (stream data) and process the received data on a real-time basis. For example, in a financial application program that supports stock transactions, one of the most important objectives is to respond quickly to variations in stock prices. A prior-art database management system (DBMS), however, must first store the received stock data in storage before processing it. If the system has to handle larger quantities of stock data in the future, it may become difficult for it to respond to variations in stock prices or the like on a real-time basis.
Further, creating a separate application program for each task that processes such stream data on a real-time basis involves several problems: a prolonged development period, an increased development cost, and difficulty in responding quickly to changes in the business that uses the application. For this reason, a general-purpose stream data processing system has been demanded.
In a stream data processing system, a query (inquiry) is first registered in the system, and the query is then executed continually as stream data arrives. Because stream data arrives continuously, the system cannot wait until all the data has arrived before starting to process it. Further, the data arriving in the system must be processed in order of arrival, without being influenced by the data processing load.
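The register-then-execute model described above can be illustrated with a minimal sketch. The class name `StreamEngine` and its methods `register` and `push` are hypothetical names chosen for illustration and do not come from any system cited here; a query is reduced to a simple filter predicate.

```python
class StreamEngine:
    """Hypothetical engine: queries are registered first, then run on every arrival."""

    def __init__(self):
        self.queries = []  # (predicate, results) pairs, in registration order

    def register(self, predicate):
        """Register a query before any data arrives; returns its result list."""
        results = []
        self.queries.append((predicate, results))
        return results

    def push(self, item):
        """Process one arriving item immediately, so arrival order is preserved."""
        for predicate, results in self.queries:
            if predicate(item):
                results.append(item)


engine = StreamEngine()
high_prices = engine.register(lambda price: price > 100)
for price in [99, 101, 150, 80]:
    engine.push(price)
```

In this sketch, results accumulate as items arrive, rather than after a complete data set has been stored.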
A technique disclosed in R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu, M. Datar, G. Manku, C. Olston, J. Rosenstein, and R. Varma: “Query Processing, Resource Management, and Approximation in a Data Stream Management System”, In Proc. of the 2003 Conf. on Innovative Data Systems Research (CIDR), January 2003, introduces a concept called a sliding window (hereinafter referred to simply as a “window”). With a window, stream data is processed on a real-time basis by specifying either a width of time, such as the latest 10 minutes, or a width in number of streams, such as the latest 1,000 streams, and cutting out the part of the data stream that falls within that width.
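The two kinds of window width mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the cited technique: the class names `TimeWindow` and `CountWindow` are hypothetical, timestamps are assumed to be in seconds and to arrive in non-decreasing order, and a time window is assumed to cover the half-open interval (t - width, t].

```python
from collections import deque


class TimeWindow:
    """Keeps only items whose timestamp lies within the latest `width` seconds."""

    def __init__(self, width):
        self.width = width
        self.items = deque()  # (timestamp, value) pairs in arrival order

    def insert(self, timestamp, value):
        self.items.append((timestamp, value))
        # Evict items that have fallen out of the interval (timestamp - width, timestamp].
        while self.items and self.items[0][0] <= timestamp - self.width:
            self.items.popleft()

    def values(self):
        return [value for _, value in self.items]


class CountWindow:
    """Keeps only the latest `size` items."""

    def __init__(self, size):
        self.items = deque(maxlen=size)  # the oldest item is dropped automatically

    def insert(self, value):
        self.items.append(value)

    def values(self):
        return list(self.items)


recent = TimeWindow(width=600)  # latest 10 minutes
for ts, price in [(0, 100), (300, 101), (600, 102), (1000, 103)]:
    recent.insert(ts, price)

latest = CountWindow(size=3)  # latest 3 items
for price in [1, 2, 3, 4, 5]:
    latest.insert(price)
```

After the final insertion at timestamp 1000, the time window holds only the items stamped 600 and 1000, and the count window holds the last three values.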
The aforementioned reference also discloses CQL (Continuous Query Language), a language for describing data acquisition queries in which a window can be specified. CQL is an extension of SQL (Structured Query Language), which is widely used in DBMSs, that adds the ability to specify a window. Techniques utilizing CQL are disclosed, for example, in JP-A-2006-338432.
Since stream data arrives continuously and in large quantities, the stream data processing system, in some cases, cannot process all of it at once. To address this, JP-A-2008-83808 discloses a technique in which, when stream data is stored in a plurality of queues, the system acquires stream data on the basis of queue status information so as not to increase the load on the entire system. Further, a technique for avoiding a reduction in system processing capability by thinning out stream data in the course of processing it in a stream data processing system is disclosed in Emine Nesime Tatbul: “Load Shedding Techniques for Data Stream Management Systems”, Ph.D. dissertation, Brown University, May 2007, pp. 17-18, chap. 3.2.
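The idea of thinning stream data can be illustrated with a simple random-drop sketch. This is an assumption-laden illustration, not the method of the cited dissertation: the function names `drop_rate` and `shed` are hypothetical, the drop rate is derived only from an assumed arrival rate and processing capacity (both in items per second, capacity greater than zero), and each item is discarded independently at random.

```python
import random


def drop_rate(arrival_rate, capacity):
    """Fraction of items to discard so that the kept rate does not exceed capacity."""
    if arrival_rate <= capacity:
        return 0.0  # the system keeps up, so nothing is discarded
    return 1.0 - capacity / arrival_rate


def shed(items, rate, rng=None):
    """Discard each item independently with probability `rate` (random thinning)."""
    rng = rng or random.Random()
    return [item for item in items if rng.random() >= rate]


# Assumed figures: 2,000 items/s arrive, but only 1,000 items/s can be processed.
rate = drop_rate(arrival_rate=2000, capacity=1000)
kept = shed(list(range(10000)), rate, rng=random.Random(0))
```

With these assumed figures, roughly half of the items are discarded. Random thinning trades result accuracy for throughput, so it is suited only to queries that tolerate approximate answers.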