The present invention relates to a method of configuring a stream data processing system in which real-time stream data and external data cooperate with each other, and to a method of allowing the real-time data and the external data to cooperate with each other in the stream data processing system.
In contrast to a database management system (hereinafter referred to as “DBMS”), which processes data stored in a storage system, there is an increasing demand for a data processing system which performs real-time processing on data arriving from moment to moment. For example, in a stock brokerage system, how quickly the system responds to a stock price movement is one of its most important properties. A method such as that of a conventional DBMS, in which stock data is temporarily stored in a storage system and the stored data is then retrieved, cannot keep up with the speed of stock price movement, which may result in a missed business opportunity. For example, U.S. Pat. No. 5,495,600 discloses a mechanism for periodically executing a stored query. However, it is difficult to apply this mechanism to real-time data processing, in which the query needs to be executed as soon as data such as a stock price arrives.
Data arriving from moment to moment as described above is defined as stream data. As a data processing system suitable for the real-time processing of such stream data, a stream data processing system called STREAM, for example, is disclosed in “Query Processing, Resource Management, and Approximation in a Data Stream Management System” (Written by R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu, M. Datar, G. Manku, C. Olston, J. Rosenstein, and R. Varma, In Proc. of the 2003 Conf. on Innovative Data Systems Research (CIDR), January 2003).
In the stream data processing system, in contrast to the conventional DBMS, queries are first registered in the system, and the registered queries are then executed continuously as data arrives. In order to process the stream data efficiently, the system STREAM has introduced a notion called a sliding window, which cuts out a part of the stream data. A suitable example of a query description language that can designate the sliding window is the continuous query language (CQL) disclosed in “Query Processing, Resource Management, and Approximation in a Data Stream Management System” (cited above). The CQL is an extension of the structured query language (SQL), which is widely used in the DBMS; the sliding window is designated by using parentheses after a stream name in the FROM clause of the SQL. As the SQL, the SQL disclosed in “A Guide to SQL Standard (4th Edition)” (Written by C. J. Date and Hugh Darwen, Addison-Wesley Professional; 4th edition (Nov. 8, 1996), ISBN: 0201964260) is known. There are two representative methods of designating the sliding window: (1) a method of designating the number of data rows to be cut out, and (2) a method of designating the time interval of the data rows to be cut out. For example, “Rows 50 Preceding” described in the second section of “Query Processing, Resource Management, and Approximation in a Data Stream Management System” (cited above) is a suitable example of method (1), in which fifty rows of data are cut out as a target to be processed, whereas “Range 15 Minutes Preceding” is a suitable example of method (2), in which fifteen minutes of data are cut out as a target to be processed. The stream data cut out by the sliding window is retained in a memory to be used for the query processing.
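By way of illustration only, the two methods of designating the sliding window and the continuous execution of a registered query may be sketched as follows. The sketch is written in Python and is not the implementation of the system STREAM or of the CQL; the names RowWindow, RangeWindow, run_continuous_query, and average_price are hypothetical. It is assumed that each stream datum is a pair of a time stamp in seconds and a value, that the data arrives in order of the time stamp, and that a datum lying exactly on the boundary of the time interval is retained.

```python
from collections import deque


class RowWindow:
    """Row-count window in the spirit of "Rows N Preceding" (hypothetical)."""

    def __init__(self, rows):
        # A deque with maxlen discards the oldest datum automatically
        # once the designated number of rows is exceeded.
        self._buf = deque(maxlen=rows)

    def insert(self, timestamp, value):
        self._buf.append((timestamp, value))

    def contents(self):
        return list(self._buf)


class RangeWindow:
    """Time-interval window in the spirit of "Range T Preceding" (hypothetical)."""

    def __init__(self, seconds):
        self._seconds = seconds
        self._buf = deque()

    def insert(self, timestamp, value):
        self._buf.append((timestamp, value))
        # Assumption: data older than (newest time stamp - interval) is
        # discarded, and a datum exactly on the boundary is retained.
        cutoff = timestamp - self._seconds
        while self._buf[0][0] < cutoff:
            self._buf.popleft()

    def contents(self):
        return list(self._buf)


def average_price(rows):
    """Example of a registered query: the mean of the values in the window."""
    return sum(value for _, value in rows) / len(rows)


def run_continuous_query(stream, window, query):
    """Execute the registered query each time a datum arrives."""
    results = []
    for timestamp, value in stream:
        window.insert(timestamp, value)          # retain the datum in memory
        results.append(query(window.contents()))  # re-evaluate on arrival
    return results


# Hypothetical stock prices as (time stamp in seconds, price) pairs.
prices = [(0, 10.0), (60, 20.0), (120, 30.0), (1000, 40.0)]
by_rows = run_continuous_query(prices, RowWindow(2), average_price)
by_range = run_continuous_query(prices, RangeWindow(15 * 60), average_price)
```

In this sketch, the query is registered before any data arrives and is re-evaluated on each arrival, and only the data cut out by the window is retained in the memory; a practical system would typically update the result incrementally rather than recompute it over the whole window.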
The stream data processing system is expected to be used for applications requiring real-time processing, representative examples of which are financial applications, traffic information systems, and computer system management.