1. Field of Invention
The invention generally relates to a method for processing multiple continuous Top-K queries.
2. Description of Prior Art
Monitoring applications have recently become an active research topic in many fields. Most data stream applications, such as wireless sensor networks or network utilization analysis, must be handled by continuous queries that report results as they change. In such applications, transmitting a large amount of stream data to a centralized processing system is impractical, as is an extremely long response time. In the past, most algorithms concentrated on one-time queries; they are not suited to handling multiple monitoring tasks simultaneously because they cannot detect when a result changes. These algorithms can, of course, simulate monitoring by being run repeatedly. However, whenever the result has not changed, the repeated runs are wasted. Even the algorithm that has been proposed for processing continuous queries still suffers from a lack of information sharing and of a mechanism for reducing heavy load. To solve these problems, the present invention discloses a method for processing multiple continuous Top-K queries.
Initially, the system has no information to share. When the system is requested to process a query, it reports the current result by using the Fagin algorithm and establishes a Ranked List Table (RLT). Subsequently, if any server in the system detects that the RLT has changed, the server uses one of three operations to restore the accuracy of the RLT. Such a case does not happen frequently, although several predetermined thresholds may occasionally be exceeded. Once the RLT holds shared information, the system can process any new query by using the RLT information alone or by combining the RLT with the Fagin algorithm. Finally, the system must report accurate results continuously. When any server in the system finds that the result may change, the system again applies an access-ordering scheme between the RLT and the servers in order to reduce data transmission and achieve a faster response.
The three algorithms proposed in documents 1 to 3 all focus on the one-time Top-K query. When these algorithms are used to process a continuous Top-K query, the system can only run them repeatedly to obtain a new Top-K result, because they provide no mechanism for monitoring changes in the Top-K result. Running these algorithms repeatedly wastes system resources and increases the system load. In particular, when the Top-K result has not changed, the repeated runs are entirely redundant, so running them frequently harms query performance. Conversely, if the algorithms are run less frequently, the system cannot respond in time when the Top-K result changes. Therefore, for the continuous Top-K query, the above-described algorithms have significant disadvantages.
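For illustration only, the one-time Top-K query of document 1 may be sketched as follows. This is a minimal, centralized sketch of the Fagin algorithm under stated assumptions: each server is modeled as an in-memory mapping from object identifier to score, every object has a score on every server, and the aggregation function is a monotone sum. The names `fagin_top_k`, `sources` and `aggregate` are hypothetical and do not appear in the cited documents.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def fagin_top_k(
    sources: List[Dict[str, int]],
    k: int,
    aggregate: Callable[[Iterable[int]], int] = sum,
) -> Tuple[List[Tuple[str, int]], int]:
    """One-time Top-K query (sketch of the Fagin algorithm).

    Assumption: `sources` stands in for the servers; each maps an object
    identifier to its local score, and `aggregate` is monotone.
    Returns the Top-K (object, total score) pairs and the sorted-access
    depth that was reached on each server.
    """
    m = len(sources)
    # Each server exposes its objects in descending score order (sorted access).
    ranked = [
        sorted(src.items(), key=lambda kv: (-kv[1], kv[0])) for src in sources
    ]
    max_depth = max((len(r) for r in ranked), default=0)

    seen_in: Dict[str, set] = {}  # object -> indices of servers where it was seen
    fully_seen = 0                # objects seen under sorted access on all servers
    depth = 0

    # Phase 1: sorted access in parallel until k objects were seen on every server.
    while fully_seen < k and depth < max_depth:
        for i, r in enumerate(ranked):
            if depth < len(r):
                obj = r[depth][0]
                servers = seen_in.setdefault(obj, set())
                servers.add(i)
                if len(servers) == m:
                    fully_seen += 1
        depth += 1

    # Phase 2: random access to fetch the missing scores of every seen object.
    totals = {
        obj: aggregate(src.get(obj, 0) for src in sources) for obj in seen_in
    }

    # Phase 3: rank the seen objects by total score and keep the best k.
    top = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return top, depth


if __name__ == "__main__":
    server_1 = {"a": 9, "b": 8, "c": 1, "d": 3}
    server_2 = {"a": 2, "b": 7, "c": 9, "d": 4}
    print(fagin_top_k([server_1, server_2], k=2))
```

In this sketch, calling `fagin_top_k` a second time on unchanged data repeats every sorted and random access, which corresponds to the redundancy of repeated one-time queries described above.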
Document 4 proposes a mechanism for a single continuous Top-K query. Because this method focuses on a single query, when the system receives a plurality of Top-K queries, all of the relevant steps must be repeated for each query, whether the requests arrive simultaneously or one after another. The method does not teach any information-sharing scheme for processing data in advance, before the processing of a continuous query starts. In addition, when the method is used to process a continuous Top-K query, the system must receive data from all associated nodes (i.e., servers) that hold the relevant data. For a distributed network system, a method that needs to receive all of the related data is not practical.
[Reference Documents]
    1. R. Fagin, “Combining fuzzy information from multiple systems”, in J. Comput. System Sci., 58:83-99, 1999.
    2. R. Fagin, A. Lotem, and M. Naor, “Optimal aggregation algorithms for middleware”, in Symposium on Principles of Database Systems, 2001.
    3. P. Cao, Z. Wang, “Efficient Top-K query calculation in distributed networks”, in PODC, 2004.
    4. B. Babcock, C. Olston, “Distributed Top-K monitoring”, in PODC, 2003.
In view of the above-described problems, the inventor has devised a complete mechanism and algorithm for handling multiple Top-K queries, by which data transmission and response time can be greatly improved.