1. Field
Exemplary embodiments of the present invention relate to a method and an apparatus for optimally processing N sort queries in a multi-range scan. The technologies disclosed herein are applicable not only to a database management system (hereinafter referred to as “DBMS”) but also to a wide variety of data storage systems that provide a range scan function with an index.
2. Discussion of the Background
With the development of the Internet, various social networking services (SNS) using the Internet have drawn attention. The SNS collectively refers to services that strengthen online relations among acquaintances and allow new relations to be formed with previously unknown persons. Various types of SNSs with unique characteristics, such as CYWORLD™ of SK communications in Korea, FACEBOOK™ in the U.S.A., and the like, have been continually developed and offered.
As one area of the SNS, the Microblog has recently come to be used by many users. The “Microblog,” also referred to as a “miniblog,” is a type of blog that uses short messages of one or two sentences to communicate with many people. The Microblog allows information to be updated in real time, since users communicate with one another through short messages and can post pictures, videos, or the like. That is, the Microblog may be regarded as a blog coupled with a messenger, so that users feel as if they are using a chat program. Further, since users express the trivial details of their everyday lives, such as passing thoughts and feelings, their emotions, and their own news, in short messages and share them with one another, they can conveniently use the Microblog without the burden of writing or reading long sentences. As a result, the Microblog has become very popular. Representative examples of the Microblog include TWITTER™, ME2DAY™ in the Republic of Korea, and the like.
The SNS, and the Microblog in particular, updates in almost real time the news exchanged among many users. Queries are frequently used that extract only a portion of the latest information from the information communicated by a user or by other users who have entered into a relation with the user (hereinafter referred to as “friends,” who may number from several people to tens of thousands of people), and that display the extracted information to the user or his/her friends. For example, queries are frequently used that extract a predetermined number N (N being a natural number) of the messages most recently created by friends, or N messages created after a specific point of time. Such queries need to be processed as a multi-range scan, which repeats, for every friend, the operation of range-scanning only the messages created by that friend after the specific point of time. The query processing then needs to extract, in sorted order, only the N most recently created messages, or the N messages created after a certain time, from among the friends' messages accessed through the multi-range scan (hereinafter, the queries used for this extraction are referred to as ‘N sort queries’). However, most data storage systems according to the related art, including the DBMS, do not consider optimal processing of the N sort query in the multi-range scan that is mainly used in the SNS or the like. For example, a conventional DBMS extracts all of the friends' messages accessed through the multi-range scan as interim results and then sorts these messages in reverse order of creation time. Therefore, the processing speed may be very slow, and a huge storage space may be needed to store the interim results.
Accordingly, the queries frequently performed in the SNS or the like may not be processed efficiently by the functions of the conventional DBMS alone.
The processing scheme according to the related art may sharply increase the total number of records to be scanned by the DBMS as the number of messages created by each user increases and as the number of friends increases. In this case, a considerable amount of memory space is needed to store the intermediate record set required for sorting, and the burden of sorting a large number of records grows, which increases the time and space consumed to process queries.
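By way of illustration only, the related-art processing described above may be sketched as follows. The in-memory `index` mapping, the `(created_at, author, text)` record layout, and the function name `naive_n_sort` are assumptions made for this sketch and are not taken from any particular DBMS.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: (created_at, author, text).
Message = Tuple[int, str, str]


def naive_n_sort(
    index: Dict[str, List[Message]],
    friends: List[str],
    since: int,
    n: int,
) -> Tuple[List[Message], int]:
    """Related-art style processing: collect every match, then sort.

    Returns the N newest messages and the size of the interim result set.
    """
    interim: List[Message] = []
    # Multi-range scan: one range scan per friend.
    for friend in friends:
        # The filter stands in for an index range scan (created_at > since).
        for msg in index.get(friend, []):
            if msg[0] > since:
                interim.append(msg)
    # Every scanned record is held in memory and sorted, newest first.
    interim.sort(key=lambda m: m[0], reverse=True)
    return interim[:n], len(interim)
```

In this sketch the interim result set grows with the number of friends multiplied by the number of qualifying messages per friend, even though only N records are finally returned.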
Therefore, when the N sort queries in the multi-range scan frequently used in the SNS or the like are received, a need exists for a scheme that performs the corresponding query processing using only a storage space of limited size while minimizing the number of friends' messages to be scanned, that is, a scheme that processes the query optimally in terms of temporal and spatial costs.
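To make the two requirements concrete, the following is a minimal sketch of one generally known technique, a lazy k-way merge that stops after N records. It is offered only as an illustration of bounded storage and reduced scanning and is not asserted to be the scheme of the exemplary embodiments. The `index` mapping, the `(created_at, author, text)` record layout, and the function names are assumptions of the sketch.

```python
import heapq
from itertools import islice, takewhile
from typing import Dict, Iterator, List, Tuple

# Hypothetical record layout: (created_at, author, text).
Message = Tuple[int, str, str]


def newest_first(msgs: List[Message], since: int) -> Iterator[Message]:
    """Lazily walk one friend's range from newest to oldest.

    Assumes msgs is sorted by created_at in ascending order, as an index
    on creation time would provide.
    """
    return takewhile(lambda m: m[0] > since, reversed(msgs))


def merged_n_sort(
    index: Dict[str, List[Message]],
    friends: List[str],
    since: int,
    n: int,
) -> List[Message]:
    """Return the N newest messages without materializing all matches."""
    streams = [newest_first(index.get(f, []), since) for f in friends]
    # heapq.merge buffers one record per stream and pulls more only on demand.
    merged = heapq.merge(*streams, key=lambda m: m[0], reverse=True)
    # Stop reading as soon as N records have been produced.
    return list(islice(merged, n))
```

Under these assumptions the working set is about one record per friend, and the number of records read is roughly the number of friends plus N rather than every qualifying message.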
Further, such a query processing scheme is required not only in the existing DBMS but also in a high-rate data repository that provides a range scan function at a front stage of the DBMS, that is, a high-rate data repository that stores and manages data only in memory while providing a range scan with an index over a collection of data. For example, the high-rate data repository belongs to the NoSQL databases that have been widely discussed in the recent database field, that is, a type of database that focuses on processing performance or system scalability while providing a new interface, rather than a DBMS that provides the query processing function through an SQL interface.
The above information disclosed in this Background section is only for enhancement of understanding of the background of the invention and therefore it may contain information that does not form any part of the prior art nor what the prior art may suggest to a person of ordinary skill in the art.