Recently, with the active development of various sensing devices, smart sensing networking technologies, and the like, the amount of information generated in a ubiquitous computing environment has become massive, and, unlike in earlier simple computing devices, huge amounts of information are continuously generated at a very high speed. Various types of information are likewise generated in large volumes in fields such as commercial information analysis, Web information analysis, and call record log analysis. In general, such a massive collection of data is called a data stream, which is an unbounded data set that is progressively generated over time. Because of this change in the way information is generated, simply applying an existing data analysis technology designed for a bounded data set to sensor data processing, or the like, in a ubiquitous computing environment has limitations. Thus, the interest of database research groups has recently moved to various techniques for effectively processing information in the form of a data stream, and various methods for searching for information included in a data stream and for processing a continuous query with respect to a data stream have been actively proposed.
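As an illustration only, the difference between a bounded data set and a data stream can be sketched as follows. The function name sliding_average and the window parameter are hypothetical and are not taken from any particular system; the sketch assumes numeric sensor readings and a continuous query that reports the mean of the most recent readings each time a new reading arrives, without ever requiring the whole stream to be stored.

```python
from collections import deque
from typing import Iterable, Iterator


def sliding_average(stream: Iterable[float], window: int) -> Iterator[float]:
    """Hypothetical continuous query: mean of the last `window` readings.

    One result is emitted per arriving reading, so the query keeps
    producing output for as long as the (possibly unbounded) stream does.
    """
    recent = deque(maxlen=window)  # holds at most `window` readings
    total = 0.0
    for reading in stream:
        if len(recent) == window:
            total -= recent[0]  # oldest reading is about to be evicted
        recent.append(reading)
        total += reading
        yield total / len(recent)


if __name__ == "__main__":
    from itertools import count, islice

    # count() never terminates, standing in for an unbounded sensor feed;
    # islice takes only the first five results for display.
    readings = (float(n) for n in count(1))
    print(list(islice(sliding_average(readings, window=3), 5)))
```

Because memory use is bounded by the window size rather than by the length of the input, the same function works on a finite list and on a source that never ends, which an analysis requiring the complete data set in advance cannot do.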
In particular, research into big data platform technology for processing big data has been extensively conducted. Among big data platforms, the open-source Hadoop platform is typical. Hadoop MapReduce provides a highly scalable distributed data processing function; however, because it supports only a low-level interface, Hadoop MapReduce is difficult to program. In order to solve this problem, Hadoop Pig, a data flow processing system providing a high-level interface, has been developed. Hadoop Pig provides a high-level data flow language called Pig Latin, offering users convenience and productivity in developing big data processing services.
However, the Hadoop platform is a batch-based system, so it has a limitation in processing big data in real time. Thus, a highly scalable real-time big data processing technique that supports a high-level data flow language interface while processing a rapidly growing data stream in real time is required.