1. Technical Field
The present disclosure relates to data stream processing, and more particularly to data processing using parallel elastic operators.
2. Discussion of Related Art
As the world becomes more interconnected and instrumented, there is a deluge of data coming from various software and hardware sensors in the form of continuous streams. Examples can be found in several domains, such as financial markets, telecommunications, surveillance, manufacturing, and healthcare. In all of these domains, there is an increasing need to gather, process, and analyze these data streams to extract insights as well as to detect emerging patterns and outliers. More importantly, this analysis often needs to be performed in near real-time.
Stream computing is a computational paradigm that enables analytical tasks to be carried out in an efficient and scalable manner. By passing incoming data streams through a network of operators placed on a set of distributed hosts, stream computing provides an on-the-fly model of processing. The frequent need to handle large volumes of live data in short periods of time is a major characteristic of stream processing applications. Thus, supporting high-throughput processing is an important requirement for streaming systems, and meeting it requires taking advantage of multiple host machines to achieve scalability. This requirement will become even more prominent with the ever-increasing amounts of live data available for processing. The increased affordability of distributed and parallel computing, thanks to advances in cloud computing and multi-core chip design, has made this problem tractable. However, doing so requires language-level and system-level techniques that can effectively locate and efficiently exploit parallelization opportunities in stream processing applications.