Various embodiments of this disclosure relate to data stream processing and, more particularly, to executing data stream processing applications in dynamic network environments.
A data stream includes a continuous sequence of numerous data units. Data stream applications are applications that can be applied to data streams to process those data units. Data stream processing systems apply data stream applications to data streams and are used in a wide range of fields in which such processing is required, such as finance, social network applications, smart city applications, sensor network applications, and telecommunications.
Data stream applications can be expressed as application graphs that receive as input one or more original data streams and output one or more sink data streams. The vertices of each graph correspond to operators of a data stream application, where each operator performs a function on a data stream being processed.
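By way of a non-limiting illustration, an application graph of this kind can be sketched in a few lines of Python. The class name ApplicationGraph, its methods, and the operator and stream names ("readings", "filter", "scale") are hypothetical and chosen only for this example; each stream is modeled as a finite list of data units so that the sketch can run to completion, whereas an actual data stream is continuous.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class ApplicationGraph:
    """Hypothetical sketch: vertices are operators, edges are data streams."""

    def __init__(self):
        self.operators = {}                # operator name -> function on a list of data units
        self.upstream = defaultdict(list)  # operator name -> names of its input streams

    def add_operator(self, name, fn, inputs=()):
        self.operators[name] = fn
        self.upstream[name] = list(inputs)

    def run(self, sources):
        """Process the original streams in `sources` and return the sink streams."""
        results = dict(sources)
        deps = {name: self.upstream[name] for name in self.operators}
        # Visit operators so that every input is computed before it is consumed.
        for name in TopologicalSorter(deps).static_order():
            if name in results:
                continue  # an original (source) stream, already available
            merged = [u for up in self.upstream[name] for u in results[up]]
            results[name] = self.operators[name](merged)
        # Sink streams are outputs of operators that no other operator consumes.
        consumed = {up for ups in self.upstream.values() for up in ups}
        return {n: results[n] for n in self.operators if n not in consumed}


graph = ApplicationGraph()
graph.add_operator("filter", lambda units: [u for u in units if u >= 0], inputs=["readings"])
graph.add_operator("scale", lambda units: [u * 2 for u in units], inputs=["filter"])
out = graph.run({"readings": [3, -1, 4]})
print(out)  # {'scale': [6, 8]}
```

In this sketch, "readings" is the original data stream, "filter" and "scale" are operators, and the output of "scale" is the single sink data stream.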
Existing data stream processing systems operate inside datacenters, where all incoming data streams are processed by stream applications deployed in a fixed datacenter cluster. Typically, a stream application is first deployed in a datacenter cluster by mapping the operators of the stream application graph to the datacenter's computation resources. Because a datacenter is a centralized environment, it is possible to perform such mappings efficiently using existing sophisticated deployment stream optimization approaches. After deployment, the stream application executes in the datacenter by processing incoming original data streams and outputting sink data streams.
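As a simplified, non-limiting illustration of mapping operators to computation resources under centralized control, the following Python sketch assigns each operator to the node with the most remaining capacity. The function name place_operators, the operator costs, and the node capacities are assumptions made for this example only; the greedy rule shown stands in for, and is far simpler than, the deployment optimization approaches referred to above.

```python
def place_operators(operator_costs, node_capacities):
    """Greedy placement sketch (hypothetical): heaviest operators first,
    each onto the node that currently has the most spare capacity."""
    remaining = dict(node_capacities)  # centralized view of every node's free capacity
    placement = {}
    for op, cost in sorted(operator_costs.items(), key=lambda kv: -kv[1]):
        node = max(remaining, key=remaining.get)
        if remaining[node] < cost:
            raise ValueError(f"no node can host operator {op!r}")
        placement[op] = node
        remaining[node] -= cost
    return placement


# Assumed example values: abstract cost units and two identical nodes.
placement = place_operators(
    {"parse": 2, "filter": 1, "aggregate": 3},
    {"node-a": 4, "node-b": 4},
)
print(placement)  # {'aggregate': 'node-a', 'parse': 'node-b', 'filter': 'node-b'}
```

The sketch relies on a single controller knowing the free capacity of every node, which is the kind of global knowledge a centralized datacenter environment makes available.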
Both deployment and runtime stream optimization mechanisms in datacenter clusters typically use centralized control and implicitly or explicitly assume a homogeneous processing and communication environment.