1. Technical Field
The present invention relates to resource allocation in distributed stream processing, and more particularly to systems and methods for more efficiently allocating system resources in distributed networks or systems.
2. Description of the Related Art
The rapid development of computer technology has enabled streaming applications to emerge from many areas of the information technology (IT) industry. Examples of the broad application areas include network traffic monitoring, real-time financial data analysis, environmental sensing, and surveillance. In these applications, multiple data streams are generated by a large number of distributed sources, such as news and financial data feeds, monitoring sensors, and surveillance video/audio sources. To provide valuable services to users, these continuous streams of data need to pass through a series of sophisticated processing operations, such as filtering, aggregation, and translation.
Large distributed computer systems are often built to provide scalable processing services for large amounts of continuous data. The distributed stream processing systems, together with the data sources and other shared resources, such as Internet hosts and edge servers, are connected by a network and collectively provide a rich set of processing services. The requirements to process large volumes of real-time data at high rates, and to effectively coordinate multiple servers in a distributed environment, pose many significant design challenges for efficient stream processing systems.
A fundamental problem in such stream-based processing networks is how to best utilize the available resources and coordinate the servers seamlessly so that the overall system performance is optimized.
In most stream processing systems, applications often run in a decentralized, distributed environment. At any given time, no server has global information about all the servers in the system. It is thus difficult to determine the best resource allocation policy at each server in isolation, such that the overall throughput of the whole system is optimized. In addition to reaching globally optimal performance, the system needs to be able to adapt to a dynamically changing environment, including fluctuations in input and resource consumption. The system needs to coordinate the processing, communication, storage/buffering, and input/output of neighboring servers to meet these challenging requirements.
Much work related to stream processing has addressed data models, operators, and efficient processing. Although resource management has been recognized as important to performance, most work has focused on either heuristics for avoiding overload or simple schemes for load-shedding. The problem of dynamic load distribution and resource allocation has not yet been fully studied.