Recent advances in networking and information technology have boosted the development of new and advanced services offered over communication systems that integrate a widely heterogeneous mix of applications and computer devices. Without careful traffic control and resource management, the dramatic increase in demand for networking resources and remote application services may lead to substantial degradation of the Quality of Service (“QoS”) as experienced by end users.
For example, as a result of rapid advances in computer technology and wireless communications, a new set of streaming applications has flourished in fields such as financial analysis, system diagnosis, environmental monitoring, and mobile services. These applications typically involve filtering, aggregating and processing high-volume, real-time, continuous data across a large number of interconnected devices. Distributed data management has emerged as an appealing solution for these applications. In recent years, a number of distributed Data Stream Management Systems (DSMSs) have been developed; see, for example, Borealis [1], Medusa [11], GATES [10], IrisNet [15] and SPC [16].
Most queries in these DSMSs are persistent queries that continuously output results as they are produced. The rates at which data arrives can be bursty and unpredictable. Consider, for example, a disaster sense-and-respond system that monitors and detects certain disaster events. When such events happen, the data rates can increase dramatically, and it is important that relevant data be delivered and processed in a timely fashion. In this example, the relative importance of output data can be used for QoS specification. Such QoS can be measured in terms of throughput, delay, or general utility functions of these metrics. Different users and applications may specify their QoS requirements differently, and the system must always try to maximize the total delivered QoS [1]. Because the arrival process is unpredictable and bursty, the admitted load can exceed the system capacity during times of stress. Even when the system is not stressed, in the absence of any type of control, the various streams are likely to cause congestion and collisions as they traverse interfering paths from the plurality of sources to the sinks. The system must therefore employ effective load shedding and resource control mechanisms so as to optimize the operating environment. In general terms, load shedding is a form of admission control in which excess load is dropped so that input streams can be processed within QoS requirements. Inside the stream processing system, the resources that require intelligent management and control include storage, processor cycles and communication bandwidth.
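By way of illustration only, the load shedding notion described above can be sketched for a single node as follows. The sketch assumes that each stream carries a scalar importance value and that the node has a single aggregate capacity; the function name `shed_load`, the stream names, and the importance-first policy are hypothetical and are not taken from any of the cited systems.

```python
def shed_load(arrival_rates, importance, capacity):
    """Admit load in decreasing order of importance until capacity is used up.

    arrival_rates: dict mapping stream name -> offered rate (e.g. tuples/sec)
    importance:    dict mapping stream name -> relative importance (assumed scalar)
    capacity:      total rate the node can process (assumed single resource)

    Returns a dict mapping stream name -> admitted rate; the remainder is shed.
    """
    admitted = {stream: 0 for stream in arrival_rates}
    remaining = capacity
    # Visit the most important streams first, so that any excess load is
    # dropped from the least important streams.
    for stream in sorted(arrival_rates, key=lambda s: importance[s], reverse=True):
        take = min(arrival_rates[stream], remaining)
        admitted[stream] = take
        remaining -= take
    return admitted


if __name__ == "__main__":
    # Hypothetical burst: the offered load (170) exceeds the capacity (100).
    rates = {"alerts": 40, "telemetry": 80, "logs": 50}
    weights = {"alerts": 3, "telemetry": 2, "logs": 1}
    print(shed_load(rates, weights, capacity=100))
```

In this hypothetical burst, the most important stream is admitted in full, the next is admitted in part, and the least important is shed entirely. A practical system would also have to account for multiple resource types and for downstream nodes, which this single-node sketch ignores.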
Accordingly, the need for improved stream processing methods and apparatus is becoming increasingly apparent with the proliferation of applications that require sophisticated processing of data generated or stored by large numbers of distributed sources (such as data streams generated from sensor networks, financial feeds, traffic monitoring centers or other real-time enterprises). In such applications, continuous flows of data are brought into the stream processing environment in the form of streams. Various processing units are instantiated to analyze the data, potentially annotating the data, transforming the data, or synthesizing new data for further processing, and to publish the data to output streams or storage. Such processing and analyses are required to be performed on the fly, often with little or no tolerance for delay, in order to enable real-time responses. The requirements to process, store, maintain and retrieve large volumes of mostly real-time (continuous/streaming) data at a high rate pose great design challenges for efficient stream processing systems.
Resource allocation problems encountered in stream processing systems have been considered heretofore without satisfactory resolution. Multiple data streams flow into the stream processing system to be processed and eventually lead to valuable output. Examples of such processing include matching, aggregation, summarization, etc. Each stream requires a certain amount of resources from the nodes in order to be processed. The nodes need to decide how much flow to admit into the system. The overall objective is to maximize a system utility function, which is a concave function of the processed flow rates.
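One way to write down the resource allocation problem described above, offered only as an illustrative sketch, is as a concave maximization over the admitted flow rates. All symbols are assumptions introduced here for illustration: x_j is the admitted rate of stream j, λ_j is its offered arrival rate, U_j is a concave utility function, c_ij is the amount of resource consumed at node i per unit of flow of stream j, and R_i is the resource capacity of node i. The additive form of the objective is likewise an assumption; the description above requires only that the system utility be concave in the processed flow rates.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{j=1}^{J} U_j(x_j) \\
\text{subject to}\quad & \sum_{j=1}^{J} c_{ij}\, x_j \le R_i, \qquad i = 1, \dots, N, \\
& 0 \le x_j \le \lambda_j, \qquad j = 1, \dots, J.
\end{aligned}
```

Under these assumptions the feasible set is a convex polytope and the objective is concave, so any local maximum is also a global maximum. This simplified form does not model changes in flow rate caused by the processing itself.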
As the physical network can be large and distributed, it is difficult and unrealistic to look for a centralized solution. As stream processing systems grow larger in size, applications often run in a decentralized, distributed environment. At any given time, no one entity has global information about all of the nodes in the system. The actions of one node may inadvertently degrade the performance of the overall system, even if each node greedily optimizes its own performance. It is thus difficult to determine, at each node in isolation, the control mechanism that best optimizes the overall system performance. In addition, the system must adapt to dynamic changes in network conditions as well as to fluctuations in input and resource consumption. The system needs to coordinate processing, communication, storage/buffering, and the input/output of neighboring nodes to meet these challenging requirements. Dynamically choosing when, where and how much load to shed, and coordinating the resource allocation accordingly, is therefore a challenging problem.
As a result, those skilled in the art seek improved methods and apparatus for controlling stream processing networks. In particular, those skilled in the art seek methods and apparatus that overcome the limitations of current centralized stream processing control methods. For example, those skilled in the art seek methods and apparatus for controlling load shedding and resource allocation in stream processing networks that can operate without centralized control. It is not enough merely to control load shedding and resource allocation in other than a centralized manner. Those skilled in the art seek methods and apparatus that achieve near-optimal or optimal load shedding and resource allocation decisions with reasonable convergence behavior.