The present invention relates to processing of data units (tuples) for an application using a plurality of processing units.
As embedded computing grows ubiquitous, each embedded object gains the capacity for processing and communicating streams of data. The architecture of current database management systems assumes a pull-based model of data access: when a user (the active party) wants data, she submits a query to the system (the passive party) and an answer is returned. In contrast, in stream-based applications data is pushed to a system that must evaluate queries in response to detected events. Query answers are then pushed to a waiting user or application.
Many stream-based applications are naturally distributed. Applications are often embedded in an environment with numerous connected computing devices with heterogeneous capabilities. As data travels from its point of origin (e.g., sensors) downstream to applications, it passes through many computing devices, each of which is a potential target of computation. Thus, distributed computation is the norm. Applications are emerging in which data, generated in some external environment, is pushed asynchronously to servers that process this information. Example applications include sensor networks, location-tracking services, fabrication line management, and network management. These applications are characterized by the need to process high-volume data streams in a timely and responsive fashion. A unit of data of a streaming or continuous nature is called a “tuple”. Examples of tuples include sensor data such as temperature readings, stock ticks, and media data such as audio slices and video slices. A unit of processing applied to a tuple is called an “operator”. Examples of operators include arithmetic calculations and relational join operations, among others.
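The relationship between tuples and operators described above can be illustrated with a minimal sketch. The names `Reading`, `calibrate`, `only_hot`, and `run_pipeline`, as well as the offset and threshold values, are hypothetical and chosen only for illustration; they do not come from any particular system.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Sequence


@dataclass(frozen=True)
class Reading:
    """A hypothetical tuple: one temperature sample from one sensor."""
    sensor_id: str
    celsius: float


# An operator maps one tuple to one output tuple, or to None to drop it.
Operator = Callable[[Reading], Optional[Reading]]


def calibrate(r: Reading) -> Optional[Reading]:
    """Arithmetic operator: add an assumed fixed offset of 0.5 degrees."""
    return Reading(r.sensor_id, r.celsius + 0.5)


def only_hot(r: Reading) -> Optional[Reading]:
    """Selection operator: keep readings at or above an assumed 30 degrees."""
    return r if r.celsius >= 30.0 else None


def run_pipeline(stream: Iterable[Reading],
                 operators: Sequence[Operator]) -> Iterator[Reading]:
    """Push each arriving tuple through the operators in the given order."""
    for item in stream:
        current: Optional[Reading] = item
        for op in operators:
            current = op(current)
            if current is None:  # an operator dropped the tuple
                break
        if current is not None:
            yield current


stream = [Reading("s1", 29.6), Reading("s2", 25.0), Reading("s1", 31.0)]
results = list(run_pipeline(stream, [calibrate, only_hot]))
```

In this sketch all operators run in one process; in the distributed setting discussed here, each operator could instead be placed on a different processing unit.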
The computer system for such an application typically includes processing units and network channels that connect them to form a network, with the operators for the application allocated across the processing units. Tuples arrive at a processing unit continuously, and the processing unit applies its assigned operators to them. To fulfill the application requirement, the processing time should not vary beyond what the application tolerates; for example, it should remain within a specified time budget. However, the processing time of a processing unit can change for various reasons, for example when the resource availability of the processing unit, the contents of the tuples, or the tuple arrival rate changes. Thus, the selection of which processing units perform which operators, and to which tuples they apply those operators, must be made carefully to fulfill the application requirement.
One solution to processing unit selection involves moving a virtual machine containing a set of operators to another processing unit with a lighter load. The set of operators for a tuple is performed by multiple virtual machines running on multiple computers (processing units). A virtual machine containing operators can migrate to another processing unit when the load of its current processing unit is high. In this way, the technique can balance the loads among processing units. However, the technique cannot change the tuples to which a processing unit applies an operator. The balancing is therefore coarse-grained, and its response time is long because operation is often stopped while operators are being moved. Thus, significant overhead can be incurred when balancing loads.
Another solution allows a run-time scheduler to move an operator to another processing unit with a lighter load. The set of operators for a tuple is performed by multiple processing units. A central scheduler allocates operators to the processing units and can move an operator to a different processing unit at run-time to balance the loads among them. However, this approach also cannot change the tuples to which a processing unit applies an operator. The balancing is therefore coarse-grained, and its response time is long because processing often must be stopped while operators are being moved. The technique does not prepare, before processing begins, the code for operators that may later be moved to a processing unit, so it incurs a large overhead when balancing loads. The technique also cannot move an operator when the move would change the order in which operators are applied.
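The operator-migration approach can be sketched as follows. This is a simplified illustration under stated assumptions, not the scheduler of any specific prior system: the function names, the per-operator cost estimates, and the selection heuristic (move the operator whose cost is closest to half the load gap) are all hypothetical. Costs are assumed to be positive.

```python
from typing import Dict, Sequence


def unit_loads(placement: Dict[str, str],
               op_cost: Dict[str, float],
               units: Sequence[str]) -> Dict[str, float]:
    """Sum the estimated cost of the operators placed on each unit."""
    load = {u: 0.0 for u in units}
    for op, unit in placement.items():
        load[unit] += op_cost[op]
    return load


def rebalance_once(placement: Dict[str, str],
                   op_cost: Dict[str, float],
                   units: Sequence[str]) -> Dict[str, str]:
    """Move at most one whole operator from the busiest unit to the idlest."""
    load = unit_loads(placement, op_cost, units)
    busiest = max(units, key=lambda u: load[u])
    idlest = min(units, key=lambda u: load[u])
    gap = load[busiest] - load[idlest]
    # Only operators cheaper than the gap can narrow it when moved.
    movable = [op for op, u in placement.items()
               if u == busiest and op_cost[op] < gap]
    if not movable:
        return dict(placement)
    # Assumed heuristic: prefer the operator closest to half the gap.
    chosen = min(movable, key=lambda op: abs(op_cost[op] - gap / 2))
    new_placement = dict(placement)
    new_placement[chosen] = idlest
    return new_placement


units = ["pu0", "pu1"]
placement = {"parse": "pu0", "join": "pu0", "aggregate": "pu1"}
op_cost = {"parse": 2.0, "join": 5.0, "aggregate": 1.0}
after = rebalance_once(placement, op_cost, units)
loads_after = unit_loads(after, op_cost, units)
```

The granularity of the adjustment is one whole operator: every tuple destined for the moved operator follows it, which is why this style of balancing is coarse-grained.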
In yet another solution, a central scheduler routes each tuple at run-time to the processing units responsible for specific operators. Multiple processing units perform the set of operators for a tuple. Each processing unit is assigned an operator, and the central scheduler routes a tuple to those processing units. The order in which operators are applied to a tuple is decided at run-time. The technique can balance the loads among processing units by changing the order and the rate at which tuples are fed to the processing units. However, this approach cannot change which processing unit performs an operator, so it cannot balance loads by reassigning operators between processing units.
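Run-time tuple routing with a fixed operator-to-unit assignment can be sketched as below. The routing policy shown (send the tuple next to the pending operator whose unit has the shortest queue) is an assumption made for illustration, as are the names `next_operator` and `route`; queue draining is deliberately omitted to keep the sketch short.

```python
from typing import Dict, List, Sequence, Set


def next_operator(remaining: Set[str],
                  op_to_unit: Dict[str, str],
                  queue_len: Dict[str, int]) -> str:
    """Pick the pending operator whose fixed unit has the shortest queue."""
    # sorted() makes tie-breaking deterministic (alphabetical order).
    return min(sorted(remaining), key=lambda op: queue_len[op_to_unit[op]])


def route(operators: Sequence[str],
          op_to_unit: Dict[str, str],
          queue_len: Dict[str, int]) -> List[str]:
    """Return the order in which one tuple visits its operators."""
    remaining = set(operators)
    order: List[str] = []
    while remaining:
        op = next_operator(remaining, op_to_unit, queue_len)
        queue_len[op_to_unit[op]] += 1  # the tuple is enqueued at that unit
        order.append(op)
        remaining.remove(op)
    return order


# The operator-to-unit mapping is fixed; only the visiting order varies.
op_to_unit = {"filter": "pu0", "join": "pu1"}
order_when_pu0_busy = route(["filter", "join"], op_to_unit,
                            {"pu0": 3, "pu1": 0})
order_when_pu1_busy = route(["filter", "join"], op_to_unit,
                            {"pu0": 0, "pu1": 2})
```

Note that `op_to_unit` is never modified: the scheduler can reorder and pace tuples, but the work of a given operator always lands on the same processing unit.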