The present disclosure relates generally to scheduling work in a stream-based distributed computer system with configurable networks, and more particularly, to systems and methods for deciding which tasks to perform in such a system, including fractionally assigning the processing elements of those tasks to processing nodes and measuring the utility of the streams produced by the tasks.
Distributed computer systems, especially those with configurable networks and which are designed specifically to handle very large-scale stream processing jobs, are in their infancy. Several early examples augment relational databases with streaming operations. Distributed stream processing systems are likely to become very common in the relatively near future, and are expected to be employed in highly scalable distributed computer systems to handle complex jobs involving enormous quantities of streaming data.
In particular, systems including tens of thousands of processing nodes able to concurrently support hundreds of thousands of incoming and derived streams may be employed. These systems may have storage subsystems with a capacity of multiple petabytes. Some of these systems, which include the focus of this invention, are expected to have networks that are configurable, in order to deal with the dynamic nature of the streams in the system.
Focusing on the scheduling of work in such a streaming system, it is clear that an effective optimization method is needed to use the system properly. Consider the complexity of the scheduling problem as follows.
Referring to FIG. 1, a conceptual system is depicted for scheduling typical jobs. Each job 1-9 includes one or more alternative directed graphs 12 with nodes 14 and directed arcs 16. For example, job 8 has two alternative implementations, called templates. The nodes 14 correspond to tasks (which may be called processing elements, or PEs), and the directed arcs 16 that interconnect them correspond to streams. The streams may be either primal (incoming) or derived (produced by the PEs). The jobs themselves may be interconnected in complex ways by means of derived streams. For example, jobs 2, 3, and 8 are connected.
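By way of illustration only, the job structure described above may be sketched as a small data model. The sketch below is not taken from the disclosure: the class names `Stream`, `Template`, and `Job`, the PE and stream names, and the contents of the two templates for job 8 are all hypothetical, chosen only to show a job with alternative templates whose PEs are joined by primal and derived streams.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stream:
    """A directed arc; producer is None for a primal (incoming) stream."""
    name: str
    producer: Optional[str]  # name of the PE that produces it, or None if primal
    consumer: str            # name of the PE that consumes it

    @property
    def is_primal(self) -> bool:
        return self.producer is None


@dataclass
class Template:
    """One alternative directed graph of PEs and streams for a job."""
    pes: List[str]
    streams: List[Stream]

    def derived_streams(self) -> List[Stream]:
        # Derived streams are those produced by a PE rather than arriving from outside.
        return [s for s in self.streams if not s.is_primal]


@dataclass
class Job:
    job_id: int
    templates: List[Template]  # alternative implementations of the same job


# Hypothetical contents for job 8: FIG. 1 shows two templates, but the PE and
# stream names used here are placeholders, not taken from the figure.
template_a = Template(
    pes=["pe1", "pe2"],
    streams=[Stream("in", None, "pe1"), Stream("s1", "pe1", "pe2")],
)
template_b = Template(
    pes=["pe3"],
    streams=[Stream("in", None, "pe3")],
)
job8 = Job(job_id=8, templates=[template_a, template_b])
```

In this sketch, a stream is classified as primal or derived solely by whether it has a producing PE, which mirrors the distinction drawn in the text.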
Referring to FIG. 2A, a typical configurable distributed computer system 20A is shown. Clusters of processing nodes, represented as clusters 22A-22D, each include processing nodes (PNs) 23A-23D, respectively, that are interconnected by a network 24A. The interconnections are formed by links. The connectivity of the network 24A between the clusters 22A-22D of nodes 23A-23D may be modified to form a network 24B, as illustrated in a distributed computer system 20B of FIG. 2B. For instance, the capacity of the links between cluster 22A and cluster 22B is changed from 40 Gbps (in network 24A) to 30 Gbps (in network 24B). Connections between clusters 22A-22D may also disappear (e.g., the capacity between cluster 22A and cluster 22C is 0 Gbps in network 24B).
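By way of illustration only, a network configuration of this kind may be represented as a mapping from pairs of clusters to link capacities. The following is a minimal sketch under stated assumptions: the helper names `link`, `capacity`, and `changed_links` are hypothetical, and only the capacities stated in the text are included. The capacity between clusters 22A and 22C in network 24A is not stated above and is therefore omitted rather than guessed.

```python
from typing import Dict, FrozenSet, Tuple

Link = FrozenSet[str]        # unordered pair of cluster labels
Network = Dict[Link, float]  # link capacity in Gbps


def link(a: str, b: str) -> Link:
    """Build an unordered cluster pair so (a, b) and (b, a) are the same link."""
    return frozenset((a, b))


def capacity(net: Network, a: str, b: str) -> float:
    """Capacity between two clusters; a missing entry is treated as 0 Gbps."""
    return net.get(link(a, b), 0.0)


def changed_links(old: Network, new: Network) -> Dict[Link, Tuple[float, float]]:
    """Return {link: (old_capacity, new_capacity)} for every link that differs."""
    changes = {}
    for l in set(old) | set(new):
        before, after = old.get(l, 0.0), new.get(l, 0.0)
        if before != after:
            changes[l] = (before, after)
    return changes


# Only the values given in the text are modeled here.
network_24a: Network = {link("22A", "22B"): 40.0}
network_24b: Network = {
    link("22A", "22B"): 30.0,
    link("22A", "22C"): 0.0,  # connection absent in network 24B
}

delta = changed_links(network_24a, network_24b)
```

Treating a missing entry as 0 Gbps is a design choice of this sketch; it lets a link that disappears be expressed either by removing its entry or by setting its capacity to zero.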
Even at these sizes, streaming systems are expected to be essentially swamped at almost all times. Processors will be nearly fully utilized, the offered load (in terms of jobs) will far exceed the prodigious processing power capabilities of the systems, and the storage subsystems will be virtually full. Such goals make the design of future systems enormously challenging.
A patent application entitled “METHOD AND APPARATUS FOR SCHEDULING WORK IN A STREAM-ORIENTED COMPUTER SYSTEM,” U.S. patent application Ser. No. 11/374,192, filed on Mar. 13, 2006, describes a scheduler for stream processing systems. This application is commonly assigned to the assignees of the instant application and is incorporated by reference herein in its entirety. The scheduler disclosed in U.S. patent application Ser. No. 11/374,192 is for use in static networks.
What is needed is a way to resolve the aforementioned scheduling problems for stream processing systems with configurable networks.