Data-intensive services are now widely deployed; typical examples include financial services, network monitoring, telecommunications data management, and Web applications. In a data-intensive service, data arrives in large volumes and at high speed, and varies over time. A durable, stable relational model is therefore poorly suited to modeling such data, whereas a transient data stream model is well suited, which has given rise to research on data stream computing. Data stream computing is a pipeline-like data processing mode that rests on the idea that the value of data decreases as time elapses. Therefore, after an event triggers generation of data, the data needs to be processed as soon as possible. Ideally, data is processed the moment it is generated, that is, processing is performed once for each event as soon as the event occurs, instead of buffering data for batch processing.
In a stream computing system, data stream computing is performed based on a streaming data processing model. As shown in FIG. 1, service data processing logic generally needs to be converted into a data processing mode represented by a directed acyclic graph (DAG, also referred to as a flow graph). In the graph, each operator (Operator) carries a data processing operation, each data stream (stream) represents data transmission between Operators, and all Operators may be executed in a distributed mode.
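For illustration only, the flow graph described above may be sketched as follows. This is a minimal single-process sketch under stated assumptions, not the implementation of any particular stream computing system: the class `FlowGraph`, its methods, the Operator names (`parse`, `filter`, `double`, `label`), and the convention that an Operator returning `None` drops a tuple are all hypothetical. Each event is pushed through the graph as soon as it arrives, rather than being buffered for batch processing.

```python
from collections import deque
from typing import Callable, Dict, List, Optional

# Hypothetical names throughout; this is not the API of any real stream engine.
OperatorFn = Callable[[object], Optional[object]]


class FlowGraph:
    """Operators are nodes; streams are directed edges between Operators."""

    def __init__(self) -> None:
        self.operators: Dict[str, OperatorFn] = {}
        self.streams: Dict[str, List[str]] = {}

    def add_operator(self, name: str, fn: OperatorFn) -> None:
        self.operators[name] = fn
        self.streams.setdefault(name, [])

    def add_stream(self, src: str, dst: str) -> None:
        if src not in self.operators or dst not in self.operators:
            raise KeyError("both ends of a stream must be registered Operators")
        self.streams[src].append(dst)

    def topological_order(self) -> List[str]:
        """Return Operators in dependency order; raise if the graph has a cycle."""
        indegree = {name: 0 for name in self.operators}
        for dsts in self.streams.values():
            for dst in dsts:
                indegree[dst] += 1
        ready = deque(name for name, deg in indegree.items() if deg == 0)
        order: List[str] = []
        while ready:
            node = ready.popleft()
            order.append(node)
            for dst in self.streams[node]:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
        if len(order) != len(self.operators):
            raise ValueError("flow graph contains a cycle, so it is not a DAG")
        return order

    def process(self, source: str, event: object) -> Dict[str, object]:
        """Push one event through the graph as soon as it arrives (no batching)."""
        self.topological_order()  # validates acyclicity before running
        results: Dict[str, object] = {}

        def emit(name: str, value: object) -> None:
            out = self.operators[name](value)
            if out is None:  # assumption: None means the Operator drops the tuple
                return
            downstream = self.streams[name]
            if not downstream:
                results[name] = out  # sink Operator: collect its output
            for dst in downstream:
                emit(dst, out)

        emit(source, event)
        return results


# Example graph: parse -> filter -> {double, label}
graph = FlowGraph()
graph.add_operator("parse", lambda line: int(line))
graph.add_operator("filter", lambda x: x if x > 10 else None)
graph.add_operator("double", lambda x: x * 2)
graph.add_operator("label", lambda x: f"high:{x}")
graph.add_stream("parse", "filter")
graph.add_stream("filter", "double")
graph.add_stream("filter", "label")
```

In this sketch, calling `graph.process("parse", "42")` passes the event through every Operator reachable from `parse` and returns the outputs of the two sink Operators, whereas an event that the `filter` Operator drops produces no output. A real system would place the Operators on different machines and carry each stream over the network; the sketch keeps everything in one process only to make the graph structure visible.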
In the prior art, one solution for setting a streaming data processing model for data stream computing is as follows: physical equipment (PE, also referred to as an execution unit) and logical units (generally marked as an Operator in a DAG, also referred to as a working node) are in a many-to-one relationship. This solution supports static configuration of the parallelism degree of an Operator. That is, in a service execution process, each Operator invokes a quantity of execution units corresponding to the parallelism degree statically configured for that Operator by a user, so as to process a data stream generated by a service.
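The statically configured parallelism degree described above may be sketched as follows. This is an illustrative sketch only: the names `ExecutionUnit` and `StaticOperator`, and the round-robin dispatch of tuples to execution units, are assumptions made for the example and are not taken from any specific prior-art system. The point of the sketch is that the quantity of execution units is fixed when the Operator is created and does not change while the service runs.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class ExecutionUnit:
    """One execution unit (PE) serving a single Operator (hypothetical model)."""
    unit_id: int
    outputs: List[object] = field(default_factory=list)


class StaticOperator:
    """An Operator whose parallelism degree is fixed by the user up front."""

    def __init__(self, name: str, fn: Callable[[object], object],
                 parallelism: int) -> None:
        if parallelism < 1:
            raise ValueError("parallelism degree must be at least 1")
        self.name = name
        self.fn = fn
        # Many execution units map to this one Operator; the count never changes.
        self.units = [ExecutionUnit(i) for i in range(parallelism)]

    def process(self, tuples: Iterable[object]) -> None:
        # Assumption: tuples are dispatched to execution units round-robin.
        for index, item in enumerate(tuples):
            unit = self.units[index % len(self.units)]
            unit.outputs.append(self.fn(item))

    def load(self) -> List[int]:
        """Number of tuples each execution unit has handled so far."""
        return [len(unit.outputs) for unit in self.units]
```

For example, an Operator created as `StaticOperator("square", lambda x: x * x, parallelism=3)` always uses three execution units, regardless of whether the incoming stream later grows or shrinks.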
Because a stream computing system is generally a distributed real-time stream processing system, the processing conditions of tasks in the system change in real time. Under such changing conditions, the parallelism degree initially set by a user is in many cases not optimal. A streaming data processing model generated according to that initial parallelism degree therefore cannot adapt to real-time changes in the system, which wastes resources in the stream computing system and greatly limits its data processing capability.