The present invention relates to distributed computing, and more specifically, to monitoring behavior of components of a distributed application.
In a streams processing environment, multiple nodes in a computing cluster execute a distributed application. The distributed application retrieves a stream of input data from a variety of data sources and analyzes the stream. A stream is composed of data units called “tuples,” each of which is a list of values. Further, the distributed application includes processing elements that are distributed across the cluster nodes. Each processing element includes one or more operators configured to perform a specified task associated with a tuple. Each processing element receives one or more tuples as input and processes the tuples through the operators. Once processing is complete, the processing element may output one or more resulting tuples to another processing element, which in turn performs a specified task on those tuples, and so on.
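For illustration only, the flow of tuples through processing elements and their operators might be sketched as follows. The class name `ProcessingElement`, the operator functions, and the sample data are hypothetical and are not drawn from any particular streams processing product; the sketch runs in a single process and ignores distribution across cluster nodes.

```python
from typing import Any, Callable, Iterable, List, Tuple

# Assumption: a stream tuple is modeled as a plain Python tuple of values.
StreamTuple = Tuple[Any, ...]
# Assumption: an operator maps one input tuple to zero or more output tuples.
Operator = Callable[[StreamTuple], List[StreamTuple]]


class ProcessingElement:
    """Hypothetical processing element holding an ordered list of operators."""

    def __init__(self, operators: List[Operator]) -> None:
        self.operators = operators

    def process(self, tuples: Iterable[StreamTuple]) -> List[StreamTuple]:
        current = list(tuples)
        for op in self.operators:
            # Pass every tuple through the operator and collect its outputs.
            current = [out for t in current for out in op(t)]
        return current


def keep_positive(t: StreamTuple) -> List[StreamTuple]:
    # Filter operator: drop tuples whose second value is not positive.
    return [t] if t[1] > 0 else []


def double_value(t: StreamTuple) -> List[StreamTuple]:
    # Transform operator: double the second value.
    return [(t[0], t[1] * 2)]


def label(t: StreamTuple) -> List[StreamTuple]:
    # Enrichment operator: append a label based on the second value.
    return [(t[0], t[1], "high" if t[1] >= 10 else "low")]


# The first processing element outputs its tuples to the second one.
pe1 = ProcessingElement([keep_positive, double_value])
pe2 = ProcessingElement([label])

source = [("a", 3), ("b", -1), ("c", 7)]
result = pe2.process(pe1.process(source))
print(result)
```

In this sketch, each operator may emit zero, one, or several tuples, which is one simple way to model both filtering and transformation with a single interface.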
A developer of the distributed application may design an operator graph using an integrated development environment (IDE) tool. The operator graph specifies a desired configuration of processing elements and operators in the streams processing environment. Via the operator graph, the developer may define functions for each operator to perform. The functions can specify a given task to perform and a destination processing element for tuple output. Further, the IDE tool may provide a debugger that allows the developer to ensure that the distributed application executes in the streams processing environment as specified.
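As a minimal sketch of the idea, an operator graph can be represented as a set of named operators, each paired with a function to perform and a list of destinations for its output tuples. The `OperatorGraph` class, its methods, and the operator names below are hypothetical, and the sketch assumes an acyclic graph executed in a single process rather than across a cluster.

```python
from collections import deque
from typing import Any, Callable, Dict, Iterable, List, Sequence, Tuple

StreamTuple = Tuple[Any, ...]
OperatorFn = Callable[[StreamTuple], List[StreamTuple]]


class OperatorGraph:
    """Hypothetical operator graph: a function and destinations per operator."""

    def __init__(self) -> None:
        self.functions: Dict[str, OperatorFn] = {}
        self.edges: Dict[str, List[str]] = {}

    def add_operator(
        self, name: str, fn: OperatorFn, destinations: Sequence[str] = ()
    ) -> None:
        # Record the task to perform and where its output tuples are sent.
        self.functions[name] = fn
        self.edges[name] = list(destinations)

    def run(self, start: str, tuples: Iterable[StreamTuple]) -> List[StreamTuple]:
        # Assumes the graph is acyclic; tuples leaving an operator with no
        # destinations are collected as the final output.
        output: List[StreamTuple] = []
        queue = deque((start, t) for t in tuples)
        while queue:
            name, t = queue.popleft()
            for out in self.functions[name](t):
                if not self.edges[name]:
                    output.append(out)
                for dest in self.edges[name]:
                    queue.append((dest, out))
        return output


graph = OperatorGraph()
graph.add_operator("parse", lambda t: [(t[0], int(t[1]))], ["filter"])
graph.add_operator("filter", lambda t: [t] if t[1] >= 10 else [], ["sink"])
graph.add_operator("sink", lambda t: [t])

graph_result = graph.run("parse", [("x", "12"), ("y", "4")])
print(graph_result)
```

Keeping the destinations separate from the operator functions means the routing can be changed without touching the task logic, which loosely mirrors how a graph-based design tool separates wiring from operator behavior.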