In large distributed systems, a user's application service request often passes through many service modules before it is executed. For example, in Alibaba's cloud computing platform, a single operation request from a user passes through scheduling, communications, indexing, distributed storage, and other service modules, and completing it involves many operations, such as updating the index buffer, maintaining meta-information, writing files, and writing access logs. These service modules typically run in different processes on hundreds of servers and are built from different software programs.
Currently, monitoring and analysis of user-invoked distributed system behavior is often concentrated on a single service component of the distributed system, for example, monitoring the reads and writes of a file system or monitoring the throughput of an upper-level system. Because such an approach analyzes only a single service module, it cannot accurately capture the effect of a user's application service request on the distributed system as a whole.