Today's large-scale online services, such as Web search services and e-commerce services, often include many servers distributed among data centers at various locations. These servers may receive and fulfill hundreds of thousands of requests from users each day. A typical large-scale service has a multi-tier architecture to achieve performance isolation and facilitate systems management. Within each tier, servers may be replicated and/or partitioned to enable scalability, balance load, and enhance reliability.
Because of the complexity of these large-scale online services, planners often find it difficult to predict service performance when the services undergo a reconfiguration, disruption, or other change. Planners currently lack adequate tools to identify and measure service performance, and such measurements may be used to make strategic decisions about the services.