Server farms have become increasingly common as a means of providing vast amounts of computing resources. For example, server farms may be utilized to provide a wide variety of services, such as to store and retrieve data (e.g., a storage system), process financial data, route and store email, communicate instant messages, provide authentication services, output web pages, and so on. As the amount of computing resources desired in providing these services increases, the server farm may be “scaled out” by adding computers, thereby providing a flexible topology in which resources may be added “as needed”.
Capturing transaction processing performance in such systems is difficult, however, which in turn makes it difficult to determine the aggregate quality of service (QoS) of the server farm. Additionally, when QoS falls below targets, it is difficult to determine whether the cause of the service quality problem is related to a subset of the farm. Further, because collections of server farms (i.e., clusters) are used to service enormous transaction loads (e.g., billions of transactions), capturing QoS data and segregating that information is also difficult.