Identifying performance problems, in particular scalability bottlenecks, is difficult and often left to a small number of performance experts. Knowing whether a system runs smoothly requires a careful collation of disparate observations. Typically, a performance expert knows the symptoms of certain classes of problems: e.g., errors in log files, or excessive time spent in garbage collection or waiting on data sources. After collecting as much data as is feasible to dredge from the system, the expert proceeds with the tedious task of filtering and combining that data, and of applying rules, to interpret what the raw data implies about the quality of performance that the system currently achieves. Performance is often suboptimal due to a superposition of unrelated problems. The expert casts a wide net of data collection in order to identify these problems so that they can be prioritized. Once the largest problem has been fixed, the process iterates.
A few existing tools focus on identifying contended resources. However, those tools are of limited use for the general task of identifying threads that are idle and unable to make progress, because they do not cast a wide net. Rather, they generally focus on a particular class of problems, such as finding contended locks. That is, each of these tools focuses on one point in the space of scalability analysis. Such point tools can be effective once the class of a bottleneck is known. However, determining the class of a bottleneck is, in itself, a challenging step.