The increasing scale and complexity of information processing in modern distributed systems has raised many challenging problems. Examples of such distributed processing systems include systems that process complex business process workflows, information/data stream processing systems, and management and provisioning systems. Such systems may be expected to handle a large number of processing requests, which makes it difficult to understand scalability issues in systems of this magnitude. A known methodology uses end-to-end measurements to estimate how the end-to-end delay decomposes across the different nodes involved in the end-to-end flow. However, such a method is restricted to understanding application-level scalability and may only be able to identify bottlenecks at a high level, e.g., at the node level.