Organizations' use of information technology (IT) and infrastructure computing resources is moving away from a static environment toward a more dynamic and fluid computing environment. Traditionally, an organization's computing resources existed on fixed infrastructure that the organization owned and controlled directly. However, with the virtualization of computing resources and the rise of shared computing environments (e.g., cloud computing), a computing resource consumer's application and computing service requests may reside on and use a variety of dynamic virtual systems and resources, and may use any number of service providers to meet the user's service-level agreements (SLAs).
Traditionally, the application owner also owned the computing infrastructure, so that the same entity managed and maintained the data center. The data center assigns the consumer's application to a particular set of computing resources (e.g., particular computing clusters) in a physical data center. Even when the required number of nodes scales, the particular nodes assigned at any time come from the allocated set of nodes. In a virtualized cloud computing environment, the user can scale the user's resource utilization across multiple computing environments and service providers, and is no longer tied to a fixed number of nodes in a particular cluster or a particular data center.
Infrastructure as a service and platform as a service, as provided by cloud computing service providers, provide a user with a set of resources, such as a set of virtual machines of different computing sizes, capacities, and throughput rates. For example, a small instance may be configured with limited processing resources, and a large instance would have relatively greater processing resource capabilities. Traditionally, where the user also owned the computing environment, the user had direct native access to resource utilization and performance information, and access to all the monitoring metrics and logging information output from the user's computing environment. In contrast to today's service providers, in scientific research computing environments (for example, where a national organization hosts the computing environment for researchers), the researchers may be provided direct access to performance information regarding the physical infrastructure that may affect the researchers' computing utilization.
In a shared computing environment, a user's application is decoupled from the infrastructure environment. Because users can now decouple an application from the native computing environment (infrastructure) and deploy the application in a dynamic virtual cloud computing environment, users no longer have native visibility into the state of the cloud computing environment provided by third-party service providers, and so may be unable to monitor and control the performance of the application. The cloud computing service provider may provide hooks (e.g., Amazon CloudWatch) that offer passive instrumentation or views into the computing environment, so that the user may monitor metrics regarding the computing resources used by the user's application (e.g., virtual machines, CPU usage, memory usage, and the number of reads and writes performed for an application by the user's assigned virtual machine). However, although the user's virtual resources coexist with any number of other virtual resources used by other users on the same physical infrastructure (e.g., multi-tenant, multi-class users), the service provider does not give the user the ability to ascertain the actual state of the computing environment as a whole. For example, virtual machines on the same physical cluster of servers impact each other as they consume shared resources such as CPU, memory, network, and disk, but a virtual machine cannot directly view the resource use of other virtual machines. As another example, read and write accesses to storage volumes on the same physical disk impact each other, as do communication streams sharing the same network. The user in the shared environment sees only the activity of the user's assigned virtual resources, not the environment in total, and therefore cannot understand how the consumption of other virtual resources and processes running in the shared environment is impacting the user specifically.
When the user observes performance degradation that the user cannot account for using the passive monitoring provided by the service provider, the user has no way to understand the actual impact that other users coexisting in the cloud computing environment are having on the user.
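The visibility gap described above can be illustrated with a minimal sketch. It is not drawn from any service provider's interface: the names `TenantView`, `allocate_cpu`, and `tenant_view`, the VM identifiers, and the proportional-share scheduling rule are all assumptions chosen for illustration. The sketch models one physical host whose CPU is shared among virtual machines, and shows that the metrics visible to one tenant change when a co-resident virtual machine increases its demand, while the view itself contains no information about that neighbor.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TenantView:
    """Metrics one tenant can observe passively (hypothetical structure)."""
    vm_id: str
    cpu_requested: float  # cores the tenant's VM tried to use
    cpu_granted: float    # cores the tenant's VM actually received


def allocate_cpu(host_cores: float, demands: dict[str, float]) -> dict[str, float]:
    """Share host CPU among VMs (assumed rule: scale all demands
    proportionally when the host is oversubscribed)."""
    total = sum(demands.values())
    if total <= host_cores:
        return dict(demands)
    scale = host_cores / total
    return {vm: demand * scale for vm, demand in demands.items()}


def tenant_view(vm_id: str, host_cores: float, demands: dict[str, float]) -> TenantView:
    """Build the view for a single tenant; neighbors' demands are not exposed."""
    granted = allocate_cpu(host_cores, demands)
    return TenantView(vm_id, demands[vm_id], granted[vm_id])


# Same tenant, same request, on the same 8-core host; only the neighbor changes.
quiet = tenant_view("vm-a", 8.0, {"vm-a": 4.0, "vm-b": 2.0})
noisy = tenant_view("vm-a", 8.0, {"vm-a": 4.0, "vm-b": 12.0})

print(quiet)
print(noisy)
```

In this sketch the tenant's request is identical in both cases, yet the granted CPU falls when the neighbor's demand rises. Nothing in `TenantView` lets the tenant attribute the drop to the other virtual machine, which is the limitation of passive monitoring discussed above.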