Resource utilization (consumption) is one of the critical characteristics of any computing task, and this is especially true in a grid computing environment. In general, a vast quantity of computing power is wasted because resources are under-utilized. To date, planning and sizing for computing requirements have typically been based on peak demand. However, statistically speaking, actual resource utilization is usually on the order of 60% for IBM S/390 (zSeries) machines, and under 10% for current AIX and Intel machines. Harnessing the unutilized computing power can provide immediate economic benefits to any organization that has a large installed base of servers.
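As a rough illustration of the arithmetic behind these figures, the following minimal sketch estimates how much installed capacity sits idle at a given average utilization. The function name and the fleet size of 100 capacity units are hypothetical; the utilization values are the approximate figures quoted above, with 10% taken as an upper bound for the AIX and Intel machines.

```python
def idle_capacity(installed_units: float, avg_utilization: float) -> float:
    """Return the capacity that is unused on average.

    installed_units: total installed capacity, in any consistent unit (hypothetical).
    avg_utilization: average utilization as a fraction between 0 and 1.
    """
    if not 0.0 <= avg_utilization <= 1.0:
        raise ValueError("avg_utilization must be between 0 and 1")
    return installed_units * (1.0 - avg_utilization)


# Hypothetical installed base of 100 capacity units per platform.
mainframe_idle = idle_capacity(100, 0.60)  # machines running at about 60%
server_idle = idle_capacity(100, 0.10)     # machines running at about 10%
```

Under these assumptions, the lightly loaded servers leave most of their installed capacity unused, which is the capacity a grid could harness.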
Grid infrastructure is defined as flexible, secure, coordinated resource sharing among a dynamic collection of individuals, institutions, and resources. It is distinguished from conventional distributed (enterprise) computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high-performance orientation. The collection of individual resources and institutions that contribute resources to a particular grid and/or use the resources in that grid is referred to as a virtual organization. It represents a new approach to computing and problem solving based on collaboration among multiple disciplines in computation and data-rich environments. To add a resource under the grid infrastructure, current resource utilization information is needed. This information is an important attribute for the “Grid Resource Manager,” which allocates the resources within the grid based on the resource requirements of the application(s).
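To make the role of current utilization information concrete, the following is a minimal sketch of how a resource manager might match an application's requirements against the free capacity of each resource. It is not the Grid Resource Manager's actual algorithm: the `Resource` class, the `select_resource` function, and the policy of preferring the resource with the most free CPU are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Resource:
    """Hypothetical snapshot of one grid resource's current utilization."""
    name: str
    cpu_total: float      # e.g. number of processors
    cpu_used: float
    mem_total_mb: float
    mem_used_mb: float

    @property
    def cpu_free(self) -> float:
        return self.cpu_total - self.cpu_used

    @property
    def mem_free_mb(self) -> float:
        return self.mem_total_mb - self.mem_used_mb


def select_resource(resources: Sequence[Resource],
                    cpu_needed: float,
                    mem_needed_mb: float) -> Optional[Resource]:
    """Pick a resource that can satisfy the application's requirements.

    Assumed policy: among resources with enough free CPU and memory,
    choose the one with the most free CPU. Returns None if none fits.
    """
    candidates = [r for r in resources
                  if r.cpu_free >= cpu_needed and r.mem_free_mb >= mem_needed_mb]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.cpu_free)


# Hypothetical utilization data for three machines.
grid = [
    Resource("a", cpu_total=8, cpu_used=7, mem_total_mb=16000, mem_used_mb=4000),
    Resource("b", cpu_total=8, cpu_used=2, mem_total_mb=16000, mem_used_mb=12000),
    Resource("c", cpu_total=4, cpu_used=1, mem_total_mb=8000, mem_used_mb=1000),
]
chosen = select_resource(grid, cpu_needed=2, mem_needed_mb=2000)
```

The sketch only works if each resource's utilization figures are current, which is why that information must be available before a resource is added to the grid.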
Approximation or estimation of computing resources is also needed when installing software packages. Specifically, before installing a software package or application on a given computer, a user needs to know what resources (e.g., memory, CPU, etc.) will be required. Another important question is how much time and how many computing resources are needed to run a given task on a given machine while other tasks and services run in the background. Usually, to find answers to these questions, users turn to the software documentation, which may contain a list of resource requirements and, sometimes, application performance data. The problem is that such documentation data is valid only for one particular hardware/software configuration, and can hardly be applied to any other configuration. In addition, the performance data is usually obtained in an experiment in which the given software task was running in parallel with other tasks and services. There is no easy way to estimate how much the performance data will change if the machine load changes.
In many computer environments, such as a Grid infrastructure, multiple computer systems are typically interconnected in a cluster or the like. To date, approximation of resource consumption is a task that is performed on each such computer system. Thus, the more computer systems that are provided, the more calculations must be made. Heretofore, no system has suggested a way to consolidate the resource consumption calculations onto one computer system. Such a methodology would significantly streamline the calculations and reduce the number that must be performed.