With the advent of networks such as the Internet, distributed computing has become an increasingly popular computing approach as it allows for the sharing of computational resources (e.g., memory, processing time, input/output, etc.) among many different users or systems. One such example is “grid computing” (or the use of a computational grid), which involves applying the resources of many computers in a network to a single problem at the same time, usually a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data. In other cases, resources are shared to perform relatively disparate functions. For instance, because of the different time zones involved, remotely located divisions of a corporation might share a high-end server capable of performing complex computations. Regardless of the environment, the available resources in the distributed environment must be carefully allocated to meet the demand for them.
Moreover, with the advent of new distributed computing technologies, such as IBM's POWER5™ processor, “simultaneous multithreading” is provided in which multiple threads of execution can execute on the same processor at the same time. Using this particular processor, the operating system's task dispatcher “sees” four available processors onto which a task may be dispatched: there are two physical processors and, because of the multithreading capabilities, two logical processors per physical processor. Given these capabilities, there again exists a need for a robust solution that is capable of optimally allocating resources in a dynamic fashion.
Existing solutions for allocating resources in a distributed computing environment involve: (1) domain knowledge/expert-based systems; (2) statistics-based systems; and (3) machine learning models, such as neural networks. Unfortunately, each of these approaches has drawbacks. For instance, domain-knowledge-based or expert-system-based tools and methodologies are only as good as the experts themselves. This dependence introduces an element of subjectivity and non-standardization, which can result in variances in performance and a lack of rigor that can lead to sub-optimal resource allocation.
Statistics-based systems begin with a statistical model, e.g., regression analysis using a linear equation or a cubic spline. Again, this approach carries inherent assumptions about the data dynamics, which are driven by the expertise of the end user rather than by proactive data exploration. This may lead to sub-optimal system configuration and parameter settings, hampering the performance of the resultant configuration.
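The effect of such a built-in assumption can be illustrated with a minimal sketch. The data, variable names, and the `fit_line` helper below are hypothetical and are not drawn from any particular system; the sketch simply fits a linear equation by ordinary least squares to resource-usage observations that are in fact non-linear, and then extrapolates.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: number of concurrent tasks vs. CPU seconds used.
# The values are made up and deliberately quadratic (cpu = tasks ** 2).
tasks = [1, 2, 3, 4, 5]
cpu_seconds = [1, 4, 9, 16, 25]

# The modeler assumes a linear relationship up front.
slope, intercept = fit_line(tasks, cpu_seconds)

# Extrapolating to 10 concurrent tasks with the linear model.
predicted_at_10 = slope * 10 + intercept
actual_at_10 = 10 ** 2  # what the quadratic process would really require

print(f"linear model: cpu = {slope} * tasks + {intercept}")
print(f"predicted at 10 tasks: {predicted_at_10}, actual: {actual_at_10}")
```

In this toy case the linear model would provision roughly half of the CPU time actually required at 10 tasks, which is the kind of sub-optimal configuration that can follow when the form of the model is chosen before the data are explored.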
Machine-learning-model-based approaches, such as neural networks, are useful tools that can deal with a lot of data at the same time, are scalable, and learn quickly. However, a neural network is not easy for a user to interpret, and hence this method is not suitable for verifying “rules” from a validation standpoint.
There are other methods, such as fuzzy rule-based modeling, that come close to exploiting non-linearity in the system. However, the rule base is a fuzzy-logic-based representation of expert knowledge, which leaves it vulnerable to the same types of issues and drawbacks as those described above.
Accordingly, a need exists for a robust solution for allocating resources in a distributed computing environment that is not subject to the limitations described above.