1. Field of the Invention
This invention relates to the field of computer software and, more particularly, to debugging applications in a grid environment.
2. Description of the Related Art
A grid computing environment is a distributed computing environment where computing, application, storage, and/or network resources can be shared across geographically dispersed organizations. An ideal grid computing environment allows flexible, secure, coordinated resource sharing among dynamic collections of individuals, organizations, and resources. In the grid environment, a variety of computing resources that contribute to a virtual resource pool can be transparently utilized on an as-needed basis. Grid computing resources in the virtual resource pool can be treated as commodities or services, which can be consumed in a manner similar to the commercial consumption of electricity and water.
While grid computing may presently be at an early stage in its evolution, several grid computing environments have been successfully implemented. One noteworthy implementation is the NC BioGrid Project, which was implemented in the fall of 2001 to enable researchers and educators throughout North Carolina to pool computing resources for use in sequencing genes and related genetic research. Other notable grid implementations include SETI@home, the Drug Design and Optimization Lab (D2OL), and EUROGRID. Additionally, commercially available software products exist for establishing a customizable grid computing environment, such as Avaki's data grid from Avaki of Burlington, Mass. and Grid MP Enterprise from United Devices of Austin, Tex. Further, a number of readily available toolkits and standards have been developed for creating a grid computing environment including, for example, the Globus Toolkit provided by the Globus project and the Open Grid Services Architecture (OGSA).
A grid computing environment can include multiple application domains. Each application domain can include a set of computing resources that perform a series of related tasks. Examples of application domains include, but are not limited to, word processors, database programs, Web browsers, development tools, drawing applications, image editing programs, and communication programs. The various computing resources of one application domain can be distributed across several different grids within a grid computing environment, where each grid can contain a myriad of diverse hardware components, such as communication lines, networking routers, servers, workstations, peripherals, intranets, and the like.
Conventional techniques for debugging an application domain that spans multiple locations of a grid environment have a number of shortcomings. For example, a test version of an application domain can be installed within a non-distributed test computer designed so that the application domain can execute within a single computing space. In some instances, various aspects of the test computer can be constructed to emulate various aspects of a grid computing environment. Conventional debuggers can then be utilized to analyze the application domain. Unfortunately, the complex intermeshing of applications, users, and processes that exists within the grid environment cannot be accurately emulated within the test computer. Accordingly, the test computer abstracts away the very complications of the grid environment that make specialized debugging necessary.
Generally, off-line debugging tools are of dubious value in troubleshooting problems occurring within a production grid environment. That is, off-line debugging tools cannot predict the side effects that the sharing of various computing resources will have upon the application domains participating in the sharing. For example, two applications, each individually stable, can be commonly deployed within a grid environment. As a result of interactions, both applications can experience problems. Alternatively, only one of the applications can operationally exhibit problems, yet the problems can result from flaws within the other, operationally functional, application. Problems and complexities increase as the number of applications sharing grid resources increases.
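By way of illustration only, the following minimal sketch shows how two applications that each run correctly in isolation can interfere once they draw on the same pool. The names SharedPool, run_app, and demo, as well as the capacity and slot counts, are hypothetical and are not taken from any grid toolkit; a real grid would involve far more complex interactions.

```python
class SharedPool:
    """Hypothetical stand-in for a virtual resource pool with a fixed number of slots."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.in_use = 0

    def acquire(self, slots):
        # Refuse the request if it would exceed the pool's capacity.
        if self.in_use + slots > self.capacity:
            return False
        self.in_use += slots
        return True


def run_app(pool, slots_needed):
    """Hypothetical application: it succeeds only if it obtains all of its slots."""
    return pool.acquire(slots_needed)


def demo():
    # Each application tested alone, as on an off-line test computer.
    alone_a = run_app(SharedPool(4), 3)
    alone_b = run_app(SharedPool(4), 3)

    # Both applications deployed against the same shared pool.
    shared = SharedPool(4)
    together_a = run_app(shared, 3)
    together_b = run_app(shared, 3)
    return alone_a, alone_b, together_a, together_b


if __name__ == "__main__":
    print(demo())
```

In this sketch the failure surfaces in the second application, although the condition that triggers it is the first application's allocation, which an isolated test of either application would not reveal.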
One attempted solution that can function within an operational environment involves incorporating special debugging messages within the source code of application programs deployed into a grid environment. These debugging messages, however, can add significant overhead to each application program, slowing down operational performance. Moreover, including effective debugging messages within each suitable code segment of an application domain can substantially increase the development time necessary for the application domain.
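As a rough illustration of this approach, the following sketch embeds hand-written debugging messages in a hypothetical grid task using the standard Python logging module. The function transfer_blocks, the logger name, and the sample data are assumptions made for the example, and CountingHandler exists only to make the number of emitted messages visible.

```python
import logging


class CountingHandler(logging.Handler):
    """Counts the debug records it receives instead of writing them anywhere."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def emit(self, record):
        self.count += 1


def transfer_blocks(blocks, log):
    """Hypothetical grid task: copy blocks, with debug messages written by the developer."""
    copied = []
    for index, block in enumerate(blocks):
        log.debug("copying block %d of %d", index + 1, len(blocks))
        copied.append(block)
        log.debug("block %d copied (%d bytes)", index + 1, len(block))
    return copied


def run_demo():
    log = logging.getLogger("grid_app_demo")
    log.setLevel(logging.DEBUG)
    log.propagate = False  # keep the demo's records out of the root logger
    handler = CountingHandler()
    log.addHandler(handler)
    try:
        copied = transfer_blocks([b"abc", b"de", b"f"], log)
    finally:
        log.removeHandler(handler)
    return len(copied), handler.count


if __name__ == "__main__":
    print(run_demo())
```

Even in this small case, every unit of useful work carries two extra logging calls that the developer had to write and that the application must execute at run time.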
Additionally, for debugging messages to have value, application domain developers have to predict in advance the types of debugging messages that will be necessary to troubleshoot application domains. Such predictions can be nearly impossible for software deployed in a dynamic environment where hardware components and computing resources are constantly integrated and legacy resources are modified or removed. Even if these challenges are successfully met for a particular application, other application domains within a grid environment may not be so carefully constructed, which can be problematic since inconsistencies in debugging among applications can prevent the reporting and troubleshooting of inter-application conflicts. Consequently, including specialized debugging messages within the source code of grid-based applications does not resolve the shortcomings of grid-based debugging.