1. The Field of the Invention
The present invention relates to debugging distributed applications. More specifically, the present invention relates to systems, methods, and computer-program products for including debug controls along with distributed application data in messages that are utilized by distributed applications during normal operation.
2. Background and Relevant Art
Computer systems and related technology affect many aspects of society. Indeed, the computer system's ability to process information has transformed the way we live and work. Computer systems now commonly perform a host of tasks (e.g. information management, scheduling, and word processing) that prior to the advent of the computer system were typically performed manually. More recently, computer systems have been coupled to one another to form computer networks over which computer systems may transfer data electronically.
Initially, a significant portion of data transfer on computer networks was performed using specific applications (e.g., electronic mail applications) to transfer data files from one computer system to another. For example, a first user at a first networked computer system could electronically mail a data file to a second user at a second networked computer system. However, program execution (e.g., running the electronic mail application) and data access (e.g., attaching the data file to an electronic mail message) were performed almost entirely at a single computer system (e.g., at the first computer system). That is, a computer system would execute programs and access data from storage locations contained in the computer system. Thus, being coupled to a network would not inherently give one networked computer system the ability to access data files from another networked computer system. Only after a user actively sent a data file to a computer system could the computer system access the data file.
More recently, however, as the availability of higher-speed networks has increased, many computer networks have shifted towards a distributed architecture. Such networks are frequently referred to as distributed systems. Distributed systems function to “distribute” program execution and data access across the modules of a number of different computer systems coupled to a network.
In a distributed system, modules connected to a common network interoperate and communicate with one another in a manner that may be transparent to a user. For example, a user of a client computer system may select an application program icon from a user interface, thereby causing an application program stored at a server computer system to execute. The user interface may indicate to the user that the application program has executed, but the user may be unaware, and in fact may not care, that the application program was executed at the server computer system. The client computer system and the server computer system may communicate in the background to transfer the user's commands, program responses, and data between the client computer system and the server computer system.
Often, a distributed system includes a substantial number of client computer systems and server computer systems. In many cases, computer systems of a distributed system may function both as client computer systems and server computer systems, providing data and resources to some computer systems and receiving data and resources from other computer systems. Each computer system of a distributed system may include a different configuration of hardware and software modules. For example, computer systems may have different types and quantities of processors, different operating systems, different application programs, and different peripherals. Additionally, the communications path between computer systems of a distributed system may include a number of networking components, such as, for example, firewalls, routers, proxies and gateways. Each networking component may include one or more software or hardware modules that condition and/or format portions of data so as to make them accessible to other modules in the distributed system.
In some cases, “distributed applications” are specifically designed for execution in a distributed system. Due to the number of modules that may be included in a distributed system, properly designing and configuring distributed applications is significantly more complex than designing and configuring applications for execution at a single computer system. Each portion of a distributed application, in addition to being configured for proper operation in a stand-alone mode, must also be configured to appropriately communicate with other portions of the distributed application, as well as other modules in associated distributed systems. As such, distributed applications often require “debugging” to help ensure desired operation. Debugging may be performed to find and remove defects (or “bugs”) that might cause data corruption or cause modules of the distributed system to crash.
One common debugging technique used to debug distributed applications is to physically attach a debugging console to a computer system that contains a portion of a distributed application requiring debugging. The debugging console interacts with the distributed application to help determine if input data to and output data from the distributed application are correct and, in the event that data is not correct, provides some indication of why the data is not correct. However, in some cases a user desiring to perform debugging operations may not have physical access to the computer system containing the portion of the distributed application that needs debugging, and thus no debugging console can be attached. For example, a user at a client computer system in the United States may desire to debug a portion of a distributed application running on a server computer system in Japan; however, the user may have no easy way to get to the server computer system.
Another approach to debugging distributed applications is to create a “remote” debugging session. In some cases, remote debugging sessions are supported by operating systems through the use of “remote shells,” which allow a user to create processes on remote computer systems. In other cases, special programs may “attach” a debugger to a remote module through the use of a debugging agent and a specialized protocol. In either case, a user may attempt to debug a module of a computer system by accessing the computer system remotely. That is, a user physically located at a first computer system creates a session on a second computer system and is able to use the session to cause debugging commands to execute at the second computer system.
It may be that a distributed system is configured in a way that allows a user to easily create remote debugging sessions and access modules in remote computer systems. However, in many cases, and perhaps more frequently, modules of a distributed system are protected by security mechanisms, such as, for example, firewalls that block some types of communication. That is, security mechanisms may be configured so that communications between portions of a distributed application are allowed, but other communications that may be seen as a security risk are blocked. Since debugging operations may interact with modules in ways that could be destructive, security mechanisms frequently interpret requests for remote debugging sessions as potentially harmful communications and thus block those requests.
Even if a debugging console is attached to a computer system or a remote debugging session is established with it, current debugging approaches are primarily directed at debugging code contained in the computer system. That is, these approaches may be used to debug operations that occur at the computer system. However, after a distributed application places data in a message for transport to another module, a debugging console or remote debugging session has no way to determine what happens to the data.
Further, current debugging approaches offer little control over the debug functions that are performed and the amount of data that is returned when debugging a distributed application. Some approaches use standardized debugging operations that offer only limited ability to configure the operations for specific distributed systems. As a result, a debugging session may return too much data, some of which may not even be useful for debugging a particular distributed application. Lack of control over the amount of data that is returned may result in a “probe effect,” where the amount of data returned is so great that performance of a distributed system is impacted.
Therefore, what are desired are systems, methods, and computer program products for more efficiently and accurately debugging distributed applications.