1. The Field of the Invention
The present invention relates to electronic messaging technology and, more specifically, to mechanisms for integrating messaging diagnostics into a messaging pipeline.
2. Background and Related Art
Computer systems and related technology affect many aspects of society. Indeed, the computer system's ability to process information has transformed the way we live and work. Computer systems now commonly perform a host of tasks (e.g., information management, scheduling, and word processing) that, prior to the advent of the computer system, were typically performed manually. More recently, computer systems have been coupled to one another to form computer networks over which computer systems may transfer data electronically.
Initially, a significant portion of data transfer on computer networks was performed using specific applications (e.g., electronic mail applications) to transfer data files from one computer system to another computer system. For example, a first user at a first networked computer system could electronically mail a word processing document to a second user at a second networked computer system. However, program execution (e.g., running the electronic mail application) and data access (e.g., attaching the word processing document to an electronic mail message) were essentially completely performed at a single computer system (e.g., the first computer system). That is, a computer system would execute programs and access data from storage locations contained within the computer system. Thus, being coupled to a network would not inherently give one networked computer system the ability to access data from another networked computer system. Only after a user actively sent data to a computer system could the computer system access the data.
More recently, however, as the availability of higher-speed networks has increased, many computer networks have shifted towards a distributed architecture. Such networks are frequently referred to as distributed systems. Distributed systems function to “distribute” program execution and data access across the modules of a number of different computer systems coupled to a network.
In a distributed system, modules connected to a common network interoperate and communicate with one another (e.g., by exchanging electronic messages) in a manner that may be transparent to a user. For example, a user of a client computer system may select an application program icon from a user-interface, thereby causing an application program stored at a server computer system to execute. The user-interface may indicate to the user that the application program has executed, but the user may be unaware, and in fact may not care, that the application program was executed at the server computer system. The client computer system and the server computer system may exchange electronic messages in the background to transfer the user's commands, program responses, and data between the client computer system and the server computer system.
Often, a distributed system includes a substantial number of client computer systems and server computer systems. In many cases, computer systems of a distributed system may function both as client computer systems and server computer systems, providing data and resources to some computer systems and receiving data and resources from other computer systems. Each computer system of a distributed system may include a different configuration of hardware and software modules. For example, computer systems may have different types and quantities of processors, different operating systems, different application programs, and different peripherals. Additionally, the communication path between computer systems of a distributed system may include a number of networking components, such as, for example, firewalls, routers, proxies, and gateways, and communication paths can change from time to time.
In some environments, “distributed applications”, such as, for example, Web services applications, are specifically designed for execution in a distributed system (e.g., the Internet). Distributed applications can include hundreds or thousands of modules, and each module can be compiled from thousands or even millions of lines of source code. Further, each module of a distributed application must be designed to appropriately communicate with other modules of the distributed application, as well as other modules in associated distributed systems. For example, interoperation of different modules of a distributed application can require exchanging electronic messages (e.g., Simple Object Access Protocol (“SOAP”) envelopes) according to specified security and policy requirements. Thus, the design and configuration of distributed applications is significantly more complex than for stand-alone applications.
Due at least in part to this complexity, communication between portions of distributed applications (even those that are properly configured) may operate in an undesirable manner from time to time. As such, it is often desirable to perform diagnostic operations (e.g., testing, debugging, profiling, and tracing) on electronic messages exchanged between modules of a distributed application.
For example, one diagnostic technique used on distributed applications is to attach, or “glue on,” a separate third-party diagnostic process to a distributed application module. As electronic messages are exchanged with the module, the third-party diagnostic process records diagnostic data to a log file. In some cases, third-party diagnostic processes are attached to a number of different distributed application modules and each third-party diagnostic process records data to a separate log file. The separate log files are then combined and correlated to give some indication of what may be causing undesirable communication between portions of a distributed application.
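By way of illustration only, the combining and correlating of separate log files might be sketched as follows. The record layout (`LogRecord`), the module names, the message identifiers, and the assumption that timestamps are comparable across computer systems are all hypothetical and are not taken from any particular diagnostic process.

```python
import heapq
from collections import defaultdict
from typing import NamedTuple


class LogRecord(NamedTuple):
    timestamp: float   # seconds; assumes clocks are comparable across modules
    module: str        # module whose diagnostic process wrote the record
    message_id: str    # identifier carried by the electronic message
    event: str         # e.g., "sent" or "received"


def correlate(*logs):
    """Merge per-module logs (each already sorted by time) and group by message."""
    by_message = defaultdict(list)
    for record in heapq.merge(*logs, key=lambda r: r.timestamp):
        by_message[record.message_id].append(record)
    return dict(by_message)


# Two hypothetical log files, one per attached diagnostic process.
order_log = [
    LogRecord(1.0, "order", "m1", "sent"),
    LogRecord(4.0, "order", "m2", "sent"),
]
billing_log = [
    LogRecord(1.5, "billing", "m1", "received"),
]

timeline = correlate(order_log, billing_log)

# Messages seen by only one module hint at where communication broke down.
unanswered = [mid for mid, recs in timeline.items() if len(recs) < 2]
```

In this sketch, message `m2` appears only in the sending module's log, which is the kind of indication that correlation across log files is intended to surface.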
Unfortunately, attaching separate third-party diagnostic processes to distributed application modules can be time consuming and can require that the corresponding distributed application be brought down and that its message traffic be redirected. Since distributed applications have modules at a plurality of different computer systems, properly bringing down a distributed application may require coordination between the administrators of the different computer systems, and redirection of message traffic may involve additional coordination. Bringing down and redirecting a distributed application to attach third-party diagnostic processes can also result in users of the distributed application losing access to information and revenue.
Further, third-party diagnostic processes are often designed based on a one-size-fits-all approach. That is, third-party diagnostic processes may have standardized diagnostic operations with limited ability to configure the diagnostic operations for specific distributed applications. However, the complexity of individual distributed applications makes it difficult, if not impossible, to design third-party diagnostic processes to include all the possible diagnostic operations that could be performed. Thus, while third-party diagnostic processes may be sufficient for performing basic diagnostic operations, third-party diagnostic processes often lack functionality for more complex diagnostic operations (e.g., timing interactions and interdependencies).
Thus, distributed application designers can include specialized diagnostic code within distributed applications to implement more complex diagnostic operations. Specialized code can cause a distributed application to report information from different modules of the distributed application to a centrally located diagnostic module. The diagnostic module is thus better positioned to determine what is causing undesirable behavior. However, the use of specialized diagnostic code has at least one inherent problem: specialized diagnostic code is often self-contained and will not interact with other diagnostic processes. Due to these incompatibilities, specialized diagnostic code must be individually developed for different distributed applications. This is time consuming and may require substantial technical expertise on the part of a programmer.
Further, typical diagnostic techniques offer little control over the type of diagnostic functions that are performed and the amount and type of data that is gathered. For example, some diagnostic processes (e.g., NetMon) add a message redirector between computer systems that are exchanging electronic messages. To implement diagnostic operations for an electronic message, the message redirector receives an electronic message that originated at a sender, accesses at least a portion of the contents (e.g., headers and bodies) of the electronic message, performs a diagnostic operation based on the accessed contents, and forwards the electronic message towards the destination.
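For purposes of illustration, the four redirector steps just described (receive, access, diagnose, forward) might be sketched as follows. The `Message` and `Redirector` types, the header names, and the logged text are hypothetical; the sketch does not represent the interface of NetMon or of any other actual diagnostic process.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    headers: Dict[str, str]
    body: str


@dataclass
class Redirector:
    forward: Callable[[Message], None]   # hands the message on toward its destination
    log: List[str] = field(default_factory=list)

    def handle(self, message: Message) -> None:
        # 1. Receive the message that originated at the sender (the argument).
        # 2. Access a portion of the contents: two headers and the body length.
        sender = message.headers.get("from", "?")
        dest = message.headers.get("to", "?")
        # 3. Perform a diagnostic operation based on the accessed contents.
        self.log.append(f"{sender} -> {dest}: {len(message.body)} chars")
        # 4. Forward the message, unchanged, towards the destination.
        self.forward(message)


# Usage: the "destination" here is simply a list that collects delivered messages.
delivered: List[Message] = []
redirector = Redirector(forward=delivered.append)
redirector.handle(Message({"from": "client", "to": "server"}, "<order/>"))
```

Note that in this arrangement the diagnostic operation is fixed inside the redirector; neither the sender nor the destination controls which contents are accessed or how much is recorded.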
In some environments, diagnostic operations need to be performed on secure data, such as, for example, data that is encrypted and/or digitally signed. Thus, a message redirector may be provided with security information (e.g., keys) for accessing electronic message contents. For example, to implement a diagnostic operation for a malfunctioning sales application, a message redirector may be provided keys for validating a digital signature and decrypting personal and financial data contained in electronic messages. Thus, at least for the time needed to provide the personal and financial data to a diagnostic module, the personal and financial data is available in an insecure (unencrypted) format. Depending on the implemented diagnostic operation, the message redirector may also log portions of the accessed data, making those portions further available.
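The exposure described above might be illustrated, under stated assumptions, as follows. The sketch uses an HMAC as a stand-in for digital signature validation and a toy XOR cipher as a stand-in for encryption; neither is suitable for protecting real data, and the key values and payload are hypothetical.

```python
import hashlib
import hmac


def xor_stream(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher for illustration only; NOT secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def sign(data: bytes, key: bytes) -> bytes:
    """HMAC-SHA256 tag, standing in for a digital signature."""
    return hmac.new(key, data, hashlib.sha256).digest()


# Security information the sender would have to share with the redirector.
ENC_KEY = b"encryption-key"
SIG_KEY = b"signing-key"

# Sender side: encrypt the sensitive payload, then sign the ciphertext.
plaintext = b"card=0000-0000-0000-0000 (example)"
ciphertext = xor_stream(plaintext, ENC_KEY)
signature = sign(ciphertext, SIG_KEY)

# Redirector side: validate, decrypt, and log for diagnostics.
diagnostic_log = []
if hmac.compare_digest(signature, sign(ciphertext, SIG_KEY)):
    exposed = xor_stream(ciphertext, ENC_KEY)  # plaintext now exists at the intermediary
    diagnostic_log.append(exposed)             # and persists for as long as the log does
```

Once the keys have been handed to the intermediary, the protection applied by the sender no longer limits who can read the payload: both the in-memory `exposed` value and the log entry hold the unencrypted data.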
Since many distributed systems, such as, for example, the Internet, are shared by a large number of entities, manipulating sensitive data at an intermediate computer system poses a security risk. For example, a malicious user could attempt to compromise a message redirector and access exposed data. Alternatively, a malicious user could design a program that impersonates a legitimate message redirector. The malicious user could run the program in an attempt to have distributed application modules transfer sensitive data and corresponding security information to the program.
Accordingly, what would be advantageous are mechanisms for securely and efficiently performing diagnostic operations for electronic messages.