Information Technology (IT) is responsible for delivering application services using an increasingly complex multi-tier production environment with a heterogeneous application mix. IT operations are struggling to meet required service levels in performance and availability, while being pressed to increase efficiency and resource utilization. Consolidation of IT resources, together with business concerns, exacerbates this effect, stretching the capability of IT operations to meet ever-changing demands for computing resources. Traditional approaches and tools for performance and availability management are variations of the never-ending “monitor-tune-fix” cycle, which involves identifying that a problem exists (i.e., monitoring), increasing overall throughput to overcome the problem (i.e., tuning), and performing root-cause analysis to uncover the precise cause of each specific instance of a problem (i.e., fixing). Such approaches are unable to cope with the complexity and variability of the rapidly changing IT environment.
Reference is now made to FIG. 1, which is a schematic illustration of a multi-tier computing environment, generally referenced 50, which is known in the art. Computing environment 50 includes a first client 62 running a first application, a second client 64 running a second application, a first tier 52, a second tier 54, a third tier 56, a fourth tier 58, and a fifth tier 60. The first tier 52 is a web server. The second tier 54 is an application server, application server A. The third tier 56 is another application server, application server B. The fourth tier 58 is a further application server, application server C. The fifth tier 60 is a database. First tier 52 is coupled with first client 62, with second client 64, and with second tier 54. Second tier 54 is further coupled with third tier 56 and with fourth tier 58. Third tier 56 is further coupled with fourth tier 58 and with fifth tier 60. Fourth tier 58 is further coupled with fifth tier 60.
A “tier” represents a certain type of processing that is part of the overall delivery of an IT service (e.g., presentation level processing on a web server tier or data processing on a database tier). Each tier typically runs on a different host machine.
The first application initiates a user request R1 and sends user request R1 to first tier 52. User request R1 is part of an overall transaction initiated by the user. User request R1 may be, for example, a web based query to retrieve information from a certain application. User request R1 may require the services of different tiers in computing environment 50 and may generate additional requests in order to obtain these services. The tier that receives a request either replies to the tier that sent the request, or sends a new request to a different tier. Eventually a reply is returned in response to the original user request R1. A given tier can only request a service from another tier in computing environment 50 if the two tiers are directly coupled with each other.
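By way of a non-limiting illustration, the coupling rule described above may be sketched as follows. The sketch records the tier couplings of FIG. 1 by reference numeral and checks whether one tier may request a service from another. The names COUPLINGS, directly_coupled and can_request are hypothetical, and treating each coupling as symmetric is an assumption of the sketch rather than a statement of the source. The couplings of clients 62 and 64 with first tier 52 are omitted, since the rule concerns tiers only.

```python
# Tier couplings of FIG. 1, keyed by reference numeral:
# 52 = web server, 54/56/58 = application servers A/B/C, 60 = database.
COUPLINGS = {
    (52, 54),
    (54, 56),
    (54, 58),
    (56, 58),
    (56, 60),
    (58, 60),
}


def directly_coupled(a, b):
    """Return True if tiers a and b are directly coupled.

    Coupling is treated as symmetric (an assumption of this sketch).
    """
    return (a, b) in COUPLINGS or (b, a) in COUPLINGS


def can_request(sender, receiver):
    """A tier may request a service only from a directly coupled tier."""
    return directly_coupled(sender, receiver)
```

Under these assumptions, can_request(54, 58) holds, whereas can_request(52, 60) does not, because the web server is not directly coupled with the database.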
Overall management of distributed computing environment 50 requires knowledge of how each tier handles its workload. For example, given a shortage of resources on one tier, a system administrator may scale this tier by creating clones of the tier, both vertically (i.e., within the same host machine) and horizontally (i.e., across multiple host machines). For example, in computing environment 50, the system administrator may add an additional application server A2 (not shown) to second tier 54 application server A, wherein application server A2 is a clone of application server A. By the same token, if an overabundance of resources exists on a tier, the system administrator may transfer free resources to another tier which has a shortage of resources. The system administrator may further configure a certain tier in order to improve the overall performance or indicate modifications to optimize the application running on the tier. This is an example of tier specific application monitoring for performance management. It is noted that a request might reach only certain tiers in computing environment 50. Furthermore, the same request might reach certain tiers using multiple paths. For example, in computing environment 50, a request may reach fifth tier 60 database via either third tier 56 application server B, or via fourth tier 58 application server C. As the request paths are not consistent across the entire environment, solving the resource shortage on one tier does not necessarily guarantee the performance of the overall application, which may span multiple tiers. A processing bottleneck in any tier will delay all application functions that depend on that tier.
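The existence of multiple request paths may be illustrated with a minimal sketch that enumerates the routes from first tier 52 to fifth tier 60 over the couplings of FIG. 1. The names NEIGHBORS and simple_paths are hypothetical, and the sketch assumes that a request may traverse any coupling in either direction and visits each tier at most once. The source itself names only the routes via application server B and via application server C.

```python
# Adjacency list for the tiers of FIG. 1, keyed by reference numeral.
NEIGHBORS = {
    52: [54],
    54: [52, 56, 58],
    56: [54, 58, 60],
    58: [54, 56, 60],
    60: [56, 58],
}


def simple_paths(start, goal, path=None):
    """Enumerate every path from start to goal that visits no tier twice."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for nxt in NEIGHBORS[start]:
        if nxt not in path:  # do not revisit a tier already on this path
            found.extend(simple_paths(nxt, goal, path))
    return found


paths = simple_paths(52, 60)
for p in paths:
    print(" -> ".join(str(tier) for tier in p))
```

Under these assumptions the sketch finds four routes: the two direct routes through third tier 56 or fourth tier 58, and two further routes that pass through both, since third tier 56 and fourth tier 58 are coupled with each other. Relieving a resource shortage on only one of these tiers therefore leaves the remaining routes unaffected.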
First tier 52 receives user request R1. First tier 52 allocates processing enclave X1 to process user request R1. While processing user request R1, the application logic executing in processing enclave X1 determines it cannot complete processing user request R1 without additional information or operations to be provided by second tier 54. First tier 52 then sends a subsequent request R2 to second tier 54, requesting the additional information or operations. Second tier 54 allocates processing enclave X2 to process request R2. The application logic executing in processing enclave X2 determines that request R2 requires further information or operations to be provided by fourth tier 58. Second tier 54 then sends a subsequent request R3 to fourth tier 58. Fourth tier 58 allocates processing enclave X4 to process request R3.
Processing enclave X4 completes execution. Fourth tier 58 returns a reply R3′ to second tier 54, in response to earlier request R3 of second tier 54. Processing enclave X2 receives reply R3′ and resumes processing. Once processing enclave X2 has completed execution, second tier 54 returns a reply R2′ to first tier 52, in response to earlier request R2 of first tier 52. Processing enclave X1 receives reply R2′ and resumes processing. Once processing enclave X1 has completed execution, first tier 52 returns a reply R1′ to user request R1, whose service has now been completed.
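The nested request and reply sequence described above may be sketched as a chain of calls, in which each tier allocates a processing enclave, optionally delegates to the next tier, and resumes once the reply arrives. The function handle and its parameters are hypothetical. Tiers are identified here by ordinal (1, 2, 4) so that the enclave names match X1, X2 and X4, and a reply is written with an ASCII apostrophe (R1') in place of the prime.

```python
def handle(tier, request, downstream, trace):
    """Process a request in an enclave on a tier, delegating if needed.

    downstream is a list of (tier, request) pairs still to be visited.
    Every step is appended to trace, and the reply name is returned.
    """
    enclave = f"X{tier}"
    trace.append(f"tier {tier}: allocate {enclave} for {request}")
    if downstream:
        next_tier, next_request = downstream[0]
        trace.append(f"tier {tier}: send {next_request} to tier {next_tier}")
        # The enclave waits here until the downstream tier replies.
        reply = handle(next_tier, next_request, downstream[1:], trace)
        trace.append(f"tier {tier}: {enclave} receives {reply} and resumes")
    trace.append(f"tier {tier}: {enclave} completes, return {request}'")
    return f"{request}'"


trace = []
final_reply = handle(1, "R1", [(2, "R2"), (4, "R3")], trace)
for step in trace:
    print(step)
```

The printed trace follows the order of the description: the three enclaves are allocated from the first tier inward, and the replies R3', R2' and R1' are returned in the reverse order.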
In computing environment 50, each of the different tiers is isolated from the tiers which are not directly coupled therewith. For example, request R3 from second tier 54 to fourth tier 58, directly coupled therewith, does not necessarily include information relating to a former request R2, which was received in second tier 54 from first tier 52, nor does request R3 include information relating to user request R1. A given tier has no way of obtaining certain information related to the request being processed at that tier, such as which user initiated the transaction, which requests preceded the request which is being processed at the given tier, or characteristics of requests which preceded that request. For example, second tier 54 cannot identify characteristics of request R2, such as whether the request was preceded by user request R1 sent to first tier 52, or that the transaction originated at user request R1 from the first application running on first client 62. As a result, if a priority level is assigned to a processing enclave processing a request within a certain tier, that priority level is assigned taking into account only the minimal information which is available on the tier. This information includes the request characteristics (e.g., the tier login credentials used by the request) and perhaps information about the processing enclave processing that request (e.g., the database session identification). Requests are generally processed on an equal priority basis (e.g., first-come, first-served), though mechanisms operating to differentiate priority levels are available locally on a given tier. Performance management must be done on an individual tier basis, as the other tiers in computing environment 50 cannot be accounted for when dealing with a specific tier. Typically, a system administrator who is responsible for managing a multi-tier computing environment such as computing environment 50 attempts to improve performance by adjusting the resource allocation for a given tier.
U.S. Pat. No. 5,958,010 to Agarwal et al. entitled “Systems and methods for monitoring distributed applications including an interface running in an operating system kernel”, is directed to systems and methods for monitoring enterprise wide operation of a distributed computing system to develop business transaction level management data for system performance, usage trends, security auditing, capacity planning, and exceptions. A system having a distributed computing architecture includes multiple workstations, servers, and network devices. Each workstation is representative of a computer system coupled to a network. Each workstation is capable of requesting service from any of the servers. Each workstation has a communication stack for exchanging data with the network. The system further includes a plurality of monitoring agents, and a console module with a database connected therewith. Each monitoring agent has an external event interface that provides event information about various components of an enterprise. Each of the monitoring agents is associated with a respective one of the workstations or servers.
The monitoring agent may physically reside on the associated client or server thereof. The monitoring agent monitors and collects data being exchanged between a client and the network, and between a server and the network. Each monitoring agent can be a software module, a hardware device, or a combination thereof. Each monitoring agent passes information representative of the collected data to the console module. The console module stores this information within the database for analysis by an operator. An application program running on the console module can view the collected data to show system performance of any process or component of the enterprise. A system administrator can develop enterprise level usage statistics and response times, develop charts and reports, and perform other relevant data analysis for determining user-defined statistics relevant to the operation of the enterprise.
U.S. Pat. No. 6,108,700 to Maccabee et al. entitled “Application end-to-end response time measurement and decomposition”, is directed to a method and system for measuring and reporting availability and performance of end-to-end business transactions. The system operates on a client-server application architecture. The system includes three logical components: Event Generation, Transaction Generation, and Report Generation, as well as overall system management via System Administration.
The Event Generation component exists on every computer being measured in the architecture. Each computer has one Agent, a plurality of Sensors and a plurality of Processors. The Sensors interact with platform components on which business applications run, monitor application activities, and detect changes of state. When appropriate, each of the Sensors generates an event that describes the change in state, when and where the event occurred, and any extra data necessary to uniquely identify the event. An event contains a time-stamp and correlation data used later by the system to associate the event with other events into transactions. The Sensors forward the generated events to their respective Agents. The Agents temporarily store the data and may distribute the data to other system components having registered interest in the event. A Processor analyzes the events and further deduces changes in state. The changes in state may be directly related to actions occurring within the business transaction platform components or derived by combining previously generated events from Sensors or other Processors to describe states achieved. The Processors forward the generated events to their respective Agents.
The Transaction Generation component typically exists in one of the computers in the network and includes a Director. The Director receives events from the Agents under control thereof. The events are examined, and correlated and collated into transactions based on transaction generation rules. The System Administrator determines which transactions to generate.
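The correlation of events into transactions may be illustrated by the following minimal sketch. It is not the method of the cited reference. The Event fields, the correlate function and the rule of grouping by a single correlation identifier are assumptions chosen for illustration, standing in for the transaction generation rules, which the description above does not detail.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float     # when the change of state occurred
    host: str            # where the event occurred
    state: str           # the change of state that was detected
    correlation_id: str  # hypothetical correlation data


def correlate(events):
    """Group events by correlation id and order each group by time.

    Assumed rule: events sharing a correlation id form one transaction,
    and its response time spans the first to the last event.
    """
    groups = defaultdict(list)
    for ev in events:
        groups[ev.correlation_id].append(ev)
    transactions = {}
    for cid, evs in groups.items():
        evs.sort(key=lambda e: e.timestamp)
        transactions[cid] = {
            "events": evs,
            "response_time": evs[-1].timestamp - evs[0].timestamp,
        }
    return transactions


events = [
    Event(10.0, "client", "request_sent", "tx-1"),
    Event(10.5, "server", "request_received", "tx-1"),
    Event(11.0, "client", "request_sent", "tx-2"),
    Event(12.0, "client", "reply_received", "tx-1"),
    Event(11.75, "client", "reply_received", "tx-2"),
]
transactions = correlate(events)
```

In this example the interleaved events are separated into two transactions, with response times of 2.0 and 0.75 time units respectively.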
The Report Generation component includes a Manager. The Manager collects the transactions from the Directors. The collected transactions are manipulated to obtain information relating to the availability and performance of business transactions. A report or continuous graphic monitoring can be produced upon a specific or periodic request from a Graphical User Interface (GUI). Report Generation includes definition of the initial selection and processing of transactions, as well as the sorting and aggregation methods used to consolidate the transactions event data into availability and performance information.
US Patent Application No. 2002/0129137 A1 to Mills et al. entitled “Method and system for embedding correlated performance measurements for distributed application performance decomposition”, is directed to techniques for embedding correlated performance measurements in transactions associated with a distributed application. The techniques are used in accordance with application performance decomposition. Data is embedded in a communications protocol used to carry a transaction between application components in a distributed computing network, rather than altering the actual transaction data itself. The embedded data may include a timestamp and duration measurement data. The format of the embedded data combines a well-defined keyword prefix with a variable suffix that identifies the timing source, followed by a colon delimiter and whitespace, and followed by the timestamp and duration information.
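The general shape of such embedded data may be sketched as follows. The keyword prefix "X-Perf-", the encoding of the timestamp and duration as two space-separated decimal numbers, and the function names embed and parse are all assumptions of this sketch. The description above specifies only the ordering of prefix, suffix, colon delimiter, whitespace and measurement data.

```python
PREFIX = "X-Perf-"  # hypothetical keyword prefix, not taken from the source


def embed(source, timestamp, duration):
    """Build one line: prefix + timing source, colon, space, then data."""
    return f"{PREFIX}{source}: {timestamp:.3f} {duration:.3f}"


def parse(line):
    """Return (source, timestamp, duration), or None for unrelated lines."""
    name, sep, value = line.partition(":")
    if not sep or not name.startswith(PREFIX):
        return None
    parts = value.split()
    if len(parts) != 2:
        return None
    source = name[len(PREFIX):]  # variable suffix naming the timing source
    return source, float(parts[0]), float(parts[1])


line = embed("appserver", 1000.5, 0.125)
print(line)         # X-Perf-appserver: 1000.500 0.125
print(parse(line))  # ('appserver', 1000.5, 0.125)
```

Because the measurement travels as a separate protocol line, a component that does not recognize the prefix can ignore it, and the transaction data itself is left unaltered.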
Subsequent processing stages of the distributed application can interpret the communications protocol to glean processing durations of previous stages, in order to make decisions regarding treatment of the transaction. The measurement information is embedded within the same distributed application described by the measurement information, so that completion of the transaction occurs simultaneously or contemporaneously with availability of knowledge of the transaction performance characteristics.
A possible communications protocol is the Hypertext Transfer Protocol (HTTP). A possible distributed computing network is the World Wide Web (WWW). The application components may be a client application running on a client and a server application running on an application server. For example, the client application is a web browser, and the server application runs on a web server. An application transaction is the client application requesting content from the application server and the application server responding. Performance information is generated to measure the round trip response time from the perspective of the client application, as well as to decompose the response time into the time taken by the server application to service the request and generate a reply. In particular, lines are added to the HTTP headers to carry performance measurement data, allowing the client to receive the server measurement duration in the HTTP Reply header.
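The decomposition of the round trip response time may be illustrated with a short sketch, given here under stated assumptions. The header name "X-Server-Duration", its value being a duration in seconds, and the function decompose are hypothetical and are not taken from the cited application. The reply headers are modeled as a plain dictionary rather than obtained from a real HTTP library.

```python
def decompose(sent_at, received_at, reply_headers,
              header="X-Server-Duration"):
    """Split the client-observed round trip into server time and the rest.

    sent_at and received_at are client clock readings in seconds.
    reply_headers maps header names to string values; the duration header
    is assumed to carry the server processing time in seconds.
    """
    round_trip = received_at - sent_at
    server = float(reply_headers[header])
    return {
        "round_trip": round_trip,
        "server": server,
        "outside_server": round_trip - server,
    }


reply_headers = {"Content-Type": "text/html", "X-Server-Duration": "0.5"}
result = decompose(100.0, 100.75, reply_headers)
print(result)
```

In this example the client observes a round trip of 0.75 seconds, of which 0.5 seconds is attributed to the server application and the remaining 0.25 seconds to everything outside it, such as network transfer. Only the client clock is used for the round trip, so the two clocks need not be synchronized.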