1. Field of the Invention
The present invention relates to measuring and analyzing performance characteristics for accessing hyper-link documents, such as web pages, over a communications network. More specifically, the invention relates to those characteristics that are viewed at a client system that give insight to application efficiency and to web page document design and organization.
2. Present State of the Art
Web transactions comprise one or more requests to receive or update content, together with the associated responses to these requests. Ideally, there would be no overhead associated with performing these transactions. In practice, however, many types of overhead are needed to form and issue the request, and to form and issue the corresponding response. Examples include application protocol overheads needed to encapsulate application data (e.g., HTTP and/or FTP protocols), security overheads to encrypt and/or protect application data (e.g., Secure Sockets Layer and/or SOCKS protocols), network services overheads to assist with addressing application data (e.g., Domain Name Services), and network routing overheads to move application requests to, and responses from, the service provider (e.g., TCP/IP and/or UDP/IP protocols). Additionally, the organization of the application data may itself result in overhead, causing excessive request/response transactions to be issued to accomplish a single application transaction (e.g., loading a page in a browser).
With the explosion of traffic on the Internet due to the World Wide Web (WWW) and ever increasing numbers of users, performance issues relating to the access of a particular document or web page have taken on increased importance. One example of such a performance issue is the round trip time of delivery of a particular web page, from the moment of user request to final rendering by an application, such as a browser.
As mentioned previously, excessive response time can be due to many different factors, such as network traffic or delay, delays at the server, loads at the client system due to multiple requests, etc. Many of these problems are beyond the control of the person accessing the web page or the person designing and organizing the web page. There are a number of existing tools that may assist in measuring and resolving network problems.
Other problems are a direct result of the web page design and organization. Since a single requested web page can contain references to one or more components such as HTML documents, images, applets, and other information (any of which may result in generation of multiple requests to retrieve these components), many operations can occur between the requesting application on the client system that receives and renders the web page, and the server that responds to the request and "serves up" the requested page components. For example, a web page's HTML document may reference many images that need to be retrieved in order to fully populate and render the complete web page.
Web page performance problems due to poor web page design or organization are best observed by monitoring at the client, because all of the activities that can affect performance are then taken into account, including initiating and generating the web page request, sending the request(s) for the web page components, serving these requests, delivering the responses, and finally assembling and rendering the web page. Furthermore, a client system perspective is important for improving both client application design and web page design and organization. A web page designer can therefore take performance measurements at the client system for different variations of a page design in order to select the design with the best performance relative to the client application being used. Since many elements of web page access can occur in parallel, these performance measurements also give an indication of the efficiency of the client system in accessing the page and of the application, such as a browser, in scheduling the various tasks necessary to request, retrieve and render the web page.
Tools exist that monitor and generate the "events" associated with web page access and retrieval by a client system, such as opening a socket connection, sending an HTTP Get Request, receiving an HTTP Get Request Reply, etc., and compose these events into context-rich timelines and other "activities," such as delivery time, amount of data delivered, idle time servicing the socket, amount of overhead data, etc. One way of monitoring relevant events associated with web documents is disclosed in a U.S. patent application entitled "Application End-to-End Response Time Measurement and Decomposition" referenced heretofore and incorporated by reference in its entirety.
Because of the concurrent nature of web page access (i.e., activities may be performed in parallel), it is useful to group the activities in logical associations (as another activity). In this manner, for example, all activities relating to a particular GIF image access (i.e., socket connection time, server response time, actual GIF content delivery time, amount of data delivered, overhead data used, etc.) can be grouped, viewed and analyzed together despite the fact that there can be significant overlap with other logical associations, such as other image retrievals or server name resolutions.
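The grouping described above can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the record layout, the component names, the timings, and the helper names `group_by_component` and `span` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical activity records: (component, activity name, start_ms, end_ms).
# The values are invented for illustration; note that the two image
# retrievals overlap in time.
activities = [
    ("logo.gif", "socket_connect", 0, 40),
    ("logo.gif", "server_response", 40, 120),
    ("logo.gif", "content_delivery", 120, 200),
    ("photo.gif", "socket_connect", 10, 60),
    ("photo.gif", "content_delivery", 60, 260),
]


def group_by_component(records):
    """Collect activities into one logical association per component."""
    groups = defaultdict(list)
    for component, name, start, end in records:
        groups[component].append((name, start, end))
    return dict(groups)


def span(group):
    """Return (earliest start, latest end) across a group's activities."""
    return (min(start for _, start, _ in group),
            max(end for _, _, end in group))


groups = group_by_component(activities)
for component, group in groups.items():
    print(component, span(group))
```

Each group can then be viewed and analyzed on its own even though its time span overlaps that of the other group.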
While relatively simple measurements are known, such as the amount of data transmitted or "rates" such as the amount of data per unit time, there exists a need for more sophisticated benchmarking measurements in order to evaluate web application performance, web page design, etc. To be maximally useful, the end-product metric must be easy to assess or understand regardless of how complex the processing taken to arrive at the metric or the intricacies and relationships represented by the metric. Such performance metrics are extremely useful in that they can allow easy validation of web page design based on historical data, and they can provide objective means to compare and contrast web application performance.
One aspect of the present invention is to provide easy to use performance metrics that relate two or more activities associated with web page component access and retrieval or that relate two or more web pages or web transactions. Another aspect of the present invention provides a metric that represents the efficiency of application data transfer vis a vis the protocol overhead of setting up and making the transfer.
Yet another aspect of the present invention provides a metric that represents the efficiency of the application in concurrently processing the different items making up a web page or other web transaction.
Further aspects of the present invention provide metrics that represent how heavily weighted a particular web page may be with images, the cost to negotiate a secure connection, and the opportunities that may exist for improved processing of a web page or other web transaction by an application, such as a browser.
Additional objects and advantages of the present invention will be apparent to those skilled in the art from the description that follows, or may be learned by practicing the invention. The objects and advantages of the present invention may be obtained by the ways shown and as particularly pointed out in the appended claims.
To achieve the foregoing objects, and in accordance with the invention as embodied and broadly described herein, a method, computer program product, and system for deriving web transaction performance metrics are provided.
The present invention comprises the method of relating characteristics gleaned by monitoring application transaction flows (and the decomposition thereof) to produce metrics useful to characterize the efficiency and performance of the application. These metrics can assist application designers and developers in reorganizing their application content, programs, and transports to provide improved service to their consumer.
The present invention takes advantage of existing technologies capable of monitoring web applications that can distinguish between the application payload (e.g., the desired) and the associated overheads required to request and retrieve this payload. This is done by monitoring actions happening on the system and generating events to represent these system states (e.g., the start and end of a web page transaction) and composing activities that have relevant meaning based on relating these events (e.g., duration of the transaction, amount of data transmitted).
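As a rough sketch of how events might be composed into activities, the example below derives a transaction's duration and the amount of data transmitted from start, data, and end events. The `Event` type, its field names, the event kinds, and `compose_activities` are hypothetical and chosen only for illustration; they are not the event format of any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical monitored event (all field names are assumptions)."""
    kind: str          # "start", "data", or "end"
    transaction: str   # identifier of the web transaction
    time_ms: int       # timestamp in milliseconds
    nbytes: int = 0    # bytes carried, for "data" events


def compose_activities(events):
    """Relate events per transaction into duration and byte-count activities."""
    raw = {}
    for ev in events:
        act = raw.setdefault(ev.transaction,
                             {"start": None, "end": None, "bytes": 0})
        if ev.kind == "start":
            act["start"] = ev.time_ms
        elif ev.kind == "end":
            act["end"] = ev.time_ms
        elif ev.kind == "data":
            act["bytes"] += ev.nbytes
    # Only transactions with both a start and an end yield a duration.
    return {
        name: {"duration_ms": act["end"] - act["start"], "bytes": act["bytes"]}
        for name, act in raw.items()
        if act["start"] is not None and act["end"] is not None
    }


events = [
    Event("start", "page", 100),
    Event("data", "page", 150, 2048),
    Event("data", "page", 180, 1024),
    Event("end", "page", 400),
]
print(compose_activities(events))
```

Transactions that have not yet produced both a start and an end event are omitted from the result rather than reported with a partial duration.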
The invention then provides a series of relationships of these various overheads to the payload, resulting in objective, quantifiable metrics able to be used for comparison and measurement purposes. This is done by selecting important activities to compose and then devising relationships between these activities so that a numeric metric may be derived.
For example, the amount of overhead data, such as protocol data for the SOCKS or SSL protocols, or header data for an HTTP Get Request and/or HTTP Reply, or transmission and routing headers for TCP/IP packets, could be monitored as one or more activities representing overhead. The application data, such as the URL, a cookie or other text sent to the server, and the returned web page content (e.g., HTML or GIF data) itself could be accumulated in activities representing "payload." With the information in these activities, one performance metric could be derived to determine the ratio of payload to overhead, thereby helping designers judge how efficiently application data is transferred relative to the protocol overhead of making the transfer. Many different kinds of performance metrics are possible using this approach.
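The payload-to-overhead ratio mentioned above could be computed along the following lines. This is a sketch under stated assumptions: the category names and byte counts are invented for illustration, and `payload_to_overhead_ratio` is a hypothetical helper rather than a function defined by the invention.

```python
def payload_to_overhead_ratio(payload, overhead):
    """Return total payload bytes divided by total overhead bytes.

    Both arguments map an activity name to a byte count.
    """
    payload_total = sum(payload.values())
    overhead_total = sum(overhead.values())
    if overhead_total <= 0:
        raise ValueError("overhead byte count must be positive")
    return payload_total / overhead_total


# Illustrative byte counts only (assumed, not measured).
payload = {"url_and_cookie": 300, "html": 12000, "gif": 8000}
overhead = {"http_headers": 900, "tcp_ip_headers": 2400, "ssl": 760}

print(payload_to_overhead_ratio(payload, overhead))  # 20300 / 4060 = 5.0
```

A larger value indicates that more application data is delivered per byte of overhead, so two page designs can be compared by a single number.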
By producing these metrics, designers and developers of web pages can judge the effects of changes to their application relative to efficiency and performance. Different applications that issue requests and render the responses (e.g., browsers, handheld personal communications devices, Internet-enabled cellular phone equipment, etc.), as well as applications that service these requests (e.g., web servers), can also be compared and contrasted using these metrics. Furthermore, these metrics may serve as inputs to planning models used to project capacity, throughput, response time, and availability of the various web applications.