This invention relates to the field of performance analysis, and in particular to a system and method for evaluating transactions associated with accessing and interacting with a web page.
The performance of a web page will have an impact on the number of visits to the page, as well as consumer satisfaction with the organization associated with the web page. If there is a noticeable delay while downloading a web page, the user may choose to terminate the access and browse a different web site. If the user must access this particular page, the user's frustration with web page access will affect the user's opinion of the organization associated with the web page. In like manner, if providing a web page is inefficient, the provider of the web page may have to invest in additional server equipment to counteract this inefficiency. Thus, there is a need on the part of the service provider and the web page provider to monitor transactions associated with access to their web pages.
Analyzing the performance of transactions on a web page, however, is not a straightforward task. Not only is the performance based on a variety of factors that are not related directly to the web page, such as the bandwidth, congestion, and load on the various links of the network between the web server and the client device, but also, the collection of meaningful information is hindered by the ‘stateless’ nature of web page access using such protocols as HTTP (Hypertext Transfer Protocol) and HTTPS (HTTP Secure).
Most web pages include a variety of elements, such as pictures, text, interactive buttons, and so on. HTML (Hypertext Markup Language) is generally used to construct the web page, as well as to identify or describe each of the individual elements. When the web page is downloaded to the client browser, the browser finds each of the HTML identifiers (generally, a URL (Uniform Resource Locator) that identifies a location on the web at which the corresponding element is located), and initiates a download of the element from that location. These downloaded elements may also contain other HTML identifiers, and the browser iteratively downloads each of these subsequently identified elements.
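As an illustration of the identifier-discovery step described above, the following minimal sketch collects the URLs of embedded elements from a page's HTML. It is not taken from the described system: the names `ElementLinkCollector` and `referenced_elements`, and the particular set of tag and attribute pairs treated as embedded elements, are assumptions chosen for the example, and a real browser applies many more rules.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ElementLinkCollector(HTMLParser):
    """Collects URLs of elements that a page asks the browser to download."""

    # Assumed, simplified set of (tag, attribute) pairs that identify
    # embedded elements; ordinary hyperlinks (<a href>) are excluded
    # because they are followed only when the user selects them.
    EMBEDDED = {("img", "src"), ("script", "src"),
                ("link", "href"), ("iframe", "src")}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in self.EMBEDDED:
                # Resolve relative identifiers against the page's own URL.
                self.urls.append(urljoin(self.base_url, value))


def referenced_elements(html, base_url):
    """Return the absolute URLs of embedded elements, in document order."""
    collector = ElementLinkCollector(base_url)
    collector.feed(html)
    return collector.urls
```

Applying the same function to each downloaded element that is itself HTML would approximate the iterative discovery described above. Which of the returned URLs are actually fetched remains a decision of the browser.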
From a network perspective, the download of the web page and the download of each of its elements are substantially independent events. That is, the state of the overall downloading sequence is contained solely within the client browser, and is not reflected in the individual downloads of these elements. A browser may choose, for example, not to download picture elements, even though the page, or included elements, may identify such picture elements. In like manner, the browser may at one time download a video clip, and at another time, download only an image of the clip, waiting for the user to select the image before performing the actual download of the clip. That is, knowing that a page contains particular elements does not, per se, define actions subsequent to the downloading of the page.
To evaluate the performance of a web page, one needs to determine which downloads (or other actions) are associated with the client browser's processing of that web page. However, because the relationships among those actions (i.e., the states within the browser) are not apparent in the corresponding network communications, monitoring the network communications alone provides very little information regarding the actual performance of the web page. Accordingly, service providers and web page providers are effectively ‘blind’ to any potential problems, and are unable to perform effective analysis and diagnostics.
It would be advantageous to be able to evaluate the performance of a web page based on monitored network traffic. It would also be advantageous to be able to analyze and diagnose the performance of particular activities associated with interactions with elements of the web page.
These advantages, and others, can be realized by a processing method and system that correlates individual network activities corresponding to individual interactions with the web page. This correlation is preferably performed using a combination of heuristics and rules developed to filter network activities into those activities that are likely to have been caused by the particular transaction, and those that are unlikely to be associated with that transaction. The activities that are identified as being associated with the transaction are subsequently organized to identify a time-flow of these activities within the transaction, from which performance statistics can be determined and presented to a user. Because the individual activities within the transaction are identified and time-ordered, an analysis of the effects of each activity on the overall performance can be performed to identify potential problem areas, or to diagnose reported problems.
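The filtering and time-ordering described above can be sketched as follows. The text does not specify the heuristics or rules used, so the two applied here are assumptions made purely for illustration: a time window after the page request, and a chain of HTTP Referer values leading back to the page. The names `Activity`, `correlate`, and `summarize`, and the 10-second default window, are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Activity:
    """One monitored network activity (for example, a single HTTP download)."""
    url: str
    start: float                   # seconds, on the monitor's clock
    end: float
    referer: Optional[str] = None  # Referer header value, if observed


def correlate(activities: List[Activity], page_url: str,
              window: float = 10.0) -> List[Activity]:
    """Return the activities likely caused by loading page_url, time-ordered.

    Assumed heuristics: an activity is kept if it starts within `window`
    seconds of the page request and its Referer is the page itself or an
    activity already attributed to the page.
    """
    page = min((a for a in activities if a.url == page_url),
               key=lambda a: a.start, default=None)
    if page is None:
        return []
    accepted_urls = {page_url}
    selected = [page]
    for a in sorted(activities, key=lambda a: a.start):
        if a is page:
            continue
        in_window = page.start <= a.start <= page.start + window
        if in_window and a.referer in accepted_urls:
            selected.append(a)
            accepted_urls.add(a.url)
    return selected


def summarize(selected: List[Activity]) -> dict:
    """Compute simple performance statistics from the time-ordered activities."""
    if not selected:
        return {}
    t0 = selected[0].start
    return {
        "total": max(a.end for a in selected) - t0,
        # (url, offset from page request, duration) for each activity
        "timeline": [(a.url, a.start - t0, a.end - a.start) for a in selected],
    }
```

The per-activity offsets and durations in the timeline are the kind of data from which the effect of each activity on overall performance could be examined. Activities that fail either assumed rule are treated as unlikely to belong to the transaction and are left out.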
Throughout the drawings, the same reference numerals indicate similar or corresponding features or functions. The drawings are included for illustrative purposes and are not intended to limit the scope of the invention.