A desire often exists for collecting network information for transactions in a client-server network, for example, to enable measurement of the performance of such transactions. As described in greater detail hereafter, clients and servers generally interact through transactions. That is, a client typically communicates a request to the server for desired information, and the server communicates a response (e.g., which may include the requested information) to the client. Such a client request and its corresponding server response comprise a “transaction.”
One popular client-server network is the Internet. The Internet is a packet-switched network, which means that when information is sent across the Internet from one computer to another, the data is broken into small packets. Devices known as routers forward each packet across the network individually. After all of the packets arrive at the receiving computer, they are recombined into their original, unified form. TCP/IP is a protocol commonly used for communicating the packets of data. In TCP/IP, two protocols do the work of breaking the data into packets, routing the packets across the Internet, and then recombining them on the other end: 1) the Internet Protocol (IP), which routes the data, and 2) the Transmission Control Protocol (TCP), which breaks the data into packets and recombines them on the computer that receives the information. TCP/IP is well known in the existing art, and therefore is not described in further detail herein.
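The packetization described above may be illustrated by the following simplified sketch. It is not an implementation of TCP (which uses byte-offset sequence numbers, acknowledgments, and retransmission); it merely shows data being broken into numbered packets, arriving in an arbitrary order, and being recombined into the original form.

```python
# Simplified illustration of packet-switched delivery: split data into
# sequence-numbered packets, let them arrive in any order, and recombine
# them by sequence number. Real TCP is considerably more involved.
import random

def packetize(data: bytes, size: int = 8):
    """Break data into (sequence_number, chunk) packets of at most `size` bytes."""
    return [(i, data[i * size:(i + 1) * size])
            for i in range((len(data) + size - 1) // size)]

def reassemble(packets):
    """Recombine packets into the original byte stream by sequence number."""
    return b"".join(chunk for _, chunk in sorted(packets))

message = b"Data sent across the Internet is broken into small packets."
packets = packetize(message)
random.shuffle(packets)            # packets may traverse the network independently
assert reassemble(packets) == message
```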
One popular part of the Internet is the World Wide Web (which may be referred to herein simply as the “web”). Computers (or “servers”) that provide information on the web are typically called “websites.” Services offered by service providers' websites are obtained by clients via the web by downloading web pages from such websites to a browser executing on the client. For example, a user may use a computer (e.g., personal computer, laptop computer, workstation, personal digital assistant, cellular telephone, or other processor-based device capable of accessing the Internet) to access the Internet (e.g., via a conventional modem, cable modem, Digital Subscriber Line (DSL) connection, or the like). A browser, such as NETSCAPE NAVIGATOR developed by NETSCAPE, INC. or MICROSOFT INTERNET EXPLORER developed by MICROSOFT CORPORATION, as examples, may be executing on the user's computer to enable a user to input information requesting to access a particular website and to output information (e.g., web pages) received from an accessed website.
In general, a web page is typically composed of a mark-up language file, such as a HyperText Mark-up Language (HTML), Extensible Mark-up Language (XML), Handheld Device Mark-up Language (HDML), or Wireless Mark-up Language (WML) file, and several embedded objects, such as images. A browser retrieves a web page by issuing a series of HyperText Transfer Protocol (HTTP) requests: one for the mark-up file and one for each embedded object. As is well known, HTTP is the underlying protocol used by the World Wide Web. The HTTP requests can be sent through one persistent TCP connection or multiple concurrent connections.
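The retrieval process described above may be sketched as follows: the browser first parses the mark-up file to discover the embedded objects, then issues one HTTP request per object. The example below (a sketch using Python's standard `html.parser` module, with a hypothetical page and object names) shows only the discovery step; the HTTP requests themselves are omitted.

```python
# Sketch of how a browser discovers the embedded objects of a web page:
# parse the mark-up file and collect the URL of each embedded object.
# One HTTP request would then be issued for the page plus one per object.
from html.parser import HTMLParser

class EmbeddedObjectFinder(HTMLParser):
    """Collects URLs of embedded objects (images, scripts, stylesheets)."""
    def __init__(self):
        super().__init__()
        self.objects = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and "src" in attrs:
            self.objects.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet":
            self.objects.append(attrs.get("href"))

# Hypothetical mark-up file with three embedded objects.
page = """<html><head><link rel="stylesheet" href="style.css"></head>
<body><img src="logo.gif"><img src="photo.jpg"></body></html>"""

finder = EmbeddedObjectFinder()
finder.feed(page)
assert finder.objects == ["style.css", "logo.gif", "photo.jpg"]
```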
As mentioned above, a desire often exists for collecting network information for transactions in a client-server network. For example, it may be desirable to collect information for transactions for measuring the performance of such transactions. Web server access logs are widely used in the existing art to collect information for analyzing site performance. However, web server logs do not provide enough information to derive advanced performance metrics, such as the end-to-end response time observed by clients. In general, web server access logs are collected by web server software executing in the user-space.
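The limitation noted above can be seen from the fields a typical access-log entry records. The sketch below parses one entry in the widely used Common Log Format; the recorded fields (client address, timestamp, request line, status, bytes sent) include nothing from which the end-to-end response time observed by the client could be derived.

```python
# Sketch: parse one Common Log Format access-log entry. Note that none of
# the recorded fields captures the client-perceived end-to-end response
# time, which is why server logs alone cannot yield that metric.
import re

LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)

entry = '10.0.0.5 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
fields = LOG_PATTERN.match(entry).groupdict()
assert fields["request"] == "GET /index.html HTTP/1.0"
assert fields["status"] == "200"
assert "response_time" not in fields   # no client-perceived latency is recorded
```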
Additionally, network-level collection tools are available in the existing art for collecting network information. Such network collection tools include the publicly available UNIX tool known as “tcpdump” and the publicly available WINDOWS tool known as “WinDump.” The software tools “tcpdump” and “WinDump” are well-known and are commonly used in the networking arts for capturing network-level information for network “sniffer/analyzer” applications. Typically, such tools are used to capture network-level information for monitoring security on a computer network (e.g., to detect unauthorized intruders, or “hackers”, in a system).
The collection tools of the existing art, such as tcpdump, capture raw network information that is not organized into transactions. That is, the raw network information may include interleaved information for various different transactions. Techniques have been proposed for reconstructing the raw network information captured by a collection tool, such as tcpdump, into their corresponding transactions. For instance, a methodology for rebuilding HTTP transactions from TCP-level traces was proposed by Anja Feldmann in “BLT: Bi-Layer Tracing of HTTP and TCP/IP”, Proceedings of WWW-9, May 2000, the disclosure of which is hereby incorporated herein by reference. Balachander Krishnamurthy and Jennifer Rexford explain this mechanism in more detail and extend this solution to rebuild HTTP transactions for persistent connections in “Web Protocols and Practice: HTTP/1.1, Networking Protocols, Caching, and Traffic Measurement” pp. 511-522, Addison Wesley, 2001, the disclosure of which is also hereby incorporated herein by reference.
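The demultiplexing step described above, separating interleaved raw trace records into per-connection transactions, may be sketched as follows. This is a deliberately simplified illustration with a hypothetical trace, not the methodology of Feldmann or of Krishnamurthy and Rexford, which must additionally handle TCP sequence numbers, retransmissions, and pipelined requests on persistent connections; here each connection is assumed to carry exactly one request followed by one response.

```python
# Highly simplified sketch: demultiplex an interleaved packet trace into
# per-connection request/response transactions. Each connection is assumed
# to carry one request and one response; real trace reconstruction must
# also handle sequence numbers, retransmissions, and persistent connections.
from collections import defaultdict

# Hypothetical trace records (src, dst, payload), listed in arrival order,
# with packets of different connections interleaved.
trace = [
    ("clientA", "server", "GET /a.html"),
    ("clientB", "server", "GET /b.html"),
    ("server", "clientA", "200 OK a"),
    ("server", "clientB", "200 OK b"),
]

def rebuild_transactions(trace):
    """Group packets by connection endpoints and pair each request with its response."""
    conns = defaultdict(dict)
    for src, dst, payload in trace:
        key = frozenset((src, dst))          # same connection in both directions
        role = "request" if dst == "server" else "response"
        conns[key][role] = payload
    return list(conns.values())

transactions = rebuild_transactions(trace)
assert {"request": "GET /a.html", "response": "200 OK a"} in transactions
assert {"request": "GET /b.html", "response": "200 OK b"} in transactions
```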
Accordingly, raw network information may be collected using a network-level collection tool, such as tcpdump, and such raw network information is stored in the user-space (e.g., stored to disk). Thereafter, the raw network information may be analyzed by a program executing in the user-space to organize the raw network information into corresponding network transactions (request-response pairs) using, for example, the methodology proposed by Anja Feldmann.