A network is typically used for data transport among devices at network nodes distributed over the network. Some networks are considered “local area networks” (LANs) and others “wide area networks” (WANs), although not all networks are so categorized and some might have both LAN and WAN characteristics. Often, a LAN comprises nodes that are all controlled by a single organization and connected over dedicated, relatively reliable and physically short connections. An example might be a network in an office building for one company or division. By contrast, a WAN often comprises nodes over which many different organizations' data flows, and might involve physically long connections. In one example, a LAN might be coupled to a global internetwork of networks referred to as the “Internet” such that traffic from one node on the LAN passes through the Internet to a remote LAN and then to a node on that remote LAN.
Data transport is often organized into “transactions”, wherein a device at one network node initiates a request for data from another device at another network node and the first device receives the data in a response from the other device. By convention, the initiator of a transaction is referred to herein as the “client” and the responder to the request from the client is referred to herein as the “server”. As used herein, “client” generally refers to a computer, computing device, peripheral, electronics, or the like, that makes a request for data or an action, while “server” generally refers to a computer, computing device, peripheral, electronics, or the like, that operates in response to requests for data or action made by one or more clients. Depending upon the context, a computer or other device may function as a client, as a server, or as both.
As explained above, a transaction over a network involves bidirectional communication between two computing entities, where one entity is the client and initiates a transaction by opening a network channel to another entity (the server). Typically, the client sends a request or set of requests via a set of networking protocols over that network channel, and the server processes the request or requests and returns responses. Many protocols are “connection-based”, whereby the two cooperating entities (sometimes known as “hosts”) negotiate a communication session to begin the information exchange. In setting up a communication session, the client and the server might each maintain state information for the session, which may include information about the capabilities of each other. At some level, the session forms what is logically (or physically, in some cases) considered a “connection” between the client and server. Once the connection is established, the client and server can exchange messages using state from the session establishment and other information, wherein a message is a data set comprising a plurality of bits in a sequence, possibly packaged as one or more packets according to an underlying network protocol. Typically, once the client and the server agree that the session is over, each side disposes of the state information for that transaction, other than possibly saving log information.
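The session lifecycle described above can be illustrated with a minimal sketch. The names `Endpoint`, `open_session` and `close_session`, and the representation of capabilities as sets of strings, are assumptions made for illustration only and do not correspond to any particular protocol.

```python
class Endpoint:
    """A host (client or server) that keeps per-session state."""

    def __init__(self, name, capabilities):
        self.name = name
        self.capabilities = set(capabilities)
        self.sessions = {}  # peer name -> state kept for that session


def open_session(client, server):
    """Negotiate a session: each side records what it learned about the other."""
    shared = client.capabilities & server.capabilities
    client.sessions[server.name] = {
        "peer_capabilities": set(server.capabilities),
        "shared": shared,
    }
    server.sessions[client.name] = {
        "peer_capabilities": set(client.capabilities),
        "shared": shared,
    }
    return shared


def close_session(client, server):
    """Both sides agree the session is over and dispose of its state."""
    client.sessions.pop(server.name, None)
    server.sessions.pop(client.name, None)


# Hypothetical capability names, chosen only for the example.
client = Endpoint("client", {"compression", "window-scaling"})
server = Endpoint("server", {"window-scaling", "selective-ack"})
shared = open_session(client, server)
close_session(client, server)
```

In this sketch the state exists only between `open_session` and `close_session`, mirroring the description that each side discards session state once the session ends.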
A client makes requests to a server, which typically delivers a response to each request back to the client. McCanne I and McCanne III describe how a network proxy communicating with one or more peer network proxies can offer valuable forms of transaction acceleration and traffic reduction. In such cases, for example, a client's request can be intercepted by a client-side network proxy and delivered to the server by a server-side network proxy. The request may be transformed or processed by the two proxies so that it (and possibly future requests) is more effectively transported across the intervening network than would be true without the use of the cooperating network proxies.
A message from a client to a server or vice-versa traverses one or more network “paths” connecting the client and server. A basic path would be a physical cable connecting the two hosts. More typically, a path involves a number of physical communication links and a number of intermediate devices (e.g., routers) that are able to transmit a packet along a correct path to the server, and transmit the response packets from the server back to the client. These intermediate devices typically do not modify the contents of a data packet; they simply pass the packet on in a correct direction. However, it is possible that a device that is in the network path between a client and a server could modify a data packet along the way. To avoid violating the semantics of the networking protocols, any such modifications should not alter how the packet is eventually processed by the destination host.
A network proxy is a transport-level or application-level entity that functions as a performance-enhancing intermediary between the client and the server. In this case, a proxy is the terminus for the client connection and initiates another connection to the server on behalf of the client. Alternatively, the proxy connects to one or more other proxies that in turn connect to the server. Each proxy may forward, modify, or otherwise transform the transactions as they flow from the client to the server and vice versa. Examples of proxies include (1) Web proxies that enhance performance through caching or enhance security by controlling access to servers, (2) mail relays that forward mail from a client to another mail server, (3) DNS relays that cache DNS name resolutions, and so forth.
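As a simplified illustration of a proxy acting on behalf of a client, the following sketch models a caching intermediary of the kind mentioned in examples (1) and (3). The class name `CachingProxy` and the use of a plain callable in place of a real server connection are assumptions for illustration; an actual proxy would also terminate and initiate network connections.

```python
class CachingProxy:
    """Answers repeated requests from a cache instead of asking the server again."""

    def __init__(self, upstream):
        # `upstream` is a callable standing in for the connection the
        # proxy opens toward the server (an assumption of this sketch).
        self._upstream = upstream
        self._cache = {}

    def handle(self, request):
        """Return the response for `request`, contacting upstream only on a miss."""
        if request not in self._cache:
            self._cache[request] = self._upstream(request)
        return self._cache[request]


def example_server(request):
    """Hypothetical server that would be reached over a slow path."""
    return "response to " + request


proxy = CachingProxy(example_server)
first = proxy.handle("GET /index")   # forwarded to the server
second = proxy.handle("GET /index")  # served from the proxy's cache
```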
One problem that must be overcome when deploying proxies is that of directing client requests to the proxy instead of to the destination server. One mechanism for accomplishing this is to configure each client host or process with the network address information of the proxy. This requires that the client application have an explicit proxy capability, whereby the client can be configured to direct requests to the proxy instead of to the server. In addition, this type of deployment requires that every client be explicitly configured, which can be an administrative burden on a network administrator.
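By way of illustration, explicit proxy configuration in a single client process might look like the following sketch, which uses the `ProxyHandler` class of Python's standard `urllib.request` module. The proxy address shown is hypothetical, and the helper name `build_explicit_proxy_opener` is an assumption of this example.

```python
import urllib.request

# Hypothetical proxy address; each client would need its own copy of this setting.
PROXY_URL = "http://proxy.example.internal:3128"


def build_explicit_proxy_opener(proxy_url):
    """Return an opener that directs HTTP and HTTPS requests to the given proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_explicit_proxy_opener(PROXY_URL)
# A call such as opener.open("http://server.example/") would now be sent
# to the proxy rather than directly to the destination server.
```

The sketch configures only one process; repeating such a setting on every client host is the administrative burden noted above.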
One way around the problems of explicit proxy configuration is to deploy a “transparent proxy”. The presence of the transparent proxy is not made explicitly known to the client process, so all client requests proceed along the network path towards the server as they would have if there were no transparent proxy. Some benefits of a transparent proxy require that a proxy pair exist in the network path. For example, if a proxy is used to transform data in some way, a second proxy preferably untransforms the data. For actions that require a proxy pair, preferably neither proxy in the pair performs a transformation unless it can be assured of the existence and operation of the other proxy in the pair. Where each proxy must be explicitly configured with indications of the pairs to which it belongs and of the identity of the other members of those pairs, the administrative burden on a network administrator might well make some operations infeasible if they require proxy pairs. Even where a proxy is interposed in a network and gets all of the traffic from a client or server, it still must discover the other member of each proxy pair it needs if it is to perform actions that require proxy pairs.
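The rule that a transformation is applied only when the other member of the pair is known can be sketched as follows. Compression with Python's standard `zlib` module stands in for an arbitrary transformation; the class `PairedProxy`, its method names, and the `from_peer` flag (which assumes the receiving proxy can tell whether data arrived from its peer) are assumptions for illustration.

```python
import zlib


class PairedProxy:
    """A proxy that transforms traffic only when it knows its peer."""

    def __init__(self, name):
        self.name = name
        self.peer = None  # set once the other member of the pair is known

    def pair_with(self, other):
        """Record the pairing on both members."""
        self.peer = other
        other.peer = self

    def outbound(self, payload: bytes) -> bytes:
        """Transform the payload if paired; otherwise pass it through unchanged."""
        if self.peer is None:
            return payload
        return zlib.compress(payload)

    def inbound(self, data: bytes, from_peer: bool) -> bytes:
        """Undo the peer's transformation; leave other traffic untouched."""
        return zlib.decompress(data) if from_peer else data


cp = PairedProxy("CP")
sp = PairedProxy("SP")
cp.pair_with(sp)
wire = cp.outbound(b"example payload " * 50)
restored = sp.inbound(wire, from_peer=True)
```

An unpaired proxy in this sketch forwards data unmodified, so a destination host never receives transformed data that no second proxy is present to restore.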
As used herein, “proxy pairing” is a process of associating two proxies. The two proxies are members of a proxy pair and each member of a proxy pair is aware of the other member of the proxy pair and knows its address (or other identifier). A given proxy can be a member of more than one proxy pair. Where a given proxy is a member of a plurality of proxy pairs, the other members of those proxy pairs can be distinct or can be duplicative, i.e., there might be more than one proxy pair that has the same two members. In some cases, a proxy pair might be generalized to a “proxy grouping” of more than two proxies for purposes equivalent to what a proxy pair might do.
Generally, a proxy pair exists in relation to one or more transactions. Thus, proxy A and proxy B might be paired for some transactions and not others. Often, two proxies are paired for all transactions between pairs of particular clients and particular servers. In most instances, a proxy pair comprises a client-side proxy (“CP”) and a server-side proxy (“SP”) and each member of the proxy pair is aware of which side (client or server) it is on.
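One possible way to represent such pairings is a table keyed by client and server, as in the following sketch. The names `ProxyPair` and `PairingTable`, and the use of plain strings as proxy, client and server identifiers, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyPair:
    """A client-side proxy (CP) and a server-side proxy (SP)."""
    client_side: str
    server_side: str

    def side_of(self, proxy: str) -> str:
        """Report which side of the pair the given proxy is on."""
        if proxy == self.client_side:
            return "client"
        if proxy == self.server_side:
            return "server"
        raise ValueError(f"{proxy} is not a member of this pair")


class PairingTable:
    """Maps (client, server) to the proxy pair handling their transactions."""

    def __init__(self):
        self._pairs = {}

    def associate(self, client: str, server: str, pair: ProxyPair) -> None:
        self._pairs[(client, server)] = pair

    def lookup(self, client: str, server: str):
        """Return the pair for this client and server, or None if unpaired."""
        return self._pairs.get((client, server))


table = PairingTable()
table.associate("client-1", "server-1", ProxyPair("A", "B"))
table.associate("client-2", "server-2", ProxyPair("A", "C"))  # A is in two pairs
```

Here proxy A belongs to two pairs, and traffic between a client and server with no table entry is simply not paired.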
The proxies in a proxy pair can become aware of the pair and the other member (and which side they are on) using techniques described in McCanne IV or other methods. Once the proxies in a proxy pair are aware of the pairing and the other member, the pair can intercept network transactions and optimize them. With the pairing, the optimizations need not conform to the end-to-end network protocol, as each proxy can undo nonconforming operations of the other proxy.
However, where network traffic between client and server can pass through an arbitrary number of potential cooperating proxies, it is useful to be able to pick two specific proxies of those, such as two proxies that match some criteria. One useful choice is the two proxies that are “outermost”, i.e., closest to the client and closest to the server. Another useful choice is the two proxies that are on either side of the worst (slowest performing) connection or network.
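The two selection criteria mentioned above can be sketched as follows, assuming the candidate proxies are already known in path order from client to server and that a cost (for example, measured latency) is available for each link between consecutive proxies. The function names and the cost model are assumptions for illustration; how such information would be gathered is not addressed here.

```python
from typing import Sequence, Tuple


def outermost_pair(path: Sequence[str]) -> Tuple[str, str]:
    """Pick the proxy closest to the client and the proxy closest to the server."""
    if len(path) < 2:
        raise ValueError("at least two proxies are needed to form a pair")
    return path[0], path[-1]


def pair_around_slowest_link(
    path: Sequence[str], link_costs: Sequence[float]
) -> Tuple[str, str]:
    """Pick the two adjacent proxies on either side of the costliest link.

    link_costs[i] is the cost of the link between path[i] and path[i + 1].
    """
    if len(path) < 2 or len(link_costs) != len(path) - 1:
        raise ValueError("need one cost per link between consecutive proxies")
    worst = max(range(len(link_costs)), key=lambda i: link_costs[i])
    return path[worst], path[worst + 1]


# Hypothetical path of four proxies with a slow middle link.
proxies = ["P1", "P2", "P3", "P4"]
costs_ms = [1.0, 80.0, 2.0]
outer = outermost_pair(proxies)
slow = pair_around_slowest_link(proxies, costs_ms)
```

With these example values, the outermost criterion selects P1 and P4, while the slowest-link criterion selects P2 and P3.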
Forms of proxy discovery described in McCanne IV include single instance probing and proxy chain probing. With single instance probing, a single client-side proxy (CP) probes for a single instance of a server-side proxy (SP). With proxy chain probing, a single client-side proxy establishes a proxy chain of multiple middle proxies (MPs) and a final SP.
Improvements for proxy pairing are desirable. For example, where some pairings are better than others, the better pairings should be selected. Also, where marked probe packets are used by proxies to discover each other, some servers can reject those packets, and such rejections should be handled so that client/server connections do not fail. A related multiple-proxy problem arises where traffic does not follow a consistent path through proxies, but instead some traffic passes through one proxy initially and some traffic passes through one or more others. Ly describes this problem as asymmetric routing and provides some solutions for dealing with it.
It is therefore desirable for a system and method to facilitate the discovery and selection of optimal sets of cooperating proxies in the presence of multiple candidate proxies or despite confounding network issues, possibly allowing for connections to be established even when probe packets are rejected, even in the presence of multiple intermediate proxies, and possibly also automatically detecting asymmetric routing conditions and configuring proxies for connection forwarding as necessary.