The present invention relates to client-server network computing, and, in particular, to methods, systems, and machine readable programming for interposing front end servers between servers and clients in such network computing.
As the use of computer networks expands, client-server computing has become increasingly important. In client-server computing, client computers access, over a computer network, information and computational resources available on a server computer. The server computer runs one or more server software programs which provide clients access to centralized resources. These server programs are typically pre-written object code programs which are purchased or acquired from a third party. Thus, they usually cannot be directly modified by their end users. Programs typically called front end servers or middleware exist for providing enhancements to these servers, but they require the administrator of a particular computer to reconfigure the server's parameters and to support the resulting increase in system complexity.
The World Wide Web (Web) on the Internet is an extremely popular large scale networked system consisting largely of Web client programs on client computers, and Web server programs on server computers. In addition to server programs, most Web server computers also contain application server programs, such as CGI programs, which are called by server programs to handle certain types of client requests; and stored data, including Web pages and graphic files. It is possible for one computer to function both as a client and a server computer.
In the Web, client programs are commonly called browsers, and the two most popular are Netscape Navigator and Microsoft Internet Explorer. The server programs are called Web servers (as opposed to the many other types of server programs on the Internet, such as FTP servers, mail servers, etc.), and they can host one or more Web sites on each computer. Some of the more popular Web servers are the free Apache server and commercial ones from Netscape and Microsoft.
As with any heavily used program, the efficiency of Web servers is a major concern of the managers and administrators of Web sites. The reasons for concern are several. Web users have come to expect a certain level of dynamic response from the Web. Users who perceive a site to be slow often will not visit it again. Sometimes a very heavily loaded Web server system can almost stop responding as it thrashes about doing I/O and doing very little useful work. When a server is thrashing, requests are accepted and serviced at a lower rate, further degrading performance in a vicious cycle. Extreme load on a server can result in stability problems, crashes, and increased cost of maintaining the server.
There are many causes for inefficiency of Web servers. The following are some of the major ones and some of the current solutions for them.
Each request to a Web server from a browser using the HTTP/1.0 protocol, currently in common use, requires a new TCP/IP connection. This slows down the Web server, because it requires the server to accept and manage more connections. It also slows down the Internet itself, because it requires more information to be transported to establish the new connections for each separate client request. This problem is made worse by the fact that each separate object displayed in a Web page, such as each separate graphic image, is usually obtained by a separate client request, meaning that accessing one Web page commonly requires many separate connections.
The new HTTP/1.1 protocol helps solve this problem by allowing multiple requests and responses for a given client to be carried over a single persistent client connection. Some Web servers now being sold support this new protocol, but not all do. Furthermore, a large number of servers currently in use do not support this new protocol, and will need to be upgraded to do so. These upgrades can be costly in terms of purchase cost, administration expense, and down time. Such costs could slow the adoption of the HTTP/1.1 protocol, or of any new, improved protocol.
Another factor which slows server performance is the lack of caching. Many servers in current use, and even some still being sold, read each Web file requested by a client into memory each separate time it is requested. A caching Web server would normally keep a copy of many requested Web data objects in memory. If a client request is for one of the data objects stored in its memory, the server avoids having to read the requested data object from disk, and it can send a copy of the object directly from memory to the requesting client, saving considerable time.
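The caching behavior described above can be illustrated with a minimal Python sketch. It is not taken from any particular Web server; the class name FileCache, its disk_reads counter, and the demo function are hypothetical names introduced only for illustration, and the sketch ignores eviction and file changes on disk.

```python
import os
import tempfile


class FileCache:
    """In-memory cache of file contents keyed by path (hypothetical helper)."""

    def __init__(self):
        self._objects = {}
        self.disk_reads = 0  # counts how often the disk was actually touched

    def get(self, path):
        data = self._objects.get(path)
        if data is None:
            # Cache miss: read the object from disk once and remember it.
            with open(path, "rb") as f:
                data = f.read()
            self.disk_reads += 1
            self._objects[path] = data
        return data


def demo_cache():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "index.html")
        with open(path, "wb") as f:
            f.write(b"<html>hello</html>")
        cache = FileCache()
        first = cache.get(path)   # read from disk
        second = cache.get(path)  # served from memory
        return first, second, cache.disk_reads
```

In this sketch the second request for the same path is answered from memory, so the disk is read only once.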
There is a wide variety of caching Web servers on the market, some free and some commercial. A few are integrated with the Web server itself. Most are separate servers which act as a caching front end interposed between clients and the server they work with, which we will call a back end server. These front end servers have to forward non-cached requests to the back end server and relay I/O between clients and the back end server.
A difficulty with prior caching front ends is that they normally require the back end server to be reconfigured to communicate with them, instead of directly with clients. Such back end reconfiguration can also require modification of the Web pages and CGI programs used by the back end, because the caching front end has to use the TCP port numbers previously used by the back end, requiring that the back end use new TCP port numbers. Such reconfiguration of the back end may also require routing information and other network configurations to be changed.
CGI stands for Common Gateway Interface. It enables a Web server to call programs, often referred to as CGI scripts, to handle requests by a client which involve more than the mere sending of previously recorded data objects. A Web server typically handles a CGI request by forking off a new process of the desired CGI program and piping data to and from that new process. It is often desirable to maintain "session affinity" with CGI scripts, that is, to cause each of a succession of requests from a given client to be handled by the same forked CGI process. This is important because CGI scripts often handle the processing of individual transactions over the Web which require multiple related client requests, such as the filling out of multi-page registration forms or the placing of multiple articles into an electronic shopping cart and then purchasing them with a credit card. Many prior art servers, particularly those using the HTTP/1.0 protocol, maintain session affinity by use of cookies, i.e., data recorded in a client which the client will place in the headers of subsequent requests to the server to uniquely identify the client as one to be associated with a given CGI process. Unfortunately, using cookies to maintain session affinity requires the extra overhead of writing cookie information to clients, keeping track of which client has been sent which cookie, parsing the cookie information from a client request, and looking that information up in the cookie/CGI-session database to determine to which CGI session a given request should be sent.
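The cookie bookkeeping just described (issuing a cookie, remembering which CGI process it belongs to, parsing it from a later request, and looking it up) can be sketched roughly as follows. The class CookieAffinityTable, the cookie name "session", and the use of a process id as the session handle are assumptions made for illustration, not details taken from any prior art server.

```python
import uuid
from http.cookies import SimpleCookie


class CookieAffinityTable:
    """Maps cookie values to CGI process ids (hypothetical illustration)."""

    def __init__(self):
        self._sessions = {}  # cookie value -> CGI process id

    def issue(self, cgi_pid):
        """Create a cookie for a new session and record which process owns it."""
        token = uuid.uuid4().hex
        self._sessions[token] = cgi_pid
        return f"session={token}"  # value for a Set-Cookie header

    def lookup(self, cookie_header):
        """Parse a Cookie header and return the owning process id, or None."""
        cookie = SimpleCookie()
        cookie.load(cookie_header)
        morsel = cookie.get("session")
        if morsel is None:
            return None
        return self._sessions.get(morsel.value)
```

Each of the four steps appears here as either a write to the table or a parse-and-lookup on every request, which is the per-request overhead the passage refers to.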
There are many other functional improvements which could be made to many servers, whether they be Web servers or other types of servers. But the ability of those who operate servers to make such improvements is often hindered by the fact, referred to above, that making changes to a server computer's configuration can be costly. New server programs cost money. Perhaps even more daunting is the fact that changing server programs can often require that many time-consuming changes be made to Web pages and CGI scripts. The prior art has used front end servers to increase the functionality of back end servers, but usually at a considerable cost in terms of the reconfiguration required to use the added functionality such front end servers provide.
This invention relates to methods, computer systems, and programming recorded on machine readable memory which cause a front end server providing additional capability to be interposed between a back end server and client computers. It enables the front end server to be interposed by using an interposed dynamically-loaded library linked to the back end server. The library responds to certain network operating system ("OS") calls from the back end. Its major function is to communicate the information in the intercepted calls to the front end server through interprocess communication links, also known as pipes. The front end server is programmed to respond to such information sent to it by establishing connections with clients; reading requests from, and writing responses to, clients; and relaying information relating to such connections, requests, and responses to and from the back end server through the pipes with the library or back end.
In many embodiments of the invention, the library is programmed to return from each back end server call it intercepts with information in the same format as if the intercepted back end call had been executed directly by the OS, without the intervention of the library or front end server. This enables back end server communications with clients to be filtered or modified by the front end without the back end knowing it. Thus, it enables functionality (such as caching, protocol conversion, provision of session affinity, allocation of communications load between multiple back end server processes, and filtering of client requests by type of service or client address) to be performed without requiring any change to the back end server other than linking it to the interposed library.
In some embodiments of the invention, one of the calls intercepted by the interposed library is an OS call by the back end to accept a connection from a remote client. The library communicates the call to a front end server. The front end accepts a connection from a client and communicates a file descriptor representing the client connection to the library, and the library returns the file descriptor to the back end in standard format as if the connection had been accepted directly by the OS. In some cases, the file descriptor returned connects to a relay pipe which can be used to communicate between the back and front end servers. In some cases, the file descriptor returned will be connected directly to the client. In some cases the front end selects which client requests are appropriate for handling by the back end server, and, in some such cases, the front end delays sending a file descriptor to the library for return to a back end accept call, thus delaying the library's return from that call, until the front end reads or peeks at the request from a client on the connection to determine whether it is appropriate for the back end server to handle the request.
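The hand-off of an accepted connection can be approximated in Python as a rough analogy. The invention interposes a dynamically-loaded library on OS calls made by a compiled back end; the sketch below only imitates the data flow inside one process, for the case in which the returned descriptor connects directly to the client. The function names, the address-as-payload convention, and the single-process demo are assumptions made for illustration. It relies on socket.send_fds and socket.recv_fds, which are available on Unix in Python 3.9 and later.

```python
import socket


def front_end_hand_off(listener, control):
    """Front end side: accept a client, peek at its request, pass the descriptor on."""
    client_conn, client_addr = listener.accept()
    # MSG_PEEK leaves the bytes queued, so the back end can still read the full request.
    peeked = client_conn.recv(1024, socket.MSG_PEEK)
    payload = f"{client_addr[0]}:{client_addr[1]}".encode()
    # The descriptor travels as ancillary data; the client address is the payload.
    socket.send_fds(control, [payload], [client_conn.fileno()])
    client_conn.close()  # the receiving side holds its own duplicate descriptor
    return peeked


def interposed_accept(control):
    """Library side: return (connection, address), the same shape as socket.accept()."""
    payload, fds, _flags, _addr = socket.recv_fds(control, 1024, 1)
    host, port = payload.decode().rsplit(":", 1)
    return socket.socket(fileno=fds[0]), (host, int(port))


def demo_fd_passing():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    control_front, control_lib = socket.socketpair()  # stands in for the pipe
    client = socket.create_connection(listener.getsockname())
    client.sendall(b"GET / HTTP/1.0\r\n\r\n")

    peeked = front_end_hand_off(listener, control_front)
    conn, addr = interposed_accept(control_lib)
    request = conn.recv(1024)  # the "back end" reads as if it had accepted directly
    conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhi")
    conn.close()

    reply = b""
    while chunk := client.recv(1024):
        reply += chunk
    same_peer = addr == client.getsockname()
    for s in (client, listener, control_front, control_lib):
        s.close()
    return peeked, request, reply, same_peer
```

Because the front end only peeks at the request, the code that plays the role of the back end still sees the complete request on the descriptor it receives, and the value it gets back has the same shape that an ordinary accept call would have produced.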
In some embodiments of the invention, the front end caches data objects requested by clients; and if a request is for a data object in its cache, the front end sends the object directly from the cache to the client. If the request is for a cachable data object not in its cache, the front end sends the library a file descriptor connecting to a relay pipe connected to the front end, and writes the client request to that pipe. When the front end receives the requested data object from the back end over the relay pipe in response to the client request, it caches the object and sends it to the client. Preferably, if the requested object is in its cache, the front end neither sends a file descriptor to the library for return to a back end accept call, nor writes the client request to a relay pipe. In some embodiments, if the requested object is not of a type to be cached by the front end, the front end sends the library a socket connected directly to the client, so the back end's reading of the request and writing of its response does not go through the front end.
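The three-way decision described above can be summarized in a short Python sketch. The Route names and the rule that cachability is decided by file suffix are assumptions introduced for illustration; the description does not say how an embodiment decides which object types are cachable.

```python
from enum import Enum


class Route(Enum):
    SERVE_FROM_CACHE = "serve_from_cache"  # answer the client; back end not involved
    RELAY_PIPE = "relay_pipe"              # back end replies via the front end, which caches
    DIRECT_SOCKET = "direct_socket"        # back end talks to the client directly


# Assumption: cachability is decided by suffix; a real embodiment may use other criteria.
CACHEABLE_SUFFIXES = (".html", ".gif", ".jpg")


def choose_route(path, cache):
    """Pick how the front end handles a request for `path` given its current cache."""
    if path in cache:
        return Route.SERVE_FROM_CACHE
    if path.endswith(CACHEABLE_SUFFIXES):
        return Route.RELAY_PIPE
    return Route.DIRECT_SOCKET
```

For example, with a cache holding only "/index.html", a request for "/logo.gif" would take the relay pipe route so that the response can be cached on its way back, while a request for "/cgi-bin/order" would be handed to the back end on a direct client socket.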
In some embodiments of the invention, the front end server communicates with the back end server using a first protocol, communicates with the client using a second protocol, and performs conversion between protocols when relaying communications between the back end and clients. In some embodiments of the invention, the protocol used between the front and back ends and that used between the front end and clients are totally different. In other embodiments, the two protocols can be different versions of the same protocol, such as, for example, an older version of a protocol and a newer version of the same protocol. In some embodiments, the front end can communicate with clients using a protocol, such as HTTP/1.1, which responds to each of a plurality of client requests using a single sustained socket, and communicate with a back end server according to a protocol, such as HTTP/1.0, which requires that a separate connection be created, read from, written to, and closed for each separate client request, even if they come from the same client within seconds or minutes of each other. In some such embodiments, for each client request the front end receives over a single sustained client connection, it creates a new relay pipe, communicates the file descriptor of that relay pipe to the library for return to the back end, writes the client request to the relay pipe, reads the back end's response to the request from the relay pipe, closes the relay pipe, and writes the response to the client over the sustained connection.
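The per-request relay sequence can be sketched in Python as follows. This is a simplified illustration under stated assumptions: requests have no bodies, each ends at the blank line after its headers, and the class RecordingRelay is a hypothetical stand-in for one relay pipe together with the back end behind it. A real converter would also have to handle headers such as Host and Connection and the framing of responses.

```python
def split_requests(buffer):
    """Split body-less requests on the blank line that ends each header block."""
    parts = buffer.split(b"\r\n\r\n")
    complete, remainder = parts[:-1], parts[-1]
    return [p + b"\r\n\r\n" for p in complete], remainder


def downgrade(request):
    """Rewrite the request line so the back end sees an HTTP/1.0 request."""
    head, rest = request.split(b"\r\n", 1)
    method, target, _version = head.split(b" ")
    return b" ".join([method, target, b"HTTP/1.0"]) + b"\r\n" + rest


def relay_all(buffer, open_relay):
    """For each request on the sustained connection: open, write, read, close."""
    requests, _incomplete = split_requests(buffer)
    responses = []
    for request in requests:
        relay = open_relay()  # a fresh relay per client request
        try:
            responses.append(relay.exchange(downgrade(request)))
        finally:
            relay.close()
    return responses  # written back to the client over the one sustained connection


class RecordingRelay:
    """Stand-in for one relay pipe plus the back end behind it (hypothetical)."""

    opened = 0
    closed = 0

    def __init__(self):
        type(self).opened += 1

    def exchange(self, request):
        # Echo the request line so the caller can see what the back end received.
        return b"HTTP/1.0 200 OK\r\n\r\n" + request.split(b"\r\n", 1)[0]

    def close(self):
        type(self).closed += 1
```

Calling relay_all with two requests in one buffer and RecordingRelay as the factory opens and closes two relays, one per request, while the client side would see both responses on a single connection.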
In some embodiments of the invention, the front end responds to successive requests from a given client for service from a given application server, such as a CGI server, by calling the same process of that application server directly, rather than sending the request to the back end.
In some cases, the front end filters communications between clients and the back end server. For example, in some embodiments it filters communications from clients based on information about the client. In some embodiments, it filters requests from clients based on the type of request.
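Filtering of this kind might look like the following Python sketch. The blocked network, the allowed request methods, and the function name admit are all hypothetical values chosen for illustration; the description does not specify which client information or request types an embodiment filters on.

```python
import ipaddress

# Assumptions for illustration only: one blocked network and a read-only method list.
BLOCKED_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]
ALLOWED_METHODS = {"GET", "HEAD"}


def admit(client_ip, request_line):
    """Return True if the request should be passed on to the back end."""
    address = ipaddress.ip_address(client_ip)
    # Filter on information about the client: here, its network address.
    if any(address in network for network in BLOCKED_NETWORKS):
        return False
    # Filter on the type of request: here, the HTTP method.
    method = request_line.split(" ", 1)[0]
    return method in ALLOWED_METHODS
```

With these example settings, a GET from 192.0.2.7 is admitted, while any request from 10.1.2.3 and any POST are rejected before reaching the back end.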
In some embodiments of the invention, the front end updates recorded information about communications with clients, such as the number of requests from various portions of the network or the number of different types of requests. In some embodiments, the front end allocates load between multiple back end server processes, either of the same or of a different program.
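The record keeping and load allocation mentioned above can be sketched together in Python. The round-robin policy, the grouping of clients by the first two octets of an IPv4 address, and the class name Dispatcher are assumptions made for illustration; the description does not name a particular allocation policy or a particular way of dividing the network into portions.

```python
from collections import Counter
from itertools import cycle


class Dispatcher:
    """Counts requests and rotates them across back end processes (hypothetical)."""

    def __init__(self, backends):
        self._next_backend = cycle(backends)  # assumption: simple round-robin
        self.by_method = Counter()            # number of requests of each type
        self.by_network = Counter()           # number of requests per network portion

    def dispatch(self, client_ip, method):
        self.by_method[method] += 1
        # Assumption: the first two octets identify a "portion of the network".
        self.by_network[".".join(client_ip.split(".")[:2])] += 1
        return next(self._next_backend)
```

Each call to dispatch updates both counters and returns the back end process that should receive the request.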