1. Field of the Invention
The present invention relates generally to computer software. More particularly, the present invention relates to methods and apparatus for enabling a web server to transport data to an in-kernel HTTP cache.
2. Description of Related Art
FIG. 1 is a block diagram illustrating a conventional web server 104. Through the use of a web browser and the web server 104, a user may access a web page on the Internet. In this manner, multiple Hypertext Transfer Protocol (HTTP) clients (e.g., web browsers) 100, 102 may access files via the single web server 104. Typically, a browser user enters HTTP file requests by either “opening” a Web file (e.g., typing in a Uniform Resource Locator or URL) or clicking on a hypertext link. The browser builds a HTTP request and sends it to the Internet Protocol (IP) address indicated by the URL. When the web browser 100 or 102 sends a HTTP request to the web server 104 identified by the IP address, the web server 104 receives the request and, after any necessary processing, the requested file (i.e., HTTP response data) is returned.
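By way of illustration only, the manner in which a browser may build a HTTP request from a URL can be sketched as follows. The function name build_http_request, the use of HTTP/1.0, and the example URL are assumptions made for this sketch and are not taken from the disclosure.

```python
from typing import Tuple
from urllib.parse import urlsplit


def build_http_request(url: str) -> Tuple[str, bytes]:
    """Return (host, raw request bytes) for a simple GET of `url`.

    Hypothetical helper: a real browser sends many more headers.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # An empty path in the URL refers to the root document.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"  # blank line terminates the request headers
    )
    return host, request.encode("ascii")


host, raw_request = build_http_request("http://www.example.com/index.html")
```

In this sketch the host name would still have to be resolved to an IP address before the request bytes are sent to the web server.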
Within the web server 104, HTTP requests that are received are processed by a HTTP daemon 105. The HTTP daemon 105 is a program that runs continuously and exists for the purpose of handling HTTP requests. The HTTP daemon 105 forwards the HTTP requests to other programs or processes as appropriate. Thus, each web server has a HTTP daemon 105 that continually waits for requests to come in from Web clients and their users. Once a file (i.e., HTTP response data) is obtained (e.g., from an associated memory 106), the data is transmitted to the client 100 or 102 that requested the data.
HTTP requests are typically initially handled by a kernel 107 that is responsible for forwarding the requests from the client 100 or 102 to the HTTP daemon 105. The kernel 107 is the essential center of a computer operating system, the core that provides basic services for all other parts of the operating system. Typically, a kernel includes an interrupt handler that handles all requests or completed I/O operations that compete for the kernel's services, a scheduler that determines which programs share the kernel's processing time in what order, and a supervisor that actually gives use of the computer to each process when it is scheduled. The kernel 107 may also include a manager of the operating system's address spaces in memory, sharing these among all components and other users of the kernel's services. A kernel's services are requested by other parts of the operating system or by applications through a specified set of program interfaces sometimes known as system calls. The kernel 107 provides services such as buffer management, message routing, and standardized interfaces to protocols which enable data to be routed between a client and a server.
The kernel structure consists of three layers: a socket layer 108, a protocol layer 110, and a device layer 111. The socket layer 108 supplies the interface between the HTTP daemon 105 and lower layers, the protocol layer 110 contains protocol modules for communication, and the device layer 111 contains device drivers that control network devices. Thus, a server and client process may communicate with one another through the socket layer 108.
Conventional Unix network input/output is provided through the use of a file descriptor opened on a socket. A file descriptor is typically an integer, obtained as a result of opening a file, that identifies the open file within a process. A separate socket, and therefore a separate file descriptor, is required for each network connection. Thus, as shown, each network connection corresponding to a client has an associated socket layer 112 and protocol layer 114, which may send data through a network interface card 116 over a transmission medium 118 to one or more clients 100, 102. Each socket has its own socket data structure. Since a separate file descriptor is opened on a socket for each network connection, in-kernel resources are unnecessarily consumed. Moreover, there are limits to the number of file descriptors that may be open at a particular instant in time. In addition, the data types that can be transported by a socket are limited and therefore the speed with which data can be outputted by the web server onto the network is reduced. For instance, a socket typically transports a byte stream. While various “sendfile” mechanisms exist which allow a HTTP daemon to specify a file to be outputted to the network, no mechanism exists which enables a variety of data types to be specified and transported. It is also important to note that in a Unix system, any persistent resource (e.g., file, shared memory segment, in-kernel cached object) which is to be outputted to the network must be copied as a byte stream via a buffered write, thus causing at least one extra copy of the data to occur. Accordingly, the speed with which data is transported onto the network to a client is reduced.
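The one-descriptor-per-connection behavior described above may be observed with a short sketch. Here socket.socketpair() stands in for accepted client connections, and the helper name open_connections and the response text are assumptions made for illustration only.

```python
import socket


def open_connections(count):
    """Open `count` connected socket pairs, standing in for client connections."""
    return [socket.socketpair() for _ in range(count)]


pairs = open_connections(3)

# Each connection is identified by its own integer file descriptor.
server_fds = [server.fileno() for server, _client in pairs]

# A socket carries only a byte stream: the response is copied into the
# socket by a buffered write rather than being passed by reference.
server, client = pairs[0]
server.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello")
received = client.recv(1024)

# Release the descriptors held by every connection.
for server_end, client_end in pairs:
    server_end.close()
    client_end.close()
```

Three connections consume three distinct descriptors on the server side, which is the per-connection cost noted above.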
In view of the above, it would be desirable to enable a web server to transport response data associated with a HTTP request to a client with a minimum of memory and processing resources. Moreover, it would be beneficial if input/output between a client and a web server could be accelerated. In addition, it would be preferable if such a system could be implemented on a Unix network.
An invention is disclosed herein that transports data in a web server. This is accomplished through the use of a data transport module in communication with a HTTP daemon. In this manner, data may be transported effectively between the HTTP daemon and the data transport module as well as to the client requesting the data.
In accordance with one aspect of the invention, a HTTP request including HTTP request data is received by a data transport module from a client. The HTTP request data may be sent with a preempt indicator from the data transport module to a HTTP daemon. The preempt indicator indicates whether processing is preempted from the data transport module to the HTTP daemon. Similarly, when a HTTP response is returned from the HTTP daemon to the data transport module, HTTP response data may be sent with a preempt indicator from the HTTP daemon to the data transport module indicating whether processing is preempted from the HTTP daemon to the data transport module.
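One possible reading of the preempt indicator exchange is sketched below. The class names Message, HttpDaemon, and DataTransportModule, the boolean encoding of the indicator, and the error raised when control is not returned are all assumptions made for this sketch; the disclosure does not prescribe them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    data: bytes     # HTTP request data or HTTP response data
    preempt: bool   # assumed encoding: True means processing passes to the receiver


class HttpDaemon:
    def handle(self, request: Message) -> Message:
        """Produce response data and hand processing back to the caller."""
        body = b"response for " + request.data
        return Message(data=body, preempt=True)


class DataTransportModule:
    def __init__(self, daemon: HttpDaemon) -> None:
        self.daemon = daemon
        self.sent: List[Message] = []  # messages forwarded to the daemon

    def receive(self, request_data: bytes) -> bytes:
        """Forward request data with a preempt indicator and return the response."""
        outbound = Message(data=request_data, preempt=True)
        self.sent.append(outbound)
        reply = self.daemon.handle(outbound)
        if not reply.preempt:
            # In this sketch the module may transmit only once control returns.
            raise RuntimeError("daemon did not return processing to the module")
        return reply.data


module = DataTransportModule(HttpDaemon())
response = module.receive(b"GET /index.html")
```

The same message shape is used in both directions, so the indicator accompanies the request data on the way to the daemon and the response data on the way back.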
In accordance with another aspect of the invention, an identifier identifying the HTTP response data may be returned to the data transport module, enabling the data transport module to independently access the response for transmission to a client. In addition, a data type (e.g., shared memory segment, cached response) associated with this directly accessible data may be specified. In this manner, a variety of types of data may be communicated to the data transport module without transporting the data from the HTTP daemon to the data transport module. For instance, the identifier may identify a response stored in an in-kernel cache accessible to the data transport module. Moreover, encapsulation information may be provided to the data transport module indicating whether the response data must be encapsulated prior to transmission to the client and, if so, a method of encapsulation.
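The identifier, data type, and encapsulation information may be pictured as a small descriptor that is passed in place of the response data itself. In the following sketch, the names ResponseDescriptor, DataType, and CachingTransportModule are hypothetical, a dictionary stands in for the in-kernel cache, and HTTP chunked transfer coding is assumed merely as one example of an encapsulation method.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional


class DataType(Enum):
    SHARED_MEMORY_SEGMENT = "shared_memory_segment"
    CACHED_RESPONSE = "cached_response"


@dataclass
class ResponseDescriptor:
    identifier: str                      # names the response; no data is carried
    data_type: DataType                  # how the module should locate the data
    encapsulation: Optional[str] = None  # None: transmit the data as-is


def chunked(body: bytes) -> bytes:
    """Wrap `body` as a single chunk followed by the terminating zero chunk."""
    return f"{len(body):x}\r\n".encode("ascii") + body + b"\r\n0\r\n\r\n"


# Assumed registry of encapsulation methods, keyed by name.
ENCAPSULATORS: Dict[str, Callable[[bytes], bytes]] = {"chunked": chunked}


class CachingTransportModule:
    def __init__(self, cache: Dict[str, bytes]) -> None:
        self.cache = cache  # stand-in for the in-kernel cache

    def resolve(self, descriptor: ResponseDescriptor) -> bytes:
        """Fetch the identified response directly and encapsulate it if asked."""
        if descriptor.data_type is not DataType.CACHED_RESPONSE:
            raise NotImplementedError("only cached responses are sketched here")
        body = self.cache[descriptor.identifier]
        if descriptor.encapsulation is not None:
            body = ENCAPSULATORS[descriptor.encapsulation](body)
        return body


transport = CachingTransportModule({"obj-1": b"hello"})
plain = transport.resolve(ResponseDescriptor("obj-1", DataType.CACHED_RESPONSE))
wrapped = transport.resolve(
    ResponseDescriptor("obj-1", DataType.CACHED_RESPONSE, encapsulation="chunked")
)
```

Only the descriptor crosses between the HTTP daemon and the data transport module in this sketch; the response bytes are read from the cache by the module itself.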