1. Field of the Invention
The present invention relates generally to computer software. More particularly, the present invention relates to methods and apparatus for providing an in-kernel interface to a web server.
2. Description of Related Art
FIG. 1 is a block diagram illustrating a conventional web server 104. Through the use of a web browser and the web server 104, a user may access a web page stored on the web server 104 through the Internet. In this manner, multiple Hypertext Transfer Protocol (HTTP) clients (e.g., web browsers) 100, 102 may access files via the single web server 104. Typically, a browser user enters HTTP file requests by either “opening” a Web file (e.g., typing in a Uniform Resource Locator or URL) or clicking on a hypertext link. The browser builds an HTTP request and sends it to the Internet Protocol (IP) address indicated by the URL. When the web browser 100 or 102 sends an HTTP request to the web server 104 identified by the IP address, the web server 104 receives the request and, after any necessary processing, returns the requested file (i.e., HTTP response data).
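By way of illustration only, the request-building step described above might be sketched as follows. The function name `build_http_request`, the choice of HTTP/1.0, and the inclusion of a `Host` header are assumptions made for this sketch and are not taken from the description; resolving the host name to an IP address and sending the bytes are omitted.

```python
from urllib.parse import urlsplit


def build_http_request(url: str) -> bytes:
    """Build a minimal GET request for the given URL (hypothetical helper)."""
    parts = urlsplit(url)
    # An empty path in the URL refers to the root document.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.0",   # request line (HTTP version is an assumption)
        f"Host: {parts.netloc}",  # host taken from the URL
        "",                       # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

For example, `build_http_request("http://example.com/index.html")` yields a request line of `GET /index.html HTTP/1.0` followed by the `Host` header and a terminating blank line.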
Within the web server 104, received HTTP requests are processed by an HTTP daemon 105. The HTTP daemon 105 is a program that runs continuously on the web server 104 and exists for the purpose of handling HTTP requests. The HTTP daemon 105 forwards the received HTTP requests to other programs or processes as appropriate. Thus, each web server has an HTTP daemon 105 that continually waits for requests to come in from Web clients and their users. HTTP response data is often stored in a web server cache 106. Once a file (i.e., HTTP response data) is obtained (e.g., from the associated web server cache memory 106), the data is transmitted by the daemon 105 to the client 100 or 102 that requested the data. As an alternative to cache memory, the HTTP daemon 105 may have other storage media associated with it, such as a hard drive.
HTTP requests are typically handled initially by a kernel 107, which is responsible for forwarding the requests from the client 100 or 102 to the HTTP daemon 105. The kernel 107 is the essential central part of a computer operating system, the core that provides basic services for all other parts of the operating system. Typically, a kernel includes an interrupt handler that handles all requests or completed I/O operations that compete for the kernel's services, a scheduler that determines which programs share the kernel's processing time and in what order, and a supervisor that actually gives use of the computer to each process when it is scheduled. The kernel 107 may also include a manager of the operating system's memory address spaces, sharing these among all components and other users of the kernel's services. A kernel's services are requested by other parts of the operating system or by applications through a specified set of program interfaces sometimes known as system calls. The kernel 107 also provides services such as buffer management, message routing, and standardized interfaces to protocols which enable data to be routed between clients 100, 102 and a server 104.
As it applies to handling server/client communications, the kernel structure consists of three layers: a socket layer 108, a protocol layer 110, and a device layer 111. The socket layer 108 supplies the interface between the HTTP daemon 105 and lower (protocol and device) layers, the protocol layer 110 contains protocol modules for communication, and the device layer 111 contains device drivers that control network devices. Thus, a server and client process may communicate with one another through the socket layer 108. More particularly, a socket file system 109 (SOCKFS) is associated with the socket layer 108 and is adapted for managing the socket layer 108.
Conventional Unix network input/output is provided through the use of a file descriptor opened on a socket. A “socket” is a method for communication between a client program and a server program in a network. A socket is defined as “the endpoint in a connection.” Sockets are created and used with a set of programming requests or “function calls” sometimes called the sockets application programming interface (API). A file descriptor is typically an integer that identifies an open file within a process and is obtained as a result of opening the file. A separate socket, and therefore a separate file descriptor, is required for each network connection. Thus, as shown, each network connection corresponding to a client request has an associated socket layer 112 and protocol layer 114, which may send data through a network interface card 116 over a transmission medium 118 to one or more clients 100, 102. Each socket has its own socket data structure. Since a separate file descriptor is opened on a socket for each network connection, in-kernel resources are unnecessarily consumed. Moreover, there are limits to the number of file descriptors that may be opened at a particular instant in time.
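The one-descriptor-per-connection behavior described above can be observed with a short sketch using the standard sockets API over the loopback interface. The function name `accept_connections` and the use of address 127.0.0.1 with a system-chosen port are assumptions made for this illustration only.

```python
import socket


def accept_connections(n_clients: int) -> list:
    """Open n_clients loopback TCP connections and return the server-side
    file descriptor numbers, one per accepted connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the system pick a free port
    listener.listen(n_clients)
    addr = listener.getsockname()
    clients, accepted = [], []
    try:
        for _ in range(n_clients):
            clients.append(socket.create_connection(addr))
            conn, _peer = listener.accept()  # a new socket for this connection
            accepted.append(conn)
        # Every accepted connection holds its own descriptor while it is open.
        return [conn.fileno() for conn in accepted]
    finally:
        for s in clients + accepted + [listener]:
            s.close()
```

Calling `accept_connections(3)` returns three distinct descriptor numbers, which illustrates why the number of simultaneous connections is bounded by the per-process descriptor limit.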
STREAMS is a general, flexible programming model for Unix system communication services. STREAMS defines standard interfaces for character input/output (I/O) within the kernel, and between the kernel and the rest of the Unix system. The mechanism consists of a set of system calls, kernel resources, and kernel routines. STREAMS enables the creation of modules to provide standard data communications services. A STREAMS module is a defined set of kernel-level routines and data structures. From the application level, modules can be dynamically selected and interconnected. No kernel programming, compiling, or link editing is required to create the interconnection. STREAMS provides an effective environment for kernel services and drivers requiring modularity. STREAMS parallels the layering model found in networking protocols.
A stream is a data path that passes data in both directions between a STREAMS driver in kernel space and a process in user space. An application creates a stream by opening a STREAMS device. When a STREAMS device is first opened, the stream consists of only a stream head and a STREAMS driver. The stream head is the end of the stream nearest the user process and is the interface between the stream and the user process. A STREAMS driver is a device driver that implements the STREAMS interface. It exists below the stream head and any modules, and it transfers data between the kernel and the device. It can act on an external I/O device, or it can be an internal software driver, called a pseudo-device driver. STREAMS also enables the manipulation of the modules on a stream.
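The arrangement of stream head, modules, and driver described above can be pictured with a toy user-space model. This is only an analogy: real STREAMS modules are kernel-level routines, and the class `ToyStream` and its methods are hypothetical names introduced for this sketch. The model assumes that a pushed module is placed directly below the stream head and that the module nearest the head is the one removed by a pop.

```python
class ToyStream:
    """Toy model of a stream: stream head, zero or more modules, and a driver."""

    def __init__(self, driver_name):
        self.driver_name = driver_name
        self.modules = []  # index 0 is the module nearest the stream head

    def push(self, name, down, up):
        # A newly pushed module sits directly below the stream head.
        self.modules.insert(0, (name, down, up))

    def pop(self):
        # Remove the module nearest the stream head and return its name.
        return self.modules.pop(0)[0]

    def write(self, data):
        # Downstream direction: stream head -> modules -> driver.
        for _name, down, _up in self.modules:
            data = down(data)
        return data

    def read(self, data):
        # Upstream direction: driver -> modules -> stream head.
        for _name, _down, up in reversed(self.modules):
            data = up(data)
        return data
```

With no modules pushed, `write` and `read` pass data through unchanged, which corresponds to a freshly opened stream consisting of only a stream head and a driver.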
In order for the TCP protocol layer to communicate with the HTTP daemon, a new stream is typically created for each connection. Since a stream is associated with a single connection, the stream does not include identifying information that identifies the connection. Instead, since a separate stream is opened for each connection, such identifying information is stored in association with the connection (e.g., by the TCP protocol layer and by the SOCKFS). This private state, which uniquely identifies the connection, includes information such as a remote IP address, a remote port, a local IP address, and a local port. It is important to note that since such identifying information is not included in the stream, data for only a single connection may be sent in the stream. As a result, a separate stream must be created for each connection in order to transmit HTTP request data from a client to the HTTP daemon. Since it is difficult to pre-create such streams, this stream creation is preferably performed dynamically. However, numerous steps must be performed before data can be sent in a data stream.
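The relationship between the connection-identifying state and the per-connection streams described above can be sketched as a small model. The names `ConnectionId` and `PerConnectionStreams` are hypothetical, and a stream is modeled simply as a list of byte chunks; this is an illustrative sketch of the conventional arrangement, not an implementation of any kernel interface.

```python
from typing import NamedTuple


class ConnectionId(NamedTuple):
    """The private state that uniquely identifies a connection."""
    remote_ip: str
    remote_port: int
    local_ip: str
    local_port: int


class PerConnectionStreams:
    """Toy model: the identifying state is kept beside the stream, not in it."""

    def __init__(self):
        self.streams = {}  # ConnectionId -> list of data chunks (the "stream")

    def deliver(self, conn: ConnectionId, data: bytes) -> int:
        # The data carries no identifier, so it can only travel on the stream
        # dedicated to its connection; that stream is created dynamically the
        # first time the connection is seen.
        self.streams.setdefault(conn, []).append(data)
        return len(self.streams)  # number of streams now in existence
```

In this model, delivering data for two different connections always results in two streams, however little data each connection carries.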
FIG. 2 is an exemplary diagram illustrating data flow associated with an HTTP request. The TCP protocol layer is represented by line 202, the socket file system (SOCKFS) is represented by line 204, and the HTTP daemon is represented by line 206. As shown, for each connection 208, the TCP protocol layer 202 sends a connection indication 210 to the socket file system 204. The SOCKFS 204 then sends a connection indication 212 to the HTTP daemon 206. The HTTP daemon 206 accepts the connection by sending a message as shown at 214 to the SOCKFS 204. The SOCKFS 204 then sends an acknowledgement 216 to the TCP protocol layer 202. Once the acknowledgement 216 has been received by the TCP protocol layer 202, data 218 is sent to the SOCKFS 204. The data is then transmitted as shown at 220 to the HTTP daemon 206. The HTTP daemon 206 then accepts the data as shown at 222. The SOCKFS 204 then sends an acknowledgement 224 to the TCP protocol layer 202. As shown and described with reference to FIG. 2, multiple steps must typically be performed in order to transmit HTTP request data to a web server. Accordingly, it would be beneficial if the time required to send HTTP request data to a web server could be reduced while minimizing memory and processing resources. Similarly, it would be desirable if HTTP response data could be sent in a compatible and efficient manner from the web server to the requesting client. In addition, it would be preferable if such a system could be implemented without requiring modifications to the HTTP daemon.
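The per-connection exchange of FIG. 2 described above can be tabulated in a short sketch that counts the messages required. The names `FIG2_SEQUENCE` and `messages_for` are hypothetical. The recipient of step 222 is not stated in the description; the sketch assumes it is the SOCKFS 204.

```python
# (reference numeral, sender, receiver, message) for one connection.
FIG2_SEQUENCE = [
    (210, "TCP", "SOCKFS", "connection indication"),
    (212, "SOCKFS", "HTTP daemon", "connection indication"),
    (214, "HTTP daemon", "SOCKFS", "accept connection"),
    (216, "SOCKFS", "TCP", "acknowledgement"),
    (218, "TCP", "SOCKFS", "data"),
    (220, "SOCKFS", "HTTP daemon", "data"),
    (222, "HTTP daemon", "SOCKFS", "accept data"),  # receiver is an assumption
    (224, "SOCKFS", "TCP", "acknowledgement"),
]


def messages_for(n_connections: int) -> int:
    """Total messages exchanged when every connection repeats the sequence."""
    return n_connections * len(FIG2_SEQUENCE)
```

Under this tabulation, each connection requires eight messages, four of which precede the first transfer of request data, so the cost grows linearly with the number of connections.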