1. Field of the Invention
The present invention pertains to the Internet. More particularly, this invention relates to a traffic-aware request processing scheme for TCP/IP-based network applications.
2. Description of the Related Art
With the rapid growth of the Internet, more and more business and residential users are beginning to rely on the Internet for their mainstream and mission-critical activities. As is known, the Internet typically refers to a number of data service systems connected together via a high-speed interconnect network (see FIG. 1). Each data service system typically includes Internet server applications that host content for various customers. The Internet server applications can also host applications. Remote user terminals (e.g., terminals 11a–11n in FIG. 1) may be connected to a data service system (e.g., the data service system 20 in FIG. 1) via an interconnect network. Each user terminal is equipped with a web browser (or other software such as e-mail software) that allows the user terminal to access the content and/or applications hosted in various data service systems.
Popular Internet applications include World Wide Web (WWW), E-mail, news, and FTP applications. Other emerging applications, such as E-commerce payment processing and content distribution, allow various data service systems (e.g., data service systems 20 and 13 in FIG. 1) to interact with one another. All of these applications follow the client-server model and rely on the Transmission Control Protocol (TCP) for reliable delivery of information/applications between servers and clients. New connection requests received by a data service system (e.g., the system 20 in FIG. 2) are first processed by a TCP/IP stack, which is part of the data service system's kernel (i.e., operating system). FIG. 2 shows that the kernel 21 is external to the server application 25 that processes the new connection requests received. The TCP/IP stack in the kernel 21 holds the new connection requests in TCP listen queues, one queue per port. The maximum number of requests that can be held in a listen queue is a configurable parameter. When a server application is ready to process a new request, the server application accepts a new request from its associated listen queue. At this time, the new request is removed from the listen queue.
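By way of illustration only, the listen-queue behavior described above can be sketched with the standard Python socket interface. The function names open_listener and serve_one are hypothetical and are not taken from the figures. The backlog argument stands in for the configurable queue limit, although the operating system may adjust or cap the value actually used.

```python
import socket

def open_listener(backlog: int) -> socket.socket:
    """Create a TCP listener on an ephemeral loopback port (illustrative)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))   # port 0: let the kernel choose a free port
    srv.listen(backlog)          # requested listen-queue size for this port
    return srv

def serve_one(srv: socket.socket) -> bytes:
    """Accept one queued connection, read its request, and send a reply."""
    conn, _addr = srv.accept()   # removes the connection from the listen queue
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"ok:" + data)
    return data
```

In this sketch a client can complete its connection before serve_one is called, because the kernel holds the new connection in the listen queue until the server application accepts it.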
Such a prior art scheme, however, has a number of disadvantages. One disadvantage is that the scheme only processes the queued requests sequentially. This means that the server application first accepts a connection request from the external listen queue, processes the accepted connection request, and then proceeds to pick up the next connection request. Even when the server application has multiple processing threads, each thread accepts and processes connection requests from the listen queue sequentially. This sequential approach, although simple to implement, has two key drawbacks. The first one is the long response time of the server application to the connection requests when the server application is overloaded. Since a server application processes incoming requests in sequence, it is typically unaware of the number of requests that are awaiting processing in the listen queue at any given time. When the server application is overloaded (i.e., the rate of incoming requests exceeds the processing rate of the server application), more and more requests accumulate in the listen queue. When the listen queue is full, new requests are dropped from the listen queue while the server application is totally unaware of such drops. Dropped connection requests typically cause timeouts at the TCP layer of the client. Since a TCP timeout typically lasts for several seconds, dropped connection requests result in very poor response time for the user/client applications accessing the server.
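The overload condition described above can be illustrated with a small discrete-time model. This is a simplified sketch under assumed parameters, not a model of any particular TCP/IP stack: the function simulate and its arguments are hypothetical, arrivals and service are counted per tick, and a request that arrives when the queue is full is simply dropped.

```python
from collections import deque

def simulate(arrivals_per_tick, served_per_tick, queue_limit, ticks):
    """Return (accepted, dropped, still_queued) for a bounded FIFO queue."""
    queue = deque()
    accepted = dropped = 0
    next_id = 0
    for _ in range(ticks):
        # New connection requests arrive at the listen queue.
        for _ in range(arrivals_per_tick):
            if len(queue) < queue_limit:
                queue.append(next_id)
            else:
                dropped += 1      # queue full: dropped, unseen by the server
            next_id += 1
        # The server accepts requests strictly in arrival order.
        for _ in range(min(served_per_tick, len(queue))):
            queue.popleft()
            accepted += 1
    return accepted, dropped, len(queue)

print(simulate(3, 1, 4, 5))   # overloaded: arrivals exceed the service rate
print(simulate(1, 2, 4, 5))   # not overloaded: nothing is dropped
```

With three arrivals and one accepted request per tick, the queue fills within two ticks and nearly half of all requests are dropped, while the serving loop itself never observes a drop.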
The second drawback is that a request may be dropped without being first considered by the server application to determine whether the request should be serviced or not. This drawback is particularly evident when the server application is capable of classifying the incoming connection requests into basic and premium classes and offering different treatment to requests in each class. For example, requests of the premium class could be handled at a higher priority. The classification, however, is typically done in the server application, and the TCP/IP stack is not aware of the classification. When an overflow happens in the listen queue, both basic and premium requests are dropped out of the listen queue. In this case, premium users do not receive the expected level of performance.
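The class-blind nature of such drops can be sketched as follows. The function tail_drop and the request tuples are hypothetical and serve only to illustrate the point: the queue admits requests purely by arrival order, so the class label, which only the server application would interpret, plays no part in the drop decision.

```python
def tail_drop(requests, queue_limit):
    """Split requests into (queued, dropped) by arrival order alone."""
    queued, dropped = [], []
    for req in requests:
        # The class label in req is never inspected here.
        if len(queued) < queue_limit:
            queued.append(req)
        else:
            dropped.append(req)
    return queued, dropped

arrivals = [("basic", 1), ("basic", 2), ("basic", 3),
            ("premium", 4), ("basic", 5), ("premium", 6)]
queued, dropped = tail_drop(arrivals, queue_limit=3)
print(dropped)   # both premium requests are among the dropped ones
```

Here the three earliest basic requests fill the queue, and both premium requests are dropped before the server application has any opportunity to classify them.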