Sharing of information has long been a goal of modern civilization. Information must be represented in some form in order that it can be transferred between people. Humans are gifted with five sensory modes of perception. Visual perception has been given the lion's share of the brain. It is noted by Dr Judith Guedalia, of the Neuro-Psychological Department of Shaare Zedek Medical Center in Jerusalem, Israel, that although the famous Helen Keller, who was both deaf and blind, was quoted as saying that she would prefer hearing over sight if she had to choose, this may have been based on psychological reasons, since group activity is better enabled with hearing.
Visual stimulation is perhaps the most efficient means for people to acquire information. The cliché “a picture is worth a thousand words” is well-founded. The ratio of brain cells devoted to sight versus those devoted to sound is roughly 100:1. From the days of cave carving to the printing press, visual information has been the primary medium for communication.
Recently information has taken on a digital form, which has enabled electronic transmission of information. The most notable example is the dissemination of information through the World Wide Web (referred to simply as the “Web”), which is a collection of millions of computers interconnected by electronic communication devices.
Various techniques are known for providing on-line access of interactive media over the Web.
The following U.S. Patents have been found in a U.S. Patent Search and are believed to be generally relevant to the field of the invention:
    U.S. Pat. No. 4,897,867 January 1990 Foster et al.
    U.S. Pat. No. 5,119,188 June 1992 McCalley et al.
    U.S. Pat. No. 5,122,873 June 1992 Golin
    U.S. Pat. No. 5,195,092 March 1993 Wilson et al.
    U.S. Pat. No. 5,220,420 June 1993 Hoarty et al.
    U.S. Pat. No. 5,236,199 August 1993 Thompson, Jr.
    U.S. Pat. No. 5,251,209 October 1993 Jurkevich et al.
    U.S. Pat. No. 5,265,248 November 1993 Moulios et al.
    U.S. Pat. No. 5,283,819 January 1994 Glick et al.
    U.S. Pat. No. 5,325,423 June 1994 Lewis
    U.S. Pat. No. 5,351,276 September 1994 Doll, Jr. et al.
    U.S. Pat. No. 5,363,482 November 1994 Victor et al.
    U.S. Pat. No. 5,420,572 May 1995 Dolin, Jr. et al.
    U.S. Pat. No. 5,420,801 May 1995 Dockter et al.
    U.S. Pat. No. 5,438,658 August 1995 Fitzpatrick et al.
    U.S. Pat. No. 5,487,167 January 1996 Dinallo et al.
    U.S. Pat. No. 5,495,576 February 1996 Ritchey
    U.S. Pat. No. 5,508,940 April 1996 Rossmere et al.
    U.S. Pat. No. 5,519,435 May 1996 Anderson
    U.S. Pat. No. 5,553,221 September 1996 Reimer et al.
    U.S. Pat. No. 5,553,222 September 1996 Milne et al.
    U.S. Pat. No. 5,557,538 September 1996 Retter et al.
    U.S. Pat. No. 5,561,791 October 1996 Mendelson et al.
    U.S. Pat. No. 5,564,001 October 1996 Lewis
    U.S. Pat. No. 5,577,180 November 1996 Reed
    U.S. Pat. No. 5,577,258 November 1996 Cruz et al.
    U.S. Pat. No. 5,581,783 December 1996 Ohashi
Transportation of digital signals between computers is plagued by a bandwidth problem, where a limited bandwidth creates bottlenecks in the transmission of electronic signals. Fortunately, textual information can be represented efficiently electronically, and this has enabled the Web publishing industry to flourish. Unfortunately, however, image information is difficult to represent compactly in electronic form.
A server or host computer (hereinafter referred to as a “server”) is used for placing information where it is available for access by multiple users. The Web has enabled convenient publishing of information from a server to client computers (hereinafter referred to as “clients”) that are controlled by users requesting information.
When using various media such as video, audio, text and images, a user generally retrieves the media from a server connected via a network to many clients. The server downloads the media to the network and transmits it to the client computer at the user's request.
There are two basic limitations involved in such data retrieval: delay between the time that a user requests the data and the time when the server downloads it to the network, and bandwidth limitations on data throughput and rate of data transmission.
One example of such a server-client network is a network connecting Web servers and users' personal computers. Such networks are installed in order to facilitate convenient data transmission between users, and data distribution from the server to the users' computers.
Known network applications involve streaming data from a server to a client computer. “Streaming” refers to serial or parallel transmission of digital data between two computers, by transmitting sequences of bit packets. For example, installation executables on a network server stream files to a client computer performing the installation. Servers with large amounts of memory are used to archive digital movies, which are streamed to a client computer for viewing upon demand. Digital video is broadcast from cable stations to subscribers using streaming. Web browsers, such as Netscape and Microsoft Explorer, are used to stream data from a server on the Web to a client.
Web sites can contain enormous databases, such as phone directories for all of the cities in the U.S., photographs from art galleries and museums around the world, voluminous encyclopedias, and even copies of all patents ever issued by the U.S. Patent & Trademark Office. Users of the Web can search these databases and then request the server to download specific information. This request initiates a streaming event.
Accessing information over the Web is typically done using Web browsers. A Web browser is software that resides on a client computer and communicates with servers via established protocols. Information transmitted by servers is converted to visual displays on the client computer by means of a Web browser.
Internet protocols enable client-server communication over the Web. These protocols include low level protocols, such as Transmission Control Protocol/Internet Protocol (TCP/IP), and higher level protocols such as Hypertext Transfer Protocol (HTTP). A general reference on Internet protocols may be accessed at: http://www.w3.org/Protocols, or http://www.cis.ohio-state.edu/tbin/rfc/arpa-internet-protocols.html. Another useful reference is: Hethmon, Paul S., Illustrated Guide to HTTP, Manning Publications Co., Greenwich, Conn., 1997.
HTTP servers provide a way for remote clients to access data on servers. HTTP provides a method for making simple requests from a client to a server. Client requests can take the form of GET requests or POST requests in HTTP 1.0. Typically, in a GET or POST request the client specifies a file to be delivered, and through HTTP headers the server can specify what is being delivered. The most pervasive file format used on the Web is HyperText Markup Language (HTML). A reference on HTML may be accessed at: http://204.57.196.12/reference/htmlspec2.0. HTML files are typically relatively small, i.e. less than 100 Kbytes.
HTTP/1.0 specifies that a communication between a server and client proceeds as follows: A client's request is initiated with a header which is terminated by a double carriage return linefeed. This is followed by the body of the request which is similarly terminated by a double carriage return linefeed. The server responds with an HTTP header terminated with a carriage return linefeed and then sends the body of the response. The response is terminated when a socket is closed. This normally occurs when the server has finished returning data to the client, and the server closes the socket.
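The header framing described above can be illustrated with a minimal Python sketch. It assumes the conventional blank-line delimiter (a carriage return linefeed pair followed by a second pair) between header and body; the function names build_get_request and split_response are hypothetical and are not part of any HTTP library.

```python
CRLF = "\r\n"


def build_get_request(path, headers=None):
    """Build an HTTP/1.0 GET request: request line, optional headers, blank line."""
    lines = [f"GET {path} HTTP/1.0"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # The header block is terminated by a double carriage return linefeed.
    return CRLF.join(lines) + CRLF + CRLF


def split_response(raw):
    """Split raw response bytes into (header text, body bytes) at the first blank line."""
    header, _, body = raw.partition(b"\r\n\r\n")
    return header.decode("iso-8859-1"), body


if __name__ == "__main__":
    print(repr(build_get_request("/index.html", {"Accept": "text/html"})))
    print(split_response(b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"))
```

In this sketch the end of the body is not marked explicitly; as noted above, an HTTP/1.0 client typically treats the closing of the socket as the end of the response.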
Server performance is generally inversely proportional to the quantity of data being served per unit time. The task of delivering a file from a server to a client is typically not computationally expensive. This task includes reading the file from the server's peripherals, e.g. a disk drive, and transmitting the data from the file in the specified protocol, e.g. TCP/IP. TCP/IP transmits data in units of “packets.” The time it takes for a client to retrieve a file depends upon the number of packets transmitted by the server.
One problem shared by virtually all server applications is the need to process many client requests simultaneously. Efficient methods of processing multiple client requests utilize the server's resources fully. Many computers, even though they are limited by having only a single Central Processing Unit (CPU), can perform specific multiple independent tasks simultaneously, such as reading data from the disk and transmitting data over a network connection. This is enabled by buffers designed to speed up the dialogue with the peripherals, such as network cards, disks and video display cards.
Modern operating systems support multi-tasking, which is the ability to run many separate applications at the same time. A single software program can take advantage of multi-tasking by creating multiple concurrent “threads.” Each thread simulates a separate application. Thus an HTTP server, for example, can use multiple threads to optimize its performance in responding to concurrent requests. Each request can be processed in a separate thread, and while one request is being processed in the CPU, a second request can be transmitted through the network hardware. If only one thread is used, then although the network hardware is buffered, the processing of the second request can become blocked, because the single thread waits for the network to finish sending.
Another advantage of using multiple threads to handle multiple requests is fairness. For example, suppose two clients simultaneously request information, one requests a large quantity of data while another requests a small amount. If their requests were to be handled within a single thread, the second client would have to wait until the completion of the first request. Multiple threads enable both clients to receive data concurrently.
The use of multiple threads in server applications works best when each request is independent of the other, in the sense that, for the most part, they do not share the same host computer resources. However, some sharing is inevitable. When threads share resources, this may produce bottlenecks in the transmission of data, CPU usage and access to peripherals. For example, using multiple threads creates problems such as page faults. Page faults occur when a thread requests a page of memory which is not readily available to the operating system. Thus server applications try to limit the number of concurrent threads, while optimizing performance and fairness.
One popular method for limiting the number of concurrent threads, implemented in the servers of two leading commercial server vendors, Microsoft Internet Information Services (IIS) and Netscape Enterprise, is as follows: A maximum number of threads is pre-set. Multiple threads are created as requests are received. In these servers, thread initiation is determined by client requests. These threads form a thread pool. The threads can be in either of two states: a wait state, in which they are idle and wait for a request to process, or a busy state, in which they are processing a request. Once the maximum limit of allowable concurrent requests/threads is reached and all threads are busy, subsequent requests are queued pending availability of a free thread.
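The pooling scheme described above can be approximated with Python's standard concurrent.futures.ThreadPoolExecutor, which creates worker threads on demand up to a pre-set maximum and queues further work until a thread is free. The following is a minimal sketch, not a description of how IIS or Netscape Enterprise is implemented; MAX_THREADS, serve_all and slow_echo are hypothetical names chosen for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 4  # hypothetical pre-set limit on concurrent threads


def serve_all(requests, handler, max_threads=MAX_THREADS):
    """Process requests with at most max_threads busy at once.

    Returns the results in request order and the peak number of busy threads.
    """
    lock = threading.Lock()
    busy = 0
    peak = 0

    def tracked(request):
        nonlocal busy, peak
        with lock:  # thread enters the busy state
            busy += 1
            peak = max(peak, busy)
        try:
            return handler(request)
        finally:
            with lock:  # thread returns to the wait state
                busy -= 1

    # Requests beyond max_threads wait in the executor's internal queue.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        results = list(pool.map(tracked, requests))
    return results, peak


def slow_echo(request):
    """Stand-in for request processing (assumption: 10 ms of work)."""
    time.sleep(0.01)
    return request.upper()


if __name__ == "__main__":
    results, peak = serve_all([f"req{i}" for i in range(10)], slow_echo, max_threads=3)
    print(results, "peak busy threads:", peak)
```

The peak count never exceeds the configured maximum, which mirrors the behavior in which excess requests are queued rather than given new threads.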
Added functionality for server-side applications may be provided via a secondary protocol, such as Common Gateway Interface (CGI). A reference on CGI may be accessed at: http://www.pricecostco.com/exchange/irf/cgi-spec.html. Reference is also made to Dwight, J., Erwin, Michael and Niles, Robert, Using CGI, Second Edition, Que Corporation, Indianapolis, Ind. 1997.
Unlike file retrieval governed by HTML, added functionality may involve intensive processing. The introduction of many server-side applications introduces severe performance problems. As mentioned above, multiple threads work best when each request is independent of the others. Otherwise, if two requests use a common resource, e.g., the same database, a situation may occur in which one request needs to wait until processing of the other request is complete and the resource is freed up. Currently webmasters, who are administrators of servers on the Web, manually tune a number of parameters to increase performance of their Web servers. A description of fine-tuning thread administration may be accessed at: http:/www.asia.microsoft.com/isn/techcenter/tuningIIS.htm. In the above-referenced description there appears a discussion of how to tune two parameters, ProcessorThreadMax and RequestQueueMax. ProcessorThreadMax sets a maximum limit on the number of concurrent threads per processor, and RequestQueueMax sets a maximum limit on the size of the request queue. Reference is also made to Microsoft Windows NT Server Resource Kit Version 4.0, Supplement One, Microsoft Press, Redmond, Wash. 1997, Chap. 6 entitled “Preventing Processor Bottlenecks.” This reference also discusses limiting the maximum number of threads and limiting the maximum number of connections. Quoting from page 136, “If you cannot upgrade or add processors, consider reducing the maximum number of connections that each IIS service accepts. Limiting connections might result in connections being blocked or rejected, but it helps ensure that accepted connections are processed promptly. . . . In extreme cases of very active or underused processors, you might want to adjust the maximum number of threads in the Inetinfo process.”
It may thus be appreciated that while a simple thread model performs adequately for an HTTP server that serves HTML documents, this same model may fail for CPU intensive server-side scripts.
Users desiring to access image information over the Web run up against the bandwidth problem. Whereas even complex text information, such as multiple fonts and scales, can be represented with a few dozen bytes of HTML, images are usually orders of magnitude larger. Images are typically represented in one of two standard formats, JPEG or GIF. Currently, to transmit a small photograph over the Web, such as a 2″×3″ photograph scanned in at 300 dots per inch (dpi) and compressed using JPEG, takes more than two minutes over a typical 1 KByte per second connection. This makes the viewing of quality images, such as a small museum painting 10″×10″ scanned in at 300 dpi and JPEG compressed, very frustrating.
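The transfer times quoted above can be checked with a short calculation. The sketch below assumes 3 bytes per pixel (24-bit color), a JPEG compression ratio of about 10:1, and 1 KByte taken as 1,000 bytes; none of these figures is stated in the text, and the function name transfer_seconds is hypothetical.

```python
def transfer_seconds(width_in, height_in, dpi, bytes_per_pixel=3,
                     compression_ratio=10.0, bytes_per_second=1000):
    """Estimate the time to send a scanned, compressed image over a slow link.

    All default values are assumptions made for illustration.
    """
    pixels = (width_in * dpi) * (height_in * dpi)
    raw_bytes = pixels * bytes_per_pixel
    compressed_bytes = raw_bytes / compression_ratio
    return compressed_bytes / bytes_per_second


if __name__ == "__main__":
    small = transfer_seconds(2, 3, 300)       # 2" x 3" photograph
    painting = transfer_seconds(10, 10, 300)  # 10" x 10" painting
    print(f"photograph: {small:.0f} s, painting: {painting / 60:.0f} min")
```

Under these assumptions the 2″×3″ photograph takes 162 seconds, consistent with the "more than two minutes" figure, and the 10″×10″ painting takes about 45 minutes.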
A recently developed protocol, the Internet Imaging Protocol (IIP), was designed specifically for mitigating the bandwidth problem. It exploits the fact that the client cannot view more than a computer screen of image data at any one time. Even if the full image is enormous, such as 15,000×15,000 pixels, the client never views more than the screen resolution, usually less than 1,024×1,024 pixels, at any given time. Thus it is unnecessary to transmit more than a screen-ful of data, for any specific view of the image. IIP specifies a method for a client to request portions of an image at a specific resolution. A reference for IIP is the document “Internet Imaging Protocol,” ©1997 Hewlett Packard Company, Live Picture, Inc., and Eastman Kodak Company, the contents of which are hereby incorporated by reference.
A server with server-side software that supports IIP is referred to as an “image server.” Currently there are two popularly accepted ways to request image data from an image server using IIP; namely, using server-side processing of the request primarily, or using client-side processing primarily. Client-side processing is not relevant to the present invention, and will not be described herein.
To illustrate server-side processing, suppose a client with a viewing window of 640×480 pixels desires to view an image whose full resolution is 15,000×15,000 pixels. The client is unable to view the full image at its original resolution, and can either view the complete image at a low resolution, or view only a portion of the image at a high resolution. Usually the user prefers to begin with an initial view showing the full image at a low resolution, and then to interactively navigate by zooming, i.e., increasing the resolution while decreasing the “field of view,” or the portion of the image being viewed, and panning, i.e. translating the current view.
Under IIP, the full image at a 640×480 pixel resolution for an initial view can be requested using the following set of IIP commands:
    fif=<image name>&wid=640&hei=480&cvt=jpeg
This request specifies the desired image by means of the fif command, and specifies the width and height of the client viewing window by means of the wid and hei commands, respectively. The last command, cvt, specifies the format of the image to be returned. As mentioned hereinabove, JPEG is supported by most browsers.
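As an illustration, the IIP command string above can be assembled and parsed with a few lines of Python. This is a sketch only: the helper names iip_request and parse_iip_request and the file name painting.fpx are hypothetical, and no URL escaping of the image name is attempted.

```python
from urllib.parse import parse_qs


def iip_request(image_name, width, height, fmt="jpeg"):
    """Join the fif, wid, hei and cvt commands in the order shown in the text."""
    return f"fif={image_name}&wid={width}&hei={height}&cvt={fmt}"


def parse_iip_request(query):
    """Recover (image name, width, height, format) from an IIP command string."""
    fields = {key: values[0] for key, values in parse_qs(query).items()}
    return fields["fif"], int(fields["wid"]), int(fields["hei"]), fields["cvt"]


if __name__ == "__main__":
    query = iip_request("painting.fpx", 640, 480)  # hypothetical image name
    print(query)
    print(parse_iip_request(query))
```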
For the image server to process the above IIP request, the server must analyze the original image and generate an image matching the requested specifications, specifically the desired portion and dimensions. The analysis and generation are usually computationally expensive. In the example under discussion, a 15,000×15,000 pixel image would have to be re-sized, requiring approximately 675 MBytes to process.
To facilitate this processing, images can be stored in a pre-computed multi-resolution tiled format. Multi-resolution tiled images are constructed by first creating multiple copies of the image at different resolutions. Moreover, at each resolution the image is partitioned into a collection of disjoint tiles.
FLASHPIX®, a registered trademark of the Digital Imaging Group (DIG), is an example of a multi-resolution tiled image format. A FlashPix image is constructed by starting with an original image and recursively sub-sampling it at half the resolution. The recursion continues until the final sub-sampled image is reduced to 64 pixels or less in each dimension. Each resolution level is partitioned into tiles that are 64×64 pixels in size. A reference for FLASHPIX® is a document “FlashPix Format Specification,” ©1996, 1997, Eastman Kodak Company, the contents of which are hereby incorporated by reference.
Referring to the abovementioned example, for a FlashPix image server to respond with a sub-sampled 640×480 pixel image that contains the full original image is simply a matter of selecting an appropriately close pre-computed resolution. Using the numbers in the example, the successive resolutions are 15,000×15,000 pixels, then 7,500×7,500 pixels, then 3,750×3,750 pixels, then 1,875×1,875 pixels, then 937×937 pixels, etc. For example, the image server can choose resolution level 937×937 pixels and re-sample to 640×480 pixels. This is far better than working with a 15,000×15,000 pixel image.
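The resolution sequence and the savings it provides can be worked through with a short Python sketch. It assumes that halving rounds down (which reproduces the 937×937 level quoted above) and that uncompressed pixels occupy 3 bytes each; the function names pyramid_levels, choose_level and raw_rgb_bytes are hypothetical.

```python
def pyramid_levels(width, height, tile=64):
    """Halve the image repeatedly until both dimensions are at most `tile` pixels."""
    levels = [(width, height)]
    while max(levels[-1]) > tile:
        w, h = levels[-1]
        levels.append((max(1, w // 2), max(1, h // 2)))  # assumption: round down
    return levels


def choose_level(levels, target_w, target_h):
    """Pick the smallest level that still covers the target in both dimensions."""
    candidates = [lv for lv in levels if lv[0] >= target_w and lv[1] >= target_h]
    if not candidates:
        return levels[0]  # target exceeds the full image; use full resolution
    return min(candidates, key=lambda lv: lv[0] * lv[1])


def raw_rgb_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size, assuming 3 bytes per pixel."""
    return width * height * bytes_per_pixel


if __name__ == "__main__":
    levels = pyramid_levels(15000, 15000)
    best = choose_level(levels, 640, 480)
    print(levels)
    print(best, raw_rgb_bytes(*best), "bytes versus", raw_rgb_bytes(15000, 15000))
```

Under these assumptions the full 15,000×15,000 pixel image occupies 675,000,000 bytes uncompressed, in line with the 675 MBytes cited earlier, whereas the 937×937 level occupies about 2.6 MBytes.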
FlashPix images are more complicated than simple raster images. The individual 64×64 tiles into which each resolution is partitioned are usually JPEG compressed for Internet applications. Furthermore, the FlashPix format specification requires that the tiles be stored in a storage within a Microsoft OLE structured storage file. Structured storage files are compound files composed of multiple storages and streams, where storages are analogous to folders/directories and streams are analogous to files. Although there is overhead in accessing information inside a structured storage file, such files provide a clean interface for a complicated file structure. Structured storage is discussed in Appendix A of the above-referenced FlashPix Format Specification.
For a FlashPix image server to serve an image portion to a client using server-side processing such as the IIP request mentioned above, the following operations are required:
    1. Accept the client connection
        i. Wait in accept state for a connection
        ii. Transmit TCP/IP data
        iii. Hand off request to a second thread
        iv. Go back into accept state, waiting for a connection
    2. Parse the request
        i. Check the syntax of the request
        ii. Format the request into an internal representation
    3. Determine the requested tiles
        i. Determine resolution level
        ii. Determine portion
        iii. Determine if re-sampling is necessary
    4. Read the tiles from the file
        i. Open the requested structured storage file
        ii. Locate the appropriate storage for the previously determined level of resolution
        iii. Read the meta information (size of tile, location, format, etc.)
        iv. Read the tile data from a stream
    5. Decompress the tiles
        i. Apply Huffman decoding
        ii. Apply the Inverse Discrete Cosine Transform
        iii. De-quantize the coefficients
        iv. Convert from YUV(4:1:1) to RGB(8:8:8)
    6. Stitch the tiles together to form a contiguous image
        i. Locate positions in memory
        ii. Copy memory with an appropriate offset
        iii. Re-size image to desired dimensions
    7. Convert (re-compress) the image into the requested format
        i. Convert from RGB(8:8:8) to YUV(4:1:1)
        ii. Apply the Discrete Cosine Transform
        iii. Quantize
        iv. Apply Huffman encoding
    8. Transmit data back to client
        i. Transmit TCP/IP data
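Step 3 of the operations listed above, determining the requested tiles, can be sketched as follows once a resolution level has been chosen. The sketch assumes 64×64 pixel tiles indexed by column and row from the top-left corner and a requested portion given as a half-open pixel rectangle; the function name tiles_for_region is hypothetical and the indexing scheme is an assumption rather than something taken from the FlashPix specification.

```python
TILE = 64  # tile edge length in pixels, as described in the text


def tiles_for_region(level_w, level_h, x0, y0, x1, y1, tile=TILE):
    """Return (column, row) indices of tiles overlapping [x0, x1) x [y0, y1).

    The region is clipped to the bounds of the resolution level first.
    """
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(level_w, x1), min(level_h, y1)
    if x0 >= x1 or y0 >= y1:
        return []  # region lies entirely outside the image
    cols = range(x0 // tile, (x1 - 1) // tile + 1)
    rows = range(y0 // tile, (y1 - 1) // tile + 1)
    return [(c, r) for r in rows for c in cols]


if __name__ == "__main__":
    # Full view of the 937x937 level from the running example.
    print(len(tiles_for_region(937, 937, 0, 0, 937, 937)), "tiles")
    # A small region straddling four tiles.
    print(tiles_for_region(937, 937, 60, 60, 70, 70))
```

For the full 937×937 level this yields a 15×15 grid of 225 tiles, each of which must then be read, decompressed and stitched as in steps 4 through 6.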
Assignee's U.S. patent application Ser. No. 08/979,220 filed Nov. 26, 1997, now U.S. Pat. No. 6,121,970 issued Sep. 19, 2000, and entitled A METHOD AND SYSTEM FOR HTML-DRIVEN INTERACTIVE IMAGE CLIENT, the disclosure of which is hereby incorporated by reference, describes a way to view FlashPix images using the IIP cvt command, without the need for a plug-in or a Java applet. Each interactive user navigation command is implemented through a dynamic HTML page containing a cvt command with appropriate parameters. This puts a maximum strain on the server, as nearly all processing is done on the server side.
The invention described in the aforesaid U.S. patent application Ser. No. 08/979,220 filed Nov. 26, 1997, now U.S. Pat. No. 6,121,970 issued Sep. 19, 2000, and entitled A METHOD AND SYSTEM FOR HTML-DRIVEN INTERACTIVE IMAGE CLIENT utilizes caching to alleviate the computing burden, so that processed data is immediately available for repeated requests from one or many clients for the same image portion. However, when many images are available on the server, each with many image portions that can be viewed, the server must be able to process many demanding cvt commands that each require the extensive processing delineated above, along with many simple requests that can be satisfied immediately with data in cache or with very little processing, and it must do so with requisite fairness.
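The effect of caching processed image portions can be illustrated with a minimal sketch. It assumes a least-recently-used eviction policy and a cache keyed by the request string; neither detail is taken from the referenced patent, and the class name PortionCache and the render stand-in are hypothetical.

```python
from collections import OrderedDict


class PortionCache:
    """Small least-recently-used cache keyed by request string (assumed policy)."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, request, compute):
        """Return cached data for `request`, computing and storing it on a miss."""
        if request in self._items:
            self._items.move_to_end(request)  # mark as most recently used
            self.hits += 1
            return self._items[request]
        self.misses += 1
        value = compute(request)  # the expensive processing path
        self._items[request] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
        return value


if __name__ == "__main__":
    cache = PortionCache(capacity=2)
    render = lambda req: f"<image bytes for {req}>"  # stand-in for real processing
    for req in ["a", "b", "a", "c", "b"]:
        cache.get_or_compute(req, render)
    print("hits:", cache.hits, "misses:", cache.misses)
```

A repeated request is answered from memory without re-running the processing path, while a request for a portion not yet in the cache still incurs the full cost.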