1. Field of the Invention
The present invention relates generally to image processing, and specifically to production of images and audio in a personal computer environment.
2. Discussion of the Prior Art
An important issue in digital technology is providing video images on a personal computer. These images are transmitted across the Internet and other networks, across telephone lines with modem-to-modem connections, or received from compact disk read-only memories (CD-ROMs). The speed of a modem is commonly the limiting factor in sending real-time, continuous video information across the Internet, over corporate intranets or local area networks. In comparison, continuous network transmission of audio data does not present significant difficulties.
Table 1 shows theoretical bandwidth maxima for various network architectures. Modem-to-modem connections across lines in plain old telephone service (POTS) have a theoretical bandwidth of 3,360 bytes per second, while connections across the Internet with a modem or single ISDN are limited to 5,600 bytes per second. Dual ISDN network architectures transmit a maximum of 11,200 bytes per second, while corporate local area networks with 10 BaseT connections have a capability of transmitting one megabyte per second. With the exception of telephone line connections, these other techniques involve non-continuous, packet-switched data. Satellite and cable architectures are also possible, but have not yet been widely adopted and present other difficulties.
On the other hand, computer memories and processor speeds have made rapid advances. Personal computers have hard drives accommodating many gigabits of data, and the price of memory chips is decreasing. Processor speeds approaching 300 MHz are available, and speeds of several GHz are contemplated.
To view a still or motion picture from the Internet on a personal computer, a user conventionally downloads video data from a web site by clicking on a web link. Often, however, it is necessary to separately download (or otherwise obtain) software, e.g. Adobe Acrobat, in order to display a particular image format. Images are frequently compressed for transmission over networks or storage on disks. Compression algorithms, such as JPEG and MPEG, using discrete cosine transform (DCT) methods, produce serviceable images but compromise image size, image quality, definition, and acquisition speed. Image latency is also sacrificed. A user must wait while an entire image or series of images is buffered in a client-side personal computer prior to display. Image transmission is sometimes interrupted due to network errors and traffic. Streaming techniques allow a user to begin viewing the images immediately while downloading, but streaming still sacrifices image quality and latency.
Currently, International Telecommunication Union Standard ITU-R 601 for digital formats in professional video production (i.e. NTSC) requires 720 by 486 pixels per frame in the scanned image, and an eight-bit 4:2:2 sampling of Y, R-Y, B-Y color components at sixty frames per second. This results in a data stream of 20 megabytes per second if the format is to remain uncompressed and if the images are to be viewed continuously in real time. Clearly, this is greater than the fastest rate for 10 BaseT of one megabyte per second. A compression ratio of 5:1 is the most that is considered desirable for production marketplace image quality, but this only reduces the necessary data rate to 4 megabytes per second. Using 4:1:1 sampling, other conventional digital video production techniques (e.g. DVC Pro and DV Cam) produce a marginally improved data rate of 3 megabytes per second. Compression ratios of 30:1 are sometimes used for previewing and editing of video images, but this only yields a data rate of 700 kilobytes per second. Data rates for these formats are summarized in Table 2.
Comparing this to the standard 56 kilobit per second modem (about 5,600 bytes per second), there is a readily apparent, significant gap between the requirements of ITU-R 601 and present-day hardware transmission capabilities. A further compression ratio of 125:1 on an already-compressed and marginally acceptable 30:1 compressed image, i.e. a total compression of 3,750:1, is needed to transmit ITU-R 601 data across a 56 k modem.
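The arithmetic behind these figures can be checked with a short sketch. It assumes two bytes per pixel for eight-bit 4:2:2 sampling and, as an assumption made here rather than a statement from the text, reads the sixty-per-second rate as interlaced fields, i.e. thirty full frames per second; under that reading the uncompressed rate comes out near the quoted 20 megabytes per second. The modem figure of 5,600 bytes per second is the one given for Table 1. All names are illustrative.

```python
# Illustrative check of the data rates discussed above (names are hypothetical).
WIDTH, HEIGHT = 720, 486          # pixels per frame, from the text
BYTES_PER_PIXEL = 2               # 8-bit 4:2:2: one luma byte plus shared chroma
FULL_FRAMES_PER_SECOND = 30       # assumption: 60 interlaced fields = 30 frames
MODEM_BYTES_PER_SECOND = 5_600    # Internet modem figure quoted for Table 1
PREVIEW_BYTES_PER_SECOND = 700_000  # 30:1 preview-quality rate from the text


def uncompressed_rate(width, height, bytes_per_pixel, frames_per_second):
    """Bytes per second needed to carry the video with no compression."""
    return width * height * bytes_per_pixel * frames_per_second


def required_ratio(source_rate, channel_rate):
    """Compression ratio needed to fit source_rate into channel_rate."""
    return source_rate / channel_rate


raw_rate = uncompressed_rate(WIDTH, HEIGHT, BYTES_PER_PIXEL, FULL_FRAMES_PER_SECOND)
further_ratio = required_ratio(PREVIEW_BYTES_PER_SECOND, MODEM_BYTES_PER_SECOND)
total_ratio = 30 * further_ratio  # 30:1 preview compression times the further ratio

print(f"uncompressed: {raw_rate / 1e6:.1f} MB/s")
print(f"further compression needed: {further_ratio:.0f}:1")
print(f"total compression needed: {total_ratio:.0f}:1")
```

Dividing the uncompressed rate directly by the modem rate gives nearly the same total, since the 700 kilobyte per second figure is itself a rounded value.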
Present methods of displaying moving objects on web pages involve either bit-mapped or vector approaches. Simple moving icons on a web page are produced by changing only part of the image in every frame. For example, Microsoft® and Netscape® browsers show moving traces around their logos while a processor is retrieving a page. Advertisements on web pages also display moving images. The bandwidth for these images is reduced by making the images smaller so that fewer bits are needed for each frame, or by slowing down the frame rate so that the images appear to move discontinuously.
High definition television (HDTV) attempts to simplify the display of video images and reduce bandwidth by recognizing constant areas within a video picture and retaining much of the information from a previous frame. While HDTV developed concurrently with MPEG and JPEG, HDTV is broadcast-oriented and does not lend itself to network transmission or personal computer applications.
It is expected that bandwidth will continue to be the bottleneck in network transmission for the foreseeable future. Thus, there is an outstanding need in the prior art to be able to send professional quality video images across networks through ordinary modems by taking advantage of plenary memory and processor capacities within personal computers, and thereby reducing reliance on transmission hardware. There is also a need to create compelling new video experiences in personal computers.
The present invention is concerned with client-side production in a personal computer environment of low bandwidth images and audio. A series of still images in an image module along with a “script” module and an audio module are sent over a network in a client/server architecture or are read from a compact disk or other memory. A “director” module residing in memory (e.g. on hard disk) of the client personal computer uses the “script” to tell the computer how to execute a sequence of “moves” on the still images. These moves include, but are not limited to, cuts, dissolves, fades, wipes, focuses, flying image planes, and digital video effects such as push and pull. The director module is either downloaded from a network on a one-time basis or uploaded from a floppy or compact disk.
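The relationship between the script and the director can be illustrated with a minimal sketch. It is not the format or method of the invention: the `Move` record, its field names, and the `render` function are hypothetical, stills are reduced to flat lists of pixel intensities, and only two of the listed moves (cut and dissolve) are modeled. The point is that a few small records plus the stills are enough for the client to synthesize many output frames locally.

```python
from dataclasses import dataclass


@dataclass
class Move:
    """One script entry (hypothetical layout, for illustration only)."""
    kind: str     # "cut" or "dissolve"
    source: int   # index of the outgoing still
    target: int   # index of the incoming still
    frames: int   # number of output frames this move produces


def render(stills, script):
    """Yield output frames by applying each scripted move to the stills."""
    for move in script:
        outgoing, incoming = stills[move.source], stills[move.target]
        for i in range(move.frames):
            if move.kind == "cut":
                # A cut shows the incoming still immediately.
                yield list(incoming)
            elif move.kind == "dissolve":
                # A dissolve blends linearly from outgoing to incoming.
                t = (i + 1) / move.frames
                yield [round((1 - t) * a + t * b)
                       for a, b in zip(outgoing, incoming)]
            else:
                raise ValueError(f"unsupported move: {move.kind}")


# Two tiny two-pixel "stills" and a two-move script.
stills = [[0, 0], [100, 200]]
script = [Move("dissolve", 0, 1, 4), Move("cut", 1, 0, 1)]
frames = list(render(stills, script))
```

Here five output frames are produced from two stills and two script entries, so only the stills and the script would need to cross the network.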
Production sequences are in real time, as well as being relatively smooth and continuous as compared to prior art network video. In order to permit viewing as soon as possible and to avoid caching, the script module is transmitted to the personal computer along with preliminary images, so playback begins immediately. Low bandwidth is achieved because a majority of the production is done at the client location and the transmission of still pictures, audio data and script is relatively rapid. Images are always displayed in real time and in full screen formats. If necessary to prevent latency delays, the director module inserts stand-ins from stock footage, animation and loops so that a viewer always has a continuous visual and audio experience.
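The stand-in behavior can be pictured with a small sketch. It assumes, purely for illustration, that each scheduled slot names one still by an identifier and that the client keeps a set of identifiers already received; the function name and data shapes are hypothetical and not taken from the invention.

```python
from itertools import cycle


def playback(schedule, arrived, stand_ins):
    """Return what is shown in each scheduled slot.

    schedule  -- still identifiers in the order the script calls for them
    arrived   -- set of identifiers already downloaded to the client
    stand_ins -- locally stored filler items (stock footage, loops)
    """
    filler = cycle(stand_ins)  # reuse local filler items as often as needed
    shown = []
    for still_id in schedule:
        if still_id in arrived:
            shown.append(still_id)      # the real still is ready
        else:
            shown.append(next(filler))  # cover the gap with a stand-in
    return shown


shown = playback(["a", "b", "c", "d"], {"a", "c"}, ["loop1", "loop2"])
```

In this example the slots for the late stills `b` and `d` are filled from local material, so the sequence never stalls while waiting on the network.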