During the 1970's and 1980's, the defense industry encouraged and developed an interconnecting network of computers as a backup for transmitting data and messages in the event that established traditional methods of communication failed. University mainframe computers were networked in the original configurations, with many other sources being added as computers became cheaper and more prevalent. With a loose interconnection of computers hardwired or telephonically connected across the country, the defense experts reasoned that many alternative paths for message transmission would exist at any given time. In the event that one message path was lost, an alternative message path could be established and utilized in its place. Hence, it was the organized and non-centralized qualities of this communications system that made it appealing to the military as a backup communication medium. If any one computer or set of computers was attacked or disconnected, many other alternative paths could eventually be found and established.
This interconnection of computers has since been developed by universities and businesses into a worldwide network that is presently known as the Internet. The Internet, as configured today, is a publicly accessible digital data transmission network that is primarily composed of terrestrial communications facilities. Access to this worldwide network is relatively low cost and hence, it has become increasingly popular for such tasks as electronic mailing and Web page browsing. Both such functions are batch or file transfer oriented. Electronic mail, for instance, allows a user to compose a letter and transmit it over the Internet to an electronic destination. For one-way Internet transfers such as e-mail, it is relatively unimportant how long each file transfer takes as long as it is reasonable. Messages are routed through no fixed path, but rather through various interconnected computers, until they reach their destination. During periods of heavy message flow, messages will be held at various internal network computers until the pathways are cleared for new transactions. Accordingly, Internet transmissions are effective for one-way transfers, but cannot be relied upon for time-sensitive, high priority applications.
Web pages are collections of data including text, audio, video, and interlaced computer programs. Each web page has a specific electronic site destination that is hosted on a device known as a web server, and can be accessed by anyone via the Internet. Web page browsing allows a person to inspect the contents of a web page on a remote server to glean various information contained therein, including for instance product data, company backgrounds, and other such information which can be digitized. The remote server data is accessed by a local browser, and the information is displayed as text, graphics, audio, and video.
The web browsing process, therefore, is a two-way data communication between the browsing user, who has a specific electronic address or destination, and the web page, which also has a specific electronic destination. In this mode of operation, as opposed to electronic mail functions, responsiveness of the network is paramount since the user expects a quick response to each digital request. As such, each browsing user establishes a two-way data communication, which ties up an entire segment of bandwidth on the Internet system.
Recent developments on the Internet include telephone, videophone, conferencing and broadcasting applications. Each of these technologies places a similar real-time demand on the Internet. Real-time Internet communication involves a constant two-way throughput of data between the users, and the data must be received by each user nearly immediately after its transmission by the other user. However, the original design of the Internet did not anticipate such real-time data transmission requirements. As such, these new applications have serious technical hurdles to overcome in order to become viable.
Products that place real-time demands on the Internet will be aided by the introduction of an updated hardware interconnection configuration, or “backbone,” which provides wider bandwidth transmission capabilities. For instance, the MCI backbone was recently upgraded to 622 megabits per second. Regardless of such increased bandwidth, the interconnection configuration comprises various routers, which may still not be fast enough and can therefore significantly degrade the overall end-to-end performance of both one-way, and particularly two-way, traffic on the Internet. Moreover, even with a bandwidth capability of 622 megabits per second, the Internet backbone can carry at most the following amounts of data: 414 data streams of 1.5 Mbps; 4,859 data streams of 128 Kbps; 21,597 data streams of 28.8 Kbps; or combinations thereof. While this is anticipated as being sufficient by various Internet providers, it is likely to quickly prove inadequate for near-future applications.
Internal networks, or Intranet sites, might also be used for data transfer and utilize the same technology as the Internet. Intranets, however, are privately owned and operated and are not accessible by the general public. Message and data traffic in such private networks is generally much lower than in more crowded public networks. Intranets are typically much more expensive for connect time, and therefore any related increase in throughput comes at a significantly higher price to the user.
To maximize accessibility of certain data, broadcasts of radio shows, sporting events, and the like are currently provided via Internet connections whereby the broadcast is accessible through a specific web page connection. However, as detailed above, each web page connection requires a high throughput two-way connection through the standard Internet architecture. A given Internet backbone will be quickly overburdened with users if the entire set of potential broadcasters across the world began to provide broadcast services via such web page connections. Such broadcast methods through the Internet have thereby proven to be ineffective given the two-way data throughput needed to access web pages and real-time data.
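The stream counts quoted above can be checked with simple integer division. The following is a minimal sketch, assuming the 622 figure is read as megabits per second (the reading under which the quoted counts come out) and ignoring protocol overhead and router limits; the function name `max_streams` is hypothetical and chosen only for illustration.

```python
# Illustrative arithmetic only: how many constant-rate streams fit in a link.
# Assumption: the backbone rate is 622 megabits per second, with no overhead.
BACKBONE_BPS = 622_000_000


def max_streams(link_bps: int, stream_bps: int) -> int:
    """Return the whole number of streams of rate stream_bps that fit in link_bps."""
    if stream_bps <= 0:
        raise ValueError("stream rate must be positive")
    return link_bps // stream_bps


if __name__ == "__main__":
    for label, rate_bps in [
        ("1.5 Mbps", 1_500_000),
        ("128 Kbps", 128_000),
        ("28.8 Kbps", 28_800),
    ]:
        print(f"{label}: {max_streams(BACKBONE_BPS, rate_bps)} streams")
```

Under these assumptions the three results are 414, 4859, and 21597 streams, matching the figures given in the text.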
There is an enormous demand for the delivery of large amounts of content to a large number of listeners. The broadcast channels of today, such as radio and TV, can only deliver a small number of channels to a large number of listeners. Their delivery mechanism is well known to customers. The broadcaster transmits programs and the listener must “tune in” at the proper time and channel to receive the desired show.
For example, “on demand” systems have been attempted by the cable industry. Such systems attempt to transport the program or show from a central repository (server) to the user (client) in response to his/her request. To initiate the request, the user selects from a list of candidate programs and requests that the system deliver the selected program.
The foregoing “on demand” model of content delivery places two significant requirements on the delivery system. First, there should be a direct connection between each content storage device (server) and each listener (client). The phone system is an example of such a point-to-point interconnection system. Another example of such an interconnection system is the Internet, which is also largely based on the terrestrial telecommunications networks. Second, the server must be capable of delivering all the programs to the requesting clients at the time at which each client demands the programming.
With the advent of digital video products and services, such as Digital Satellite Service (DSS) and storage and retrieval of video streams on the Internet and, in particular, the World Wide Web, digital video signals are becoming ever present and drawing more attention in the marketplace. Because of limitations in digital signal storage capacity and in network and broadcast bandwidth, compression of digital video signals has become paramount for digital video storage and transmission. As a result, many standards for compression and encoding of digital video signals have been promulgated. For example, the International Telecommunication Union (ITU) has promulgated the H.261 and H.263 standards for digital video encoding. Additionally, the International Organization for Standardization (ISO), through its Moving Picture Experts Group (MPEG), has promulgated the MPEG-1 and MPEG-2 standards for digital video encoding.
These standards specify with particularity the form of encoded digital video signals and how such signals are to be decoded for presentation to a viewer. However, significant discretion is left as to how the digital video signals are to be transformed from a native, uncompressed format to the specified encoded format. As a result, many different digital video signal encoders currently exist and many approaches are used to encode digital video signals with varying degrees of compression achieved.
In general, greater degrees of compression are achieved at the expense of video image signal loss and higher quality motion video signals are achieved at the expense of lesser degrees of compression and thus at the expense of greater bandwidth requirements. It is particularly difficult to balance image quality with available bandwidth when delivery bandwidth is limited. Such is the case in real-time motion video signal delivery such as video telephone applications and motion video on demand delivery systems. It is generally desirable to maximize the quality of the motion video signal as encoded without exceeding the available bandwidth of the transmission medium carrying the encoded motion video signal. If the available bandwidth is exceeded, some or all of the sequence of video images are lost and, therefore, so is the integrity of the motion video signal. If an encoded motion video signal errs on the side of conserving transmission medium bandwidth, the quality of the motion video image can be compromised significantly.
The format of H.263 encoded digital video signals is known and is described more completely in “ITU-T H.263: Line Transmission of Non-Telephone Signals, Video Coding for Low Bitrate Communication” (hereinafter “ITU-T Recommendation H.263”), incorporated by reference herein in its entirety. Briefly, in H.263 and other encoded video signal standards, a digital motion video image signal, which is sometimes called a video stream, is organized hierarchically into groups of pictures, which include one or more frames, each of which represents a single image of a sequence of images of the video stream. Each frame includes a number of macroblocks that define respective portions of the video image of the frame. An I-frame is encoded independently of all other frames and therefore represents an image of the sequence of images of the video stream without reference to other frames. P-frames are motion-compensated frames and are therefore encoded in a manner that is dependent upon other frames. Specifically, a P-frame is a predictively motion-compensated frame and depends only upon one I-frame or, alternatively, another P-frame which precedes the P-frame in the sequence of frames of the video image. The H.263 standard also describes PB-frames; however, for the purposes of description herein, a PB-frame is treated as a P-frame.
All frames are compressed by reducing redundancy of image data within a single frame. Motion-compensated frames are further compressed by reducing redundancy of image data within a sequence of frames. Since a motion video signal includes a sequence of images, which differ from one another only incrementally, significant compression can be realized by encoding a number of frames as motion-compensated frames, i.e., as P-frames. However, errors from noise introduced into the motion video signal or artifacts from encoding of the motion video signal can be perpetuated from one P-frame to the next and therefore persist as a rather annoying artifact of the rendered motion video image. It is therefore desirable to periodically send an I-frame to eliminate any such errors or artifacts. Conversely, I-frames require many times more bandwidth, e.g., on the order of ten times more bandwidth, than P-frames, so encoding I-frames too frequently consumes more bandwidth than necessary. Accordingly, determining when to include an I-frame, rather than a P-frame, in an encoded video stream is an important consideration when maximizing video image quality without exceeding available bandwidth.
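The trade-off between I-frame frequency and bandwidth can be made concrete with a small calculation. The following is a minimal sketch, assuming every P-frame has the same size and every I-frame is a fixed multiple of that size (ten times by default, following the order of magnitude mentioned above); the function name, parameters, and sample numbers are hypothetical and do not come from any encoder.

```python
def average_bitrate(
    p_frame_bits: float,
    frame_rate: float,
    i_frame_interval: int,
    i_to_p_ratio: float = 10.0,
) -> float:
    """Average bits per second when one frame in every i_frame_interval is an I-frame.

    Assumes constant P-frame size and an I-frame that is i_to_p_ratio times larger.
    """
    if i_frame_interval < 1:
        raise ValueError("i_frame_interval must be at least 1")
    i_frame_bits = i_to_p_ratio * p_frame_bits
    # One I-frame followed by (interval - 1) P-frames forms one repeating group.
    bits_per_group = i_frame_bits + (i_frame_interval - 1) * p_frame_bits
    return bits_per_group / i_frame_interval * frame_rate


if __name__ == "__main__":
    # Hypothetical stream: 1,000-bit P-frames at 10 frames per second.
    for interval in (1, 10, 100):
        rate = average_bitrate(1_000, 10, interval)
        print(f"I-frame every {interval} frames: {rate:.0f} bits per second")
```

In this simplified model the average rate falls toward the all-P-frame rate as the interval grows, which is why sending I-frames only as often as needed to clear accumulated errors conserves bandwidth.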
Another important consideration when maximizing video image quality within limited signal bandwidth is the compromise between image quality of and bandwidth consumed by the encoded video signal as represented by an encoding parameter λ. In encoding a video signal, a particular value of encoding parameter λ is selected as a representation of a specific compromise between image detail and the degree of compression achieved. In general, a greater degree of compression is achieved by sacrificing image detail, and image detail is enhanced by sacrificing the degree of achievable compression of the video signal. In the encoding standard H.263, a quantization parameter Q effects such a compromise between image quality and consumed bandwidth by controlling a quantization step size during quantization in an encoding process.
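How a quantization step size trades detail for compression can be illustrated with a generic uniform quantizer. The following is a minimal sketch and not the quantization procedure specified by H.263: the step size is simply assumed to be twice the parameter Q, values are truncated toward zero, and the function names and sample coefficients are hypothetical.

```python
def quantize(coefficients, q):
    """Map each coefficient to an integer level using a step of 2 * q (an assumption)."""
    step = 2 * q
    # int() truncates toward zero, so small coefficients collapse to level 0.
    return [int(c / step) for c in coefficients]


def dequantize(levels, q):
    """Reconstruct approximate coefficients from integer levels."""
    step = 2 * q
    return [level * step for level in levels]


if __name__ == "__main__":
    sample = [100, -37, 12, 5, -3, 1]  # hypothetical transform coefficients
    for q in (2, 10):
        levels = quantize(sample, q)
        restored = dequantize(levels, q)
        nonzero = sum(1 for level in levels if level != 0)
        worst = max(abs(a - b) for a, b in zip(sample, restored))
        print(f"Q={q}: levels={levels}, nonzero={nonzero}, worst error={worst}")
```

With the larger Q, more levels become zero, which leaves less to transmit, while the reconstruction error grows; this is the compromise between image quality and consumed bandwidth described above.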
However, a particular value of encoding parameter λ that is appropriate for one motion video signal can be entirely inappropriate for a different motion video signal. For example, motion video signals representing a video image which changes only slightly over time, such as a news broadcast (generally referred to as “talking heads”), can be represented by relatively small P-frames since successive frames differ relatively little. As a result, each frame can include greater detail at the expense of less compression of each frame. Conversely, motion video signals representing a video image that changes significantly over time, such as fast motion sporting events, require larger P-frames since successive frames differ considerably. Accordingly, each frame requires greater compression at the expense of image detail.
Determining an optimum value of encoding parameter λ for a particular motion video signal can be particularly difficult. Such is especially true for motion video signals that include both periods of little motion and periods of significant motion. For example, a motion video signal representing a football game includes periods where both teams are stationary awaiting the snap of the football from the center to the quarterback and periods of sudden extreme motion. Selecting a value of encoding parameter λ which is too high results in sufficient compression that frames are not lost during high motion periods but also in unnecessarily poor image quality during periods where players are stationary or moving slowly between plays.
Conversely, selecting a value of encoding parameter λ that is too low results in better image quality during periods of low motion but likely results in loss of frames due to exceeded available bandwidth during high motion periods.
A third factor in selecting a balance between motion video image quality and conserving available bandwidth is the frame rate of the motion video signal. A higher frame rate, i.e., more frames per second, provides an appearance of smoother motion and a higher quality video image. At the same time, sending more frames in a given period of time consumes more of the available bandwidth. Conversely, a lower frame rate, i.e., fewer frames per second, consumes less of the available bandwidth but provides a motion video signal which is more difficult for the viewer to perceive as motion between frames and, below some threshold, the motion video image is perceived as a “slide show,” i.e., a sequence of discrete, stilted, photographic images. However, intermittent loss of frames resulting from exceeding the available bandwidth as a result of using an excessively high frame rate provides a “jerky” motion video image which is more annoying to viewers than a regular, albeit low, frame rate.
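The preference for a regular, lower frame rate over an excessive one that drops frames can be sketched as a simple selection rule. The following is a minimal sketch under stated assumptions: the largest expected frame size is known in advance, a frame rate is acceptable only if that worst case fits the available bandwidth, and the candidate rates, function name, and sample figures are all hypothetical.

```python
def sustainable_frame_rate(bandwidth_bps, peak_frame_bits, candidates=(30, 15, 10, 5)):
    """Return the highest candidate frame rate whose worst-case demand fits the bandwidth.

    Returns None when even the lowest candidate would exceed the bandwidth.
    """
    for fps in sorted(candidates, reverse=True):
        if fps * peak_frame_bits <= bandwidth_bps:
            return fps
    return None


if __name__ == "__main__":
    # Hypothetical stream whose largest frames are 10,000 bits each.
    for link_bps in (128_000, 28_800):
        print(link_bps, "bits per second ->", sustainable_frame_rate(link_bps, 10_000))
```

Choosing the rate from the worst-case frame size is deliberately conservative: it gives up some smoothness during quiet periods in exchange for avoiding the intermittent frame loss that produces a jerky image.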
I-frame placement and encoding parameter λ value selection combine to represent a compromise between motion video image quality and conservation of available bandwidth. However, to date, conventional motion video encoders have failed to provide satisfactory motion video image quality within the available bandwidth.
The shortcomings discussed above have significantly reduced the quality and effectiveness of audio/visual Internet transmissions and prevented the widespread application of these technologies. Accordingly, it would be desirable to have systems and methods that allow for high quality two-way transmission of audio and video signals while minimizing bandwidth usage.
Presently, many shortcomings are apparent with current video-conferencing technologies such as Microsoft's NetMeeting®. This program requires the consumer to go to a site and download the software, which takes an estimated one hour at average connect speed via analog modem. The consumer must then follow a series of steps, double clicking and providing technical information about their system, their method of connection, and where they wish to connect. Upon connecting, they must establish a room, share that room's address and password/user names with the other conference participants, and then engage in the conferencing. This all assumes that the parties can coordinate their efforts using the same platform.
If successful, the conference has at best mediocre to low quality video and irritating, unfiltered audio with echoing tendencies, and it is limited to one viewer and one producer. Moreover, the software is limited solely to video-conferencing use. In addition, if the consumer has no microphone or camera, they cannot utilize the software. Finally, the session occurs with no regulation, control, or direction.
Another area that has suffered as a result of the shortcomings inherent in present audio and visual transmission technologies is the field of online education. Presently, only a few universities are using online education, and only in a limited capacity. Duke University, for example, has a Master of Business Administration (MBA) program that is exclusively online. The University of Phoenix is also using online education. However, at the moment, previously taped “non-interactive” video lectures are all that can be viewed by the students. Homework assignments can be downloaded from the school website, prepared by the student and then emailed to the professor. Students can also enter chat rooms and ask questions of their professors.
Unfortunately, this is the only method by which online education exists today. Although there are many benefits to online education and the institutions implementing this current system have had a great response, such systems are inadequate because of the lack of student/teacher interaction.
An improvement in interactive audio and video transmission would also prove advantageous in the field of airline security. The FAA, following the terrorist attacks of September 11, 2001, has requested security methods or systems that would enable multiple government agencies to simultaneously view the cockpit and the interior of the fuselage of an aircraft in “real time.”
From the above, it can be seen that there is a great need for a high quality and high speed means of providing interactive audio and video transmission between remote locations. As explained below, the present invention solves this need as well as other shortcomings of prior systems.