The present invention relates to a general purpose system architectural method for multimedia communications. The object of this invention is to improve the quality and efficiency for human communications. Our architectural method allow for the access of a plurality of computing, consumer, and communication equipment, e.g., PC an workstations, camera, television, VCR, telephone, etc., and allow for conveying multiple types of media information, e.g., sound, image, animated graphics, and live video. Despite of the real-time constraints and resource limitation to store, retrieve, and exchange these massive media data information, an efficient architectural method was invented to make multimedia communications system a final reality.
This invention is dedicated to the specific application of teleconferencing. However, orientation of the system to different class of tasks involves no significant redesign, but primarily involves changes on the host computer programs, system hardware, and communications subsystems.
This invention relates to a general purpose architectural method suitable for most conceivable combinations for multimedia communications. PC workstations are widely available at most offices and homes today, yet due to their processing and storage limitations, they are never considered for complex image/live video applications. Alternatively, existing methods employee single media communications. Namely, telephone for human voice communications, fax for text communications, or PC workstations for data communications. Noticeably all of these single-media communications use existing analog telephone lines connecting through the central office (CO) switch, only one of the media types can be selected at a time, and the fax and F20 use dial-up modem for analog transmission of the digital data. Meanwhile, various coding techniques are available today so that source media (image, live video, sound, and animated graphics) can be reduced (coded or compressed) into lesser quantity to ease the storage and transmission constraint, and the destination media can be restored (decoded or decompressed) and playback without quality degradation, then such digital coded media information can find wide applications for remote database retrieval, teleconferencing, messaging, distance education and other applications to complement traditional single media (voice, data, and text) communications.
We now turn to the reviewing of existing product and patent. Various single-media codec (compression and decompression) techniques has matured in recent years to allow the high reduction (compression) of the source media and the quality playback (decompression) of the destination media. Individual international standards (CCITT and ISO) will soon be established to facilitate the worldwide communications of still image, quality sound, live video, and animated graphics. However the multimedia products we have searched to-date are either video conferencing systems (i.e. CLI, PictureTel) using dedicated systems and complex algorithms for quality video and audio only, or incorporate desktop PC workstation for a one-way, decode only (playback and display) mixed media presentation (DVI, CDI et.al). Videophones (Sony, Panasonic, et.al.) have been the only communications product which utilize real-time coder and decoder for image and voice transmission through traditional analog or digital transmission, However, their quality are poor, and effects are limited. In conclusion, the prior arts involve either real-time playback of the precoded compressed data (live video, sound, and graphics) for a multimedia presentation, or the real time coding and decoding of live video and voice for a live conferencing applications.
Accordingly, we feel it is superior to provide digital media communications in conjunction with the traditional voice and data communications because it combines the use of live video, graphics, and audio media, therefore make up a much more effective means for human to communicate with each other. Since xe2x80x9csingle picture worths a thousand wordsxe2x80x9d, it is conceivable that pictorial information such as image and live video can definitely enhance and complement the traditional communications.
An object of the present invention is to allow for PC/WS (PC or workstation) as a single platform technology and to define an integrated architectural method which accommodate communications (remote transmission and retrieval) for all types of digital coded (compressed) multiple-media information.
Another object of the present invention is to provide a flexible architecture which allow for management and control of the variable communications bandwidth and address the flexible combinations of the digital coded mutiple-media information for a wide variety of application requirements. Some of the applications examples are distance education (teaching and learning), teleconferencing, messaging, videophone, video games, cable TV decoders, and HDTV.
Still another object of the present invention is the application of digital coding techniques for reducing the storage and transmission requirements for multiple media information, we also suggest the conversion of digital compressed media to analog form for convenient interface with the traditional analog storage or transmission techniques.
Still another object of the present invention is the combinatorial use of animated graphics and motion estimation/compensation for regeneration of the live video. Namely, animated graphics techniques will be applied for the playback of estimated motion effects.
Still another object of the present invention is the interactive use of multiple media types. Namely, the user has the control to program and select the appropriate media combination for specific application needs either before or during the communications session. For examples, the user can decide to select the live video with voice quality audio before the session starts, but during the session, he can choose instead to use the high quality audio with slow motion and still freeze pictures for more effective communications.
Still another object of the present invention is to leverage with all of the available international standard codec technologies, and evolve into a human interactive communications model, and conclude with a low cost, high quality, highly secured, interactive, yet flexible, and user friendly method for desktop, handheld, or embedded media communications.
Still another object of the present invention is to provide cost effective method for transmission bandwidth and local storage. Coding techniques have been used to conserve storage and transmission bandwidth since the media information data can be greatly reduced. These coded information still preserve the original quality and allow for presentation at selective quality levels at users request. Since these information are coded according to selective algorithms, without the corresponding decoder, information can not be properly decoded and used, this allow for high degree of security for special applications.
Still another object of the present invention is to provide implementation for selecting one of a plurality of multiple quality levels for live video, graphics, audio, and voice. Depending on the application requirement, user can select the appropriate media quality as desired. For example, high quality audio and high quality image and graphics may be suitable for collage education, voice combine with live video will be suitable for K-12 education, face to face video and voice will be effective for business negotiations.
Still another object of the present invention is to conserve transmission bandwidth, still image can be blended with locally generated live background video or animated graphics. User can instaneously adjust the quality levels during the sessions to make the meeting or presentation more effective.
The significant difference between our process and the traditional video conferencing is that only photo images of the conferees (talking heads) have been shown on a traditional video conferencing/videophone setup. In our method, the conferees are allowed to substitute the conferee photo images with other important pictorial information retrievable form the database and present (broadcast) to others for better illustrations. The conferees also have the control to select the appropriate quality level that he or she wants in order to conserve bandwidth. As an example, for a product presentation, it is better to provide coarse quality live video with high fidelity audio as a introduction. Once specific interests are generated, fine quality video without audio can be presented to facilitate further discussions. The other example is an international meeting while different languages are used, live video can always make ease the verbal explanation, and quality audio can harmonize the atmosphere during tense moments. To further conserve the bandwidth, live coarse video can overlay with locally generated fine quality still background image to provide acceptable video presentation (Notice that the fine quality video will be locally generated therefore doesn""t consume any communications bandwidth). Finally since all coded multimedia information will require proper decoder to expand back to the original presentable forms, therefore it is highly secured, furthermore, different security level can be assigned to each conferee, therefore appropriate information will only be shown to various audience without any concerns on security.
Finally, television only facilitate an traditional analog video and audio session, since it is one-way non-interactive communication, receiver can only observe and listen, they can not make comments or edit (remark) a media message, not to mention the ability to control (select and edit) the appropriate media massage and return to the sender. These interactive capabilities will be extremely beneficial for distance learning, or remote classroom applications.