This invention is directed to a multimedia conferencing system, and is particularly directed to a central processing hub, sometimes referred to as a digital media forum, whose purpose is to receive video and audio data signals from remote user terminals, to process them, and to deliver composite video and audio signals to the remote user terminals. The criteria which govern the nature of the composite video and audio signals which are delivered to any remote user terminal may vary from one remote user terminal to another, and they may vary from time to time with respect to any remote user terminal.
A multimedia conferencing system is one which utilizes a variety of media types and sources, particularly utilizing live or real-time video and audio sources from a plurality of remote users. Those remote users may be geographically scattered very widely, ranging from being located in different areas of the same office building, to different cities, and even to different continents.
In order to maintain a multimedia conferencing system, a central processing hub is required, which must function as a multipoint control unit. This enables a plurality of participants to conduct a multi-party multimedia conference.
A multimedia conference will comprise at least two participants, and may extend to a great many participants. The total number of participants in any particular multimedia conference is dynamically configurable, as discussed in detail hereafter, and is limited only by the particular hardware configuration being employed. However, one aspect of the present invention, as will be described hereafter, is the fact that the hardware configuration may itself be dynamically configurable. Moreover, a plurality of substantially identical central processing hubs may be cascaded one to another, as described hereafter.
In keeping with a particular aspect of the present invention, each participant in any multimedia conference may utilize different video, audio, and data compression technology than any other participant, they may use different multimedia control protocols than any other participant, and they may even communicate within the dynamically configured multimedia conference using different transmission rates and different network protocols.
Accordingly, the present invention provides a platform upon which there may be established inter-operability between disparate multimedia network types, and inter-operability between different multimedia terminal types, along with multi-party multimedia communications.
As will be described in greater detail hereafter, the central processing hub of the present invention provides a multimedia platform which will support a family of products that meet the communication requirements noted above. At the core of the central processing hub, there is a fully redundant backplane, having regard to the system architecture, which provides high-speed media and packet buses. These buses allow for high-speed switching and interconnection with other central processing hubs as may be required. Connected to the media and packet buses are a plurality of line cards, again having regard to the system architecture, which provide for a variety of functions that are necessary in any multimedia conferencing system, including media processing, video encoding, shelf control, bus control, line interface requirements, and so on. Such architecture is described in greater detail hereafter, along with detailed discussion of various ones of the line cards which are employed.
When multimedia conferencing occurs, multiple remote sites can participate in live, real-time, multi-party multimedia conferences. With collage conferencing, a video collage is assembled at the central processing hub and forwarded or transmitted to the various participants in the then ongoing multimedia conference. As will be discussed hereafter, the video collage which is sent to various participants may differ from one participant or remote user site to another participant or remote user site. Generally, any multimedia conference is controlled by a so-called session manager. However, as will be noted hereafter, the session manager is not necessarily an individual person; it may be an intelligent network or a personal computer which operates in keeping with certain predetermined criteria to establish the nature of the video and audio signals which are delivered to the remote user terminals.
A multimedia conferencing system in keeping with the present invention, as described hereafter, will support many individual input streams, which may have varying speeds and protocols. Video pre-processing may be required, including scaling depending on the protocol being used. Video post-processing will include creation of a collage, whereby various video images may be placed in different positions, each video image having a controlled size, which may vary from image to image within the collage, and which may vary from time to time with respect to any and all images being presented.
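By way of illustration only, the collage post-processing described above can be sketched as scaling each participant's image and placing it at a configured position on a shared output canvas. The function names and the representation of frames as plain 2-D lists of pixel values are hypothetical conveniences, not taken from the specification:

```python
# Illustrative sketch of video collage composition (hypothetical names):
# each participant's frame is scaled to a controlled size and placed at
# a configured position on the output canvas.

def scale_nearest(frame, out_h, out_w):
    """Nearest-neighbour scaling, standing in for the hub's video scaler."""
    in_h, in_w = len(frame), len(frame[0])
    return [[frame[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)]
            for r in range(out_h)]

def compose_collage(canvas_h, canvas_w, layout):
    """layout: list of (frame, top, left, height, width) placements.
    Later placements overwrite earlier ones, as in a tiled collage."""
    canvas = [[0] * canvas_w for _ in range(canvas_h)]
    for frame, top, left, h, w in layout:
        tile = scale_nearest(frame, h, w)
        for r in range(h):
            for c in range(w):
                canvas[top + r][left + c] = tile[r][c]
    return canvas
```

Because the layout list is just data, the session manager's ability to vary each image's size and position from time to time corresponds to supplying a different layout for each composed frame.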
Audio pre-processing may occur, including adjusting and controlling the volume for each participant. Also, audio post-processing may occur, particularly in such a manner as described hereafter whereby the audio signal which is sent to any participant will be processed in such a manner that the participant will not receive an audio signal containing his or her own audio input.
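Removing each participant's own contribution from the mix returned to that participant is commonly called a "mix-minus" bridge. A minimal numeric sketch, with hypothetical names and audio samples reduced to plain integers:

```python
def mix_minus(samples):
    """Mix-minus audio bridging sketch (hypothetical, not the hub's API).

    samples: dict mapping each participant to that participant's current
    audio sample.  Returns, for each participant, the sum of everyone
    else's audio, i.e. the full mix minus their own contribution.
    """
    total = sum(samples.values())
    return {p: total - s for p, s in samples.items()}
```

Computing one total and subtracting each participant's own sample keeps the cost linear in the number of participants, rather than re-summing the conference once per participant.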
As indicated previously, and as will be discussed in greater detail hereafter, central processing hubs in keeping with the present invention are each such as to include a high-speed backplane which may be connected one to another so as to be cascaded. Moreover, a cascaded plurality of central processing hubs will function as if it were a single large-scale processing hub.
The present invention provides a multimedia conferencing system whereby a number of different providers, each of which may operate a proprietary network protocol or protocols, may be interlinked one with another through the central processing hub. Accordingly, the present invention will provide a platform for a conferencing system including a management node and a central processing hub by which gateway and multipoint control are provided. By providing appropriate functionality and management control software for the various functional units, line cards, and backplane circuitry included in a central processing hub in keeping with the present invention, the precise nature of the central processing hub is essentially transparent, or not noticeable, to networks, including various service providers who may deliver multimedia conferencing video and audio data signals to the central processing hub. Thus, the various service providers may invest their resources in delivering video and audio content in keeping with their own transmission protocols, rather than having to satisfy specific input protocols as is generally the case in the industry prior to the present invention having been developed.
A typical patent which describes prior art video conferencing systems is LUKACS U.S. Pat. No. 5,737,011, which teaches a video conferencing system which is said to be infinitely expandable, and which is a real-time conferencing system. In this patent, each of the conference participants has the ability to customize their own individual display of other participants, using a chain of video composing modules which can be expanded so as to combine video signal streams from any number of conference participants in real time. Different media types may be associated through appropriate software and manipulated for multimedia uses. The Lukacs system is such as to allow each individual user to dynamically change who can receive the information that they provide to the conference.
ELY et al. U.S. Pat. No. 5,796,424 describes a system and method for providing video conferencing services where a broadband switch network, a broadband session controller, and a broadband service control point are provided. Here, connections are provided between information senders and receivers in response to instructions from the broadband service control point or in response to requests which are originated by any remote information sender/receiver. The broadband service control point provides processing instructions and/or data to the broadband session controller and to each remote sender/receiver. The system is particularly directed to video-on-demand utilization. Whenever a user requires a video from a video information provider, the broadband session controller establishes communication between the set top controller at the remote user's location and the video information provider, requesting processing information from the broadband service control point in response to predetermined triggers. A broadband connection between a video information provider and a specific user is established under control of the broadband session controller. If the system is to be used in video conferencing, the set top controller will control cameras, microphones, and so on. Telephone services may also be provided over the same integrated network.
The present invention provides a multimedia conferencing system and, in particular, a central processing hub therefor. The multimedia conferencing system comprises the central processing hub and a plurality of remote user terminals; and each of the remote user terminals at least comprises means for sending video data signals and audio data signals to the central processing hub, and means for receiving video data signals and audio data signals from the central processing hub. Under the scheme of the present invention, the central processing hub receives the video and audio data from each of the plurality of remote user terminals, processes the received video data and audio data, and returns a video data signal and an audio data signal to each of the remote user terminals which includes video data and audio data, respectively, from at least one of the plurality of remote user terminals.
The central processing hub comprises a media bus whose purpose is to handle video and audio data signals within the central processing hub. The media bus can accommodate real-time distribution of media types such as compressed or uncompressed digital video data and audio data. A packet bus is also provided, whose purpose is to handle data and control signals within the central processing hub, where the data or control signals are sent in blocks or packets of data.
A shelf controller card is included in the central processing hub, for issuing control messages to control the operation of the central processing hub in keeping with incoming management signals which are delivered directly to the shelf controller card. A bus controller card is also provided so that at least clock signals and bus arbitration signals are generated and distributed within the central processing hub.
A further card included in the central processing hub is at least one physical line interface card, whose purpose is to provide the physical interface port or ports for the central processing hub. The physical interface card may also provide data link layer functions.
At least one media processor card is provided for processing video and audio data signals within the central processing hub. Thus, most of the multimedia processing for the media conferencing system is carried out in the media processor card. In the egress direction, the media processor card receives data from the physical line interface card, reassembles or defragments the data, demultiplexes it as necessary, decodes the data, pre-processes and bridges audio and video streams. In the ingress direction, the media processor card receives compressed video data in the form of transport packets from the video encoding means, compresses bridged audio, multiplexes the audio with the video, segments or fragments the data, and sends the resulting cells or frames to the physical line interface card.
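The egress path just described is an ordered sequence of processing stages. As a sketch only (the stage names below are descriptive stand-ins for the behaviour described above, not the card's actual firmware interface), the pipeline can be modelled as a chain of callables applied in order:

```python
# Sketch of the egress processing chain on a media processor card.
# Each stage is a callable; real stages would transform media data,
# but these stubs simply record the order in which they run.

def stage(name):
    """Return a stub stage that appends its name to a trace list."""
    return lambda trace: trace + [name]

def run_pipeline(payload, stages):
    """Apply each processing stage to the payload, in order."""
    for s in stages:
        payload = s(payload)
    return payload

# Hypothetical stage names mirroring the egress description above:
egress_stages = [stage("reassemble"), stage("demultiplex"),
                 stage("decode"), stage("preprocess"), stage("bridge")]
```

The ingress direction would be modelled the same way with the stages reversed in character: compress, multiplex, segment, and hand off to the physical line interface card.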
Means are provided for video encoding, and the video encoding means receives video data from each of the at least one media processor card and delivers video data signals to the packet bus. The means for encoding may perform video post-processing, compress the video, encapsulate the compressed video into transport packets, and send the resulting packets via the packet bus to a media processor card.
Video and audio data signals received from the plurality of remote user terminals are received at the central processing hub by any one of the at least one media processor card or at least one physical interface card. The received video and audio data signals are passed via one of the media bus and the packet bus to the at least one media processor card for further processing. Signals which are delivered from the central processing hub to the plurality of remote user terminals are delivered from the central processing hub by any one of the at least one media processor card and the at least one physical interface card. In keeping with the present invention, the means for sending and receiving video and audio data signals to one of the plurality of remote user terminals may differ from one remote user to another. Moreover, each of the plurality of remote user terminals may communicate with the central processing hub using a different communications protocol than any of the other remote user terminals. Thus, the central processing hub provides a gateway function whereby remote users can communicate across different network boundaries.
The video encoding means which is provided in the central processing hub may be a separate video encoder card, or it may be included in at least one of the media processor cards.
The shelf controller card further comprises means for communicating with a management node. Thus, management signals for the central processing hub can be delivered from the management node through the shelf controller to the central processing hub.
Any given multimedia conference, and the nature of the output video and audio data signals which are sent to the plurality of remote user terminals in that conference, is controlled by the session manager communicating through an input port on the management node. The session manager may be an intelligent network, it may be a personal computer, or the session manager may be an individual person who interacts with an intelligent network or a personal computer, and thence to the central processing hub through the management node.
The video and audio data signals which are received from and delivered to each of the plurality of remote user terminals are generally in the form of compressed signal packets. However, they may be in the form of analog signals which are passed to and from the central processing hub via analog ports on the at least one media processor card. In either case, compressed signal packets are delivered from the video encoding means to the packet bus, and bi-directionally between the packet bus and any one of the at least one interface card and the at least one media processor card. Uncompressed video and audio real-time signals are delivered uni-directionally between the media bus and any of the at least one media processor card and the video encoding means.
An important feature of the present invention is that the at least one media processing card includes a video link and a video router, so that data signals from any of the plurality of remote user terminals are summed within the media processing card, and the resultant summed video data signal is passed to a further video link via the video router. Thus, the returned video data signal from the central processing hub to the plurality of remote user terminals is derived from a cascade of video links.
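The cascade of video links can be pictured as each link summing its locally received stream contributions and handing the running total to the next link via the router. A simplified numeric sketch (a hypothetical structure, with video contributions reduced to integers):

```python
def cascade_sum(links):
    """Cascaded video-link summing sketch (hypothetical, illustrative).

    links: one list of locally received stream contributions per video
    link in the cascade.  Each link sums its own inputs and adds the
    running total handed down from the previous link, so the final
    output reflects contributions from every link in the chain.
    """
    running = 0
    for local_inputs in links:
        running = running + sum(local_inputs)
    return running
```

Because each link only needs its local inputs plus the upstream total, the same scheme extends naturally across cascaded central processing hubs, as the specification describes.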
Moreover, additional central processing hubs may be connected through the bus controller card so that the media bus and packet bus of the further central processing hubs are connected together. In this case, the interconnected central processor hubs share resources, control signals, clock signals, and bus arbitration signals. Moreover, the at least one video link on the at least one media processing card, of each of a plurality of similar central processing hubs, are cascaded one with respect to another.
In one embodiment of the present invention, the summed video data signal which is delivered to each of the remote user terminals is a common signal which is delivered to all of the remote user terminals. In another aspect of the present invention, the summed video data signal is under the control of the session manager, and each respective one of the summed video data signals will include video data from at least one other of the remote user terminals, but not necessarily all of the other remote user terminals.
Even when the summed video data signal is a common signal delivered to all of the remote user terminals, it may also be under the control of the session manager, and thus the video signal in the summed signal which is representative of any remote user terminal may be changed by the session manager.
However, in most instances, the audio data signal which is delivered to each of the remote user terminals includes audio data from at least one other of the remote user terminals, but excludes audio data from the respective remote user terminal to which the summed audio signal is delivered.
The media bus will comprise at least one video bus and at least one audio bus, and generally there are a plurality of video buses and a plurality of audio buses operating in parallel one to another so as to provide for increased bandwidth. There may be at least two audio buses which are adapted to be operated in parallel so as to increase the bandwidth of the composite audio bus over the bandwidth of one audio bus; or, two separate audio buses may be operated so as to provide stereo audio signals.
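One way to picture operating two audio buses in parallel for increased bandwidth is a round-robin split of the sample stream across the buses; with two buses, the same two slot positions could instead carry left and right stereo channels. A hypothetical sketch:

```python
def split_across_buses(samples, n_buses=2):
    """Round-robin a sample stream across parallel buses (illustrative).

    With n_buses == 2 this doubles throughput over a single bus; the
    same two slots could alternatively carry left/right stereo channels
    rather than interleaved halves of one mono stream.
    """
    buses = [[] for _ in range(n_buses)]
    for i, s in enumerate(samples):
        buses[i % n_buses].append(s)
    return buses
```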
In any event, there is generally a plurality of video buses and a plurality of audio buses included in the media bus, so that there is redundancy provided with respect to the video buses and audio buses. Moreover, there is generally a plurality of physical line interface cards and a plurality of media processor cards, so that there is redundancy provided with respect to the physical line interface cards and media processor cards.
Still further, additional physical line interface cards and additional media processor cards may be added to the central processing hub at any time, by being connected to the media bus and the packet bus. This provides for dynamic expansion of the central processing hub.
In keeping with the present invention, each of the respective cards included in the central processing hub performs pre-designated tasks in keeping with respective instruction sets which reside in respective microprocessors on each respective card. Those tasks are also performed further in keeping with control signals which are delivered to each respective card over the packet bus.
Each media processor card will perform tasks such as signal decoding of video data and audio data received by the media processor card. Signal routing of the video and audio data, signal scaling of the video and audio data, and time-base correction of the video and audio data received by the media processing card may also be carried out. The video data and audio data may be linked from one media processor card to another.
The video data signal which is received from any of the plurality of remote user terminals will include a video stream, and it may also include other data such as graphics data, text data, or spreadsheet data recovered from a computer at the respective remote user terminal site. That additional graphics data, text data, or spreadsheet data which is received by the central processing hub may be distributed to others of the remote user terminals in the form that it has been received, or it may be processed by the central processing hub and distributed as processed data.
It is an object of the present invention to provide a multimedia conferencing system including a central processing hub, whose architecture is such that the system is dynamically configurable.
A further object of the present invention is to provide a central processing hub which will function as a multimedia platform that supports a family of products having differing communication protocols, differing transmission rates, and even differing signal handling technologies at respective remote user terminals.
Still further, the present invention provides a system whereby a plurality of individual input streams having varying speeds and protocols may be controlled in such a manner that the returned video data signal which is received by each of the plurality of remote user terminals from the central processing hub is derived from a cascade of video links within the central processing hub.
These and other features of the invention will be described in greater detail hereafter.