The present invention relates to teleconferencing, and particularly relates to a system for video object merging where the merging is in respect of object oriented video data.
Video conferencing or teleconferencing is also referred to, more broadly, as multimedia conferencing. Any multimedia conferencing system may utilise a variety of media types and sources, but particularly utilises live or real-time video and audio sources from a plurality of remote users. Those remote users may be geographically scattered very widely, ranging from being located in different areas in the same office building, to different cities, and even to different continents.
The present invention is particularly directed to a system where there is a plurality of video streams to be handled, each from one of a plurality of different sources. No consideration is given as to whether the video streams all comply with the same protocol, it being recognized that varying processing units for various video streams can be arranged so as to accommodate differing protocols.
However, in order for the object oriented video merging system of the present invention to be operative, the video streams must be in synchronism, each comprising active video frames and blanking intervals between each pair of consecutive active video frames, where all of the active video frames have essentially the same finite time span and repetition frequency.
In any teleconferencing circumstance, there will be at least two participants, and usually many participants. Generally, the total number of participants in a particular teleconference may be dynamically configurable.
In any event, each participant will generate a video flow, that is, a screen of video data which is organized in frames and blanking intervals between the frames, with video data being present in each frame. Other informational data may be present or injected into the data stream in the blanking interval between active video frames, or may be otherwise communicated between the participants.
Indeed, as will be discussed hereafter, the blanking interval between video frames, whether it be vertical and/or horizontal blanking, may be divided into a plurality of time slots whereby priority, identification, and other data can be assigned to each video screen in preselected or designated time slots.
A variety of multimedia conferencing systems are known, a particular one of which is described in Jang et al U.S. Pat. No. 6,442,758. That multimedia conferencing system includes a central processing hub; and the object oriented video merging system of the present invention may be employed in the video encoding and processing portions of that multimedia conferencing system.
The present invention provides an object oriented video merging system, the principal feature of which is that there is provided a backplane having a common medium video bus which provides a centralized medium onto which a plurality of video objects may be merged. Of course, it will be understood that each object is a two-dimensional shape which is contained within any active video frame. As to the shape of the object, it may and generally will vary from frame to frame, and it may be a simple shape or it may be a very complex shape. In either case, the object is describable by polynomials, which may be simple polynomials; or it may be that a complex algorithm is required to describe the two-dimensional shape which is the video object to be merged.
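By way of illustration only, the notion of a two-dimensional object described by a polynomial may be sketched as follows. The sketch assumes, purely for the purpose of the example, that the object is the set of pixels at which a polynomial p(x, y) is less than or equal to zero; the function name `polynomial_mask` and the coefficient layout are hypothetical and are not taken from the system described here.

```python
def polynomial_mask(width, height, coeffs):
    """Return a height x width grid of booleans, True where p(x, y) <= 0.

    coeffs maps exponent pairs (i, j) to coefficients, so that
    p(x, y) = sum(c * x**i * y**j). This implicit-inequality form is an
    assumption made for illustration only.
    """
    def p(x, y):
        return sum(c * (x ** i) * (y ** j) for (i, j), c in coeffs.items())

    return [[p(x, y) <= 0 for x in range(width)] for y in range(height)]


# A simple object: a disc of radius 2 centred at (3, 3), i.e.
# (x - 3)^2 + (y - 3)^2 - 4 <= 0, expanded to x^2 - 6x + y^2 - 6y + 14.
disc = {(2, 0): 1, (1, 0): -6, (0, 2): 1, (0, 1): -6, (0, 0): 14}
mask = polynomial_mask(7, 7, disc)

# Print the object as it would sit inside a 7 x 7 active video frame.
for row in mask:
    print("".join("#" if inside else "." for inside in row))
```

A shape which varies from frame to frame would, under this assumption, simply be represented by a different set of coefficients for each frame.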
Polomski, U.S. Pat. No. 5,600,646, issued Feb. 4, 1997, and its continuation and co-terminous U.S. Pat. No. 5,838,664 each teach a video teleconferencing system where digital transcoding is employed so as to obtain algorithm transcoding, transmission rate matching, and spatial mixing. A multipoint control unit allows multiple terminals to send and receive compressed digital data signals, so as to communicate with each other in a conference. The video processing unit which performs algorithm transcoding, rate matching, and spatial mixing, includes a time division multiplex pixel bus and a plurality of processors. In a receive mode, each processor receives and decodes compressed video signals from its assigned terminal and puts the decoded signal onto the pixel bus. In a transmit mode, each processor receives from the pixel bus uncompressed video signals, which are processed and then coded for transmission to a respective assigned terminal. Video decoding time due to motion displacement search is reduced by passing displacement information from the compressed video signals to the encoder to be used directly, or as seed for further refinements of the motion displacement field.
Two United States Patents issued to Lukacs, U.S. Pat. No. 5,657,096, issued Aug. 12, 1997 and U.S. Pat. No. 5,737,011, issued Apr. 7, 1998, each teach a real time video conferencing system and method where a central multimedia bridge is used to combine multimedia signals from a plurality of conference participants into a single composite signal for each participant. Here, each conference participant is given the ability to customize their own individual display of the other participants, including the ability to key in and out selective portions of the display and overlapping display images, together with the ability to identify individual images in a composed video stream by click and drag operations. A chain of video composing modules is employed, to combine video signal streams from any number of conference participants in real time. Each participant in a conference may dynamically change the right of access of other participants to the information that they provide to the conference.
Bruno et al U.S. Pat. No. 5,784,561, issued Jul. 21, 1998, is concerned with on-demand, real-time video conferencing, and provides a system which includes at least one video control system. That control system receives an on-demand request for a video conference from a user and then, in real time, allocates video conferencing resources and connects the user with at least one other video conference participant through a circuit switched communications network. Each user is connected with at least one other video conference participant, based on the total number of participants, the number of video ports which are available in the video conference control system, and the available connection paths in the circuit switched communications network.
Ely et al. U.S. Pat. No. 5,796,424, issued Aug. 18, 1998, describes a system and method for providing video conferencing services where a broadband switch network, a broadband session controller, and a broadband service control point are provided. Here, connections are provided between information senders and receivers in response to instructions from the broadband service control point or in response to requests which are originated by any remote information sender/receiver. The broadband service control point provides processing instructions and/or data to the broadband controller and to each remote sender/receiver. The system is particularly directed to video-on-demand utilization. Whenever a user requires a video from a video information provider, the broadband session controller establishes communication between the set top controller at the remote user's location and the video information provider, requesting processing information from the broadband service control point in response to predetermined triggers. A broadband connection between a video information provider and a specific user is established under control of the broadband session controller. If the system is to be used in video conferencing, the set top controller will control cameras, microphones, and so on. Telephone services may also be provided over the same integrated network.
Aras, et al, U.S. Pat. No. 5,867,653, issued Feb. 2, 1999, teaches a method and apparatus for multi-cast based video conferencing. Here, a multi-cast server sets up a multi-cast over a point-to-multipoint connection, which connects all multi-cast clients along with a primary multi-cast client. The primary multi-cast client is connected to the multi-cast system via a point-to-point link. An arbitrator is established; and when a multi-cast client wishes to speak, a speaking request is sent to the arbitrator, which then determines whether to grant or deny the speaking request. In each case, there is a video stream which is set up, so that when permission by the arbitrator has been granted to a requesting multi-cast client, that client then provides a new video stream to the multi-cast server over a newly established point-to-point connection, and the multi-cast server switches or provides the video stream from that client to the point-to-multipoint connection for the benefit of the other clients.
Wang, et al, U.S. Pat. No. 5,903,673, issued May 11, 1999, teaches a digital video signal encoder and encoding method whereby image quality is maximized without exceeding transmission bandwidth. Here, there is concern about the size of the encoded frames, which may require to be quantized so as to adjust the various encoded frames closer in size to the desired size. If the cumulative bandwidth balance deviates from a predetermined range, quantization is adjusted as needed to improve image quality to more completely consume available bandwidth, or to reduce image quality to thereby consume less bandwidth. Rapid changes are detected, and quantization is precompensated according to the rate of change.
The present invention provides an object oriented video merging system for use in teleconferencing, where a plurality of video objects are contained within the active video frames of a plurality of video screens from a plurality of different sources, at any instant in time. At least an object from one discrete video screen is to be merged into a composite video output signal for return to a plurality of selected sources at each instant in time. Each object in each video frame is describable by a polynomial which defines a two-dimensional shape.
The system includes a system controller, and a clock means for generating a clock pulse stream. There is a master frame pulse generating means for periodically imposing a master frame pulse on the clock pulse stream.
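The relationship between the clock means and the master frame pulse generating means may be pictured with the following minimal sketch. It assumes, for illustration only, that the master frame pulse is imposed once every fixed number of clock ticks; the names `clock_pulse_stream`, `frame_start_ticks`, and `ticks_per_frame` are hypothetical.

```python
def clock_pulse_stream(total_ticks, ticks_per_frame):
    """Return a list of (tick, master_frame_pulse) pairs.

    The master frame pulse is True on every ticks_per_frame-th tick,
    starting at tick 0. The fixed period is an assumption of this sketch.
    """
    if ticks_per_frame <= 0:
        raise ValueError("ticks_per_frame must be positive")
    return [(tick, tick % ticks_per_frame == 0) for tick in range(total_ticks)]


def frame_start_ticks(stream):
    """Return the ticks at which a master frame pulse was imposed."""
    return [tick for tick, pulse in stream if pulse]


stream = clock_pulse_stream(total_ticks=10, ticks_per_frame=4)
print(frame_start_ticks(stream))  # [0, 4, 8]
```

Each video processing unit observing such a stream could align the start of its own frames to the pulses, which is one way the synchronism among video streams might be maintained.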
A plurality of video processing units is provided, each for processing video data from one of a plurality of video sources. The processing of video data is carried out on a frame-by-frame basis.
There is a backplane which has at least a data bus, and a common video bus onto which an object from one discrete video stream is merged at any instant in time from a respective video processing unit communicating with that common video bus.
An object description generator is provided for each video screen which is being processed by each respective video processor. A video priority determinator is found in the system controller, for determining the priority of each video stream at each instant in time.
Video selection means are provided in each video processor unit for permitting an object from the video stream being processed by the respective video processor unit to be transmitted to the common video bus, but only if the priority of that video stream at any instant in time permits such transmission.
In keeping with a particular aspect of the present invention, each video processing unit comprises a video preprocessor, an object description generator, and a video priority register for receiving and storing data received from the system controller relating to the priority of an object contained in a video frame being processed by the respective video processing unit at each instant in time.
There is also a descriptor/video time division multiplexing controller which controls a multiplexer so as to pass either video frame data or object description and video priority data from the video preprocessor or the object description generator and priority register, respectively, at any instant in time. There is also a video selector for collecting relevant data from the backplane so as to determine if the video frame being processed by the respective video processing unit at that instant in time can be released to the common video bus.
In another aspect of the invention, mutually exclusive predetermined time slots are established in the blanking interval between active video frames for each video screen being processed by each respective video processing unit. Those mutually exclusive predetermined time slots are assigned to each respective video processing unit. Thus, the object description for the object in each video frame, as generated by the respective object description generator, is broadcast to all video processing units for detection and handling by the respective video selector in each video processing unit, during the blanking interval.
The predetermined time slots for each respective video processing unit may be under the control of the system controller; or they may be hard wired into each respective video processing unit.
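The use of mutually exclusive time slots within the blanking interval may be sketched as follows. The sketch assumes, for illustration only, that slots are numbered from zero and are handed out in the order in which the units are listed; the function names, the unit identifiers, and the descriptor strings are all hypothetical.

```python
def assign_slots(unit_ids, slots_per_blanking):
    """Give each video processing unit its own slot in the blanking interval."""
    if len(unit_ids) > slots_per_blanking:
        raise ValueError("not enough time slots for all units")
    return {unit_id: slot for slot, unit_id in enumerate(unit_ids)}


def broadcast_descriptors(slot_table, descriptors, slots_per_blanking):
    """Simulate one blanking interval on the data bus.

    Returns the (slot, unit_id, descriptor) entries in the order in which
    every unit on the backplane would observe them.
    """
    by_slot = {slot: unit_id for unit_id, slot in slot_table.items()}
    if len(by_slot) != len(slot_table):
        raise ValueError("time slots are not mutually exclusive")
    bus = []
    for slot in range(slots_per_blanking):
        unit_id = by_slot.get(slot)
        # A unit with nothing to describe leaves its slot empty.
        if unit_id is not None and unit_id in descriptors:
            bus.append((slot, unit_id, descriptors[unit_id]))
    return bus


slot_table = assign_slots(["A", "B", "C"], slots_per_blanking=4)
bus = broadcast_descriptors(slot_table, {"A": "descA", "C": "descC"}, 4)
print(bus)  # [(0, 'A', 'descA'), (2, 'C', 'descC')]
```

Whether the table passed to `broadcast_descriptors` is produced by a controller at run time or fixed in advance corresponds to the two alternatives noted above.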
In the present invention, the multiplexer in each respective video processing unit is under the control of the descriptor/video time division multiplexing controller, so as to pass signals from the object description generator to the respective data bus on the backplane, during the blanking intervals between active video frames of the respective video stream being processed by each respective video processing unit.
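The behaviour of the multiplexer under the descriptor/video time division multiplexing controller may be illustrated by the following minimal sketch. The `Phase` enumeration, the function name `multiplex`, and the tuple layout of the result are assumptions made for the example only.

```python
from enum import Enum


class Phase(Enum):
    ACTIVE = "active"      # an active video frame is being processed
    BLANKING = "blanking"  # the interval between active video frames


def multiplex(phase, video_frame_data, descriptor, priority):
    """Select what the unit passes on at this instant in time.

    During blanking, the object description and priority data are passed;
    during active video, the video frame data is passed.
    """
    if phase is Phase.BLANKING:
        return ("descriptor", (descriptor, priority))
    return ("video", video_frame_data)


print(multiplex(Phase.BLANKING, b"\x10\x20", "disc", 7))
print(multiplex(Phase.ACTIVE, b"\x10\x20", "disc", 7))
```

In this sketch the selection depends only on the phase; whether the video data actually reaches the common video bus is a separate decision made by the video selector.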
Of course, only a single video processing unit will pass a video object signal to the common video bus during any active video time slot, and the video object signal having the highest priority at that instant in time when there is an active video time slot is the video object signal which is passed to the common video bus.
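The rule that only the highest priority video object signal reaches the common video bus may be sketched as follows. Two assumptions are made for illustration only and are not stated in the description: a larger number denotes a higher priority, and a tie is resolved in favour of the lower unit identifier. The function name `may_drive` is hypothetical.

```python
def may_drive(unit_id, priorities):
    """Decide whether this unit may pass its object to the common video bus.

    priorities maps each unit identifier to the priority of its video
    stream at this instant in time. Every unit evaluates the same rule on
    the same data, so exactly one unit concludes that it may drive the bus.
    """
    winner = min(priorities, key=lambda uid: (-priorities[uid], uid))
    return unit_id == winner


priorities = {1: 5, 2: 9, 3: 9}
drivers = [uid for uid in priorities if may_drive(uid, priorities)]
print(drivers)  # [2]
```

Because each unit reaches its decision independently from commonly available data, no central switch is needed in this sketch to keep two units from driving the bus at once.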
The present invention provides that, in some circumstances, the master frame pulse generating means may be included in a single designated video processing unit, the selection of which is under the control of the system controller.