1. Field of the Invention
The present invention relates to a method and apparatus for communicating one or more video frames from a source having a video input means to a destination having a visual display unit during the course of a telephone conversation.
2. Related Art
Known multimedia communication terminals are integrated devices, that is, they contain both audio and video communication portions linked together with internal circuitry and/or software which synchronize audio and video digital data or analogue signals (herein referred to collectively as “signals”), and hence corresponding audible and visible output from the integrated terminal. Such synchronization is particularly desirable in order to achieve so-called “lip-synch” in audio-video telephony. In practice, for the signals to be effectively synchronized, it has been found in the broadcast industry that the visible output should lead the audible output by no more than about 20 ms, and should lag the audio signal by no more than about 40 ms.
Examples of known integrated terminals operating according to the ITU-T Recommendation H.320 “Narrow-band Visual Telephone Systems and Terminal Equipment” include those sold by PictureTel Corporation under the trade mark Venue-2000 and those sold by VTEL Corporation under the trade mark Enterprise Series Room System TC1000. Intel Corporation sells a business video conferencing system under the product code PCVD1013ST that operates according to the H.320 and H.323 standards. An example of a known integrated terminal operating according to the H.324 standard is that sold by 8×8 Inc. under the trademark ViaTV Phone.
Such known terminals are designed to function in an audio-only mode, so that they can function as a simple telephone when communicating with another telephony terminal.
Such integrated devices have yet to become widely adopted, and one reason for this is that whilst such devices may function as a telephone, users still need a conventional telephone for communication, for example with other telephones on the same PBX exchange, or with external telephones. Many PBX manufacturers now support ISDN lines to the desktop for the provision of multimedia communication terminals, according to the H.320 standard. Unfortunately, different manufacturers provide different levels of functionality (for example, features such as call hold, call transfer and call forward), not all of which are supported by multimedia terminals. Again, the result is the need to have more than one telephony device on a desktop.
Most users therefore end up with two telephony devices on their desktop. This is inconvenient, owing to the extra desktop space normally required, as well as the need to have a different telephone number for each device. Callers must therefore keep track of two numbers, and decide in advance which type of call they intend to place.
The recent emergence of the H.323 standard for multimedia communications over packet networks, for example local area networks (LANs) using the Internet Protocol over Ethernet, has added further complications for the user, because the multimedia terminal must then in most cases connect to the data network, rather than the telephony network. It is well known that the typical data network is not as resilient or reliable as the telephony network. Because many H.323 multimedia terminals are PC based, the user rightly fears losing his telephony facility when the LAN or his PC crashes or fails. Therefore, the user still needs more than one telephony device.
Furthermore, a critical mass of audio-video equipment does not yet exist. Users and potential users of audio-video telephony are therefore unable to communicate using audio-video telephony with a large number of people whom they may call. This further inhibits the adoption of audio-video telephony.
Another problem concerns security arrangements such as firewalls for computers on data networks, for example LANs and Intranets, in order to prevent unauthorized data from entering or leaving the network. Although firewalls generally permit activities such as the receipt and sending of e-mail messages or the browsing of web pages, these firewalls are not normally compatible with packet-based audio-video communication using the data network such as those conforming to H.323 or SIP (Session Initiation Protocol). Therefore, even if two users had the requisite audio-video terminals, they would not be able to communicate if on opposite sides of the firewall. Although special firewalls (e.g. H.323) can in principle be used to allow such communication, network managers may be unwilling to invest in such special equipment simply because the demand for audio-video communication is too low.
It is an object of the present invention to provide a more convenient audio and video communication method and system in cases where it is not necessary for video to be synchronised with the audio in the course of an audio-video telephone call so that both parties need not be equipped with fully capable multimedia terminals.
According to the invention, there is provided a method of making an audio-video telephone call using an audio and video communication system that comprises: a first telephony device and a second telephony device that may communicate with each other over a telephony communications network; an audio-video service provider that may communicate over said telephony communications network with both the first and second telephony devices respectively along a first communication path and a second communication path; a first computer and a second computer, said computers being in proximity with respectively the first telephony device and the second telephony device, at least one of the computers having a video input means and the first computer and second computer each having a data input means, a visual display unit, and communication means by which the first computer and second computer may make a connection to the service provider respectively along a third communication path and a fourth communication path; in which the first telephony device and first computer are operable by a first user and the second telephony device and second computer are operable by a second user, wherein the method comprises the steps of:
a) using the first computer communication means to make a first connection to the service provider;
b) initiating an audio telephone call between the first telephony device and the second telephony device via the service provider and the first and second communication paths;
c) communicating caller-id data between the first user and the service provider along the first communication path and/or the third communication path so that the service provider can correlate both the call and the first connection with each other;
d) using the second computer communication means to make a second connection to the service provider;
e) communicating an access code from the service provider to the first user, and then communicating in the audio telephone call said access code from the first user to the second user;
f) communicating said access code between the second user and the service provider along at least the fourth communication path so that the service provider can correlate both the call and the second connection with each other;
g) uploading at least one video image from the video input means to the service provider;
h) downloading said video image(s) from the service provider to the first computer and/or second computer to be displayed on the respective video display unit(s) when the second user has answered said telephone call and when both the first and second connections are correlated with the call.
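Steps a) to h) can be illustrated with a minimal sketch of the bookkeeping a service provider might perform. This is not the claimed implementation: the class names `CallSession` and `ServiceProvider`, the six-digit code format, and the in-memory dictionary are all assumptions made for illustration, and the telephony and network paths themselves are not modelled.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CallSession:
    """Hypothetical per-call state held by the service provider."""
    caller_number: str
    access_code: str
    first_correlated: bool = False   # steps a) to c) completed
    second_correlated: bool = False  # steps d) to f) completed
    answered: bool = False
    images: List[bytes] = field(default_factory=list)


class ServiceProvider:
    """Illustrative stand-in for the audio-video service provider."""

    def __init__(self) -> None:
        self._sessions: Dict[str, CallSession] = {}  # keyed by access code

    def start_call(self, caller_number: str) -> str:
        """Correlate the call with the first connection and issue an access code."""
        code = f"{secrets.randbelow(10**6):06d}"
        while code in self._sessions:  # avoid reusing a code that is in use
            code = f"{secrets.randbelow(10**6):06d}"
        self._sessions[code] = CallSession(caller_number, code, first_correlated=True)
        return code

    def mark_answered(self, code: str) -> None:
        """Record that the second user has answered the telephone call."""
        self._sessions[code].answered = True

    def join(self, code: str) -> bool:
        """Step f): the second user enters the code heard in the audio call."""
        session = self._sessions.get(code)
        if session is None:
            return False
        session.second_correlated = True
        return True

    def upload(self, code: str, image: bytes) -> None:
        """Step g): store an image captured by the video input means."""
        self._sessions[code].images.append(image)

    def download(self, code: str) -> List[bytes]:
        """Step h): release images only once the call is answered and both
        connections are correlated with it."""
        s = self._sessions.get(code)
        if s is None or not (s.answered and s.first_correlated and s.second_correlated):
            return []
        return list(s.images)
```

In this sketch the access code doubles as the session key, which keeps the example short; a real service would more plausibly keep separate identifiers for the call, the connections and the code.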
The video image(s) may be uploaded and/or downloaded via the third communication path and/or the fourth communication path. If the communication means includes means by which the first computer and second computer may make a connection to the service provider respectively along a fifth communication path and a sixth communication path, the method may comprise the steps of:
i) communicating to the service provider the video communication capabilities of the first computer and/or second computer;
j) making a connection between the service provider and the first computer and/or between the service provider and the second computer along the fifth communication path and/or the sixth communication path;
k) uploading at least one video image from the video input means to the service provider along the fifth communication path and/or sixth communication path; and/or downloading at least one video image from the service provider along the fifth communication path and/or sixth communication path to be displayed on the respective video display unit(s).
The term “telephony device” as used herein means any device that can be used for speech telephony and the term “telephony communications network” means any communications network that can be used to carry speech telephony signals. In the basic embodiment of the invention the telephony device is a simple telephone and the telephony communications network is the Public Switched Telephone Network (PSTN) or the Mobile Network. In other embodiments of the invention, the telephony device may be a LAN based telephone or a sound card, microphone and headset in a PC, and the telephony communications network may be the data network. Also the telephony device may even be part of a multimedia terminal. However, for the purposes of this document “telephone” is used to mean “telephony device” in the broadest sense, and likewise “telephone network” is used to mean “telephony communications network” in the broadest sense.
Likewise, the placement of the telephone call (routed) ‘via the service provider’ is also meant in the broadest sense. The service provider may indeed terminate both legs (first and second communication paths) of the telephone call and route the audio through its own equipment. In this case the first and second communication paths carry call signalling information and also speech. This may be the case, for example, when the service provider requires the first user to enter caller-id data via their telephone keypad. Alternatively, the service provider may just have knowledge of the telephone call between the first and second user. In this case the service provider has notification or signalling interfaces to/from the telephony communications network and the first and second communication paths carry only signalling information, and the audio media travels directly through the telephone communications network between the telephony devices. Examples of such signalling interfaces include computer telephony integration (CTI) interfaces to PBXs, and interfaces to the Intelligent Network (IN) in the PSTN. When the telephony network is an H.323 packet network, such a signalling interface can be provided through the GateKeeper interface for example.
The term “multimedia terminal” as used herein means any device or collection of devices used for audio, video and optionally data communications. The multimedia terminal may be formed from separate audio and video communication devices, for example a telephone and a video display unit of a personal computer optionally equipped with a camera and capture card. Alternatively, the multimedia terminal may be formed from the combination of an integrated multimedia terminal having both audio and video capability, but using only the video function, in combination with a separate audio device, for example a telephone. Finally the multimedia terminal may be an integrated multimedia device.
The term “caller-id data” as used herein means any type of data or information that serves to identify the first or the second user, the association between their telephony devices and their computers, and/or their calls and connections. Examples of caller-id data include telephone numbers, computer network addresses and access codes, including security codes and account codes.
Either user may communicate their telephone numbers and their computer network addresses on their computer connections (third and fourth communication paths), and the service provider may make the association with their telephones, either by calling the telephones using the previously inputted telephone number or by matching the telephone number presented by the telephony communications network (Calling Line Identifier and Called Party Number services) against the previously inputted telephone numbers.
When the service provider cannot immediately associate a user's telephone device and computer and the corresponding calls and connections (because, for example, the user's telephone number has not been presented by the network and/or there is no computer connection), the use of access codes may serve as caller-id data. The service provider may require the use of access codes for security or billing purposes in addition to other forms of caller-id data. For example, the service provider may require the first user to enter an access code (e.g. account number and/or PIN) before allowing the telephony call to the second user to proceed. The access code entered will identify the first user to the service provider and allow the correlation with his computer to be made provided his computer connection is also made to the service provider. Another example of caller-id data includes an access code communicated from the service provider to the first user along the third communication path as a series of numbers, letters or other symbols displayed upon the first computer visual display unit. The first user could then communicate these symbols back to the service provider along the first communication path, for example by using a touch tone key pad associated with the telephone, or if the service provider has voice recognition means, by speaking the access code back to the service provider.
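The last example, in which a code shown on the first computer is keyed back on the telephone, might be sketched as follows. The class `AccessCodeCorrelator`, its method names, and the identifiers `connection_id` and `call_id` are hypothetical; the sketch assumes a numeric code entered on a touch tone keypad and does not model tone or voice recognition.

```python
import hmac
import secrets
from typing import Dict, Optional


class AccessCodeCorrelator:
    """Hypothetical helper that ties a telephone call to a computer connection."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}    # connection_id -> code on display
        self.correlated: Dict[str, str] = {}  # call_id -> connection_id

    def issue(self, connection_id: str, digits: int = 6) -> str:
        """Generate a code to display on the computer (third communication path)."""
        code = "".join(secrets.choice("0123456789") for _ in range(digits))
        self._pending[connection_id] = code
        return code

    def verify(self, call_id: str, keyed_digits: str) -> Optional[str]:
        """Match digits keyed on the telephone (first communication path)."""
        if not keyed_digits.isascii():
            return None
        for connection_id, code in list(self._pending.items()):
            # Constant-time comparison, so timing does not reveal partial matches.
            if hmac.compare_digest(code, keyed_digits):
                del self._pending[connection_id]  # each code is single use
                self.correlated[call_id] = connection_id
                return connection_id
        return None
```

Making each code single use is a design choice of this sketch rather than something the text requires.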
In the basic embodiment of the invention, the first user initiates the telephone call to the second user from his telephony device. This need not always be the case. The invention is also applicable when the second user initiates the telephone call from his telephony device to the first user. Likewise, in a more sophisticated embodiment of the invention, the first user may initiate the telephone call from his computer along the third communication path. Caller-id data may contain his telephone number as well as the telephone number of the second user. The service provider may then place telephone calls out to both first and second users along the first and second communication paths connecting their telephony devices together.
Because the telephone call to the second user is routed via the service provider, association of calls and connections may conveniently take place during the initiation of the telephone call, before the call has been forwarded to or answered by the second user. Then, by the time the second user has answered the telephone call, the service provider has already associated the connection to the first user's computer with the telephone call.
Images can then be uploaded from the first user's computer to the service provider.
The caller-id data communicated between the second user and the service provider includes caller-id data communicated by the first user to the second user. For example, once the call has been answered, the first user may then communicate verbally to the second user the same access code. The second user can then use his computer to make a connection to the service provider, enter the access code, for example via a computer keyboard, in order to be able to download one or more images already uploaded to the service provider.
If the caller-id data is sufficient to identify the second user and make the correlation between his telephone device and computer immediately, without further input from him, then the service provider can automate the downloading of images to his computer once the telephone call has been answered.
Preferably, such video images are automatically downloaded at frequent intervals in order to approximate real time video. Indeed, it is the function of the audio-video service to provide the highest level of video communication consistent with the user's computer video capability. For example, if both first and second user do indeed have fully functional multimedia terminals then real-time video communication may be possible. If one user has no camera, for example, but does have a video decoder on his PC, then only one-way video communication may take place.
The video communication capabilities of a user computer may already be known from subscription and log-in parameters or may be determined dynamically through the execution of an applet on the user computer that communicates with the service provider. The eventual level of video communication will also be affected by the action of any firewall and the amount of bandwidth available.
In a basic embodiment of the invention, said video images are downloaded manually by the first and/or second user, for example by activating a refresh or reload function in web-browser software. In some cases, this may be the highest level of video capability that can be achieved because of firewalls or because one user has no video camera and/or video codec installed. In this case video is sent and/or received on the third and/or fourth communication paths, that is, the same communication paths that are used for data communications between the users' computers and the service provider.
Where firewalls may need to be bypassed and/or more bandwidth needed for improved video communications, additional communication paths, a fifth communication path and a sixth path may be provided between the user's computer and the service provider. The fifth and sixth paths are intrinsically associated with the third and fourth communication paths.
Where there are mechanisms for transferring such media through firewalls and/or sufficient bandwidth is available then the fifth and sixth communication paths may be on the same or different physical circuits as the third and fourth communication paths.
Therefore, the system may be capable of permitting the service provider to determine the video communication capabilities of the first computer and/or second computer, for example via an applet received from the service provider by the first or second computers, or through subscriber login parameters, in order to provide a higher quality of video communication consistent with these capabilities and any restrictions imposed by firewalls and/or the availability of bandwidth. That is, the method and system according to the invention are consistent with fully functional multimedia terminals when firewalls are avoided (for example by connecting via access routers over the PSTN or other suitable networks) or where mechanisms to route such traffic through firewalls exist.
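The choice of the highest video level consistent with both computers' capabilities could be sketched as below. The four levels, the `Capabilities` fields and the function `select_level` are assumptions introduced for illustration; the text names the factors (camera, codec, firewalls, bandwidth) but does not define such a taxonomy. The sketch assumes at least one computer has a video input means.

```python
from dataclasses import dataclass
from enum import Enum


class VideoLevel(Enum):
    """Hypothetical levels, ordered from lowest to highest."""
    MANUAL_STILLS = 1        # user presses refresh/reload in the browser
    AUTO_REFRESH_STILLS = 2  # images downloaded automatically at intervals
    ONE_WAY_VIDEO = 3        # only one side has a camera
    TWO_WAY_VIDEO = 4        # both sides are fully functional terminals


@dataclass(frozen=True)
class Capabilities:
    """Assumed result of subscription data, login parameters or an applet probe."""
    has_camera: bool
    has_video_codec: bool
    media_path_available: bool   # a path for video exists despite firewalls/bandwidth
    supports_auto_refresh: bool


def select_level(a: Capabilities, b: Capabilities) -> VideoLevel:
    """Pick the highest level that both computers can support."""
    def can_stream(c: Capabilities) -> bool:
        return c.has_video_codec and c.media_path_available

    if can_stream(a) and can_stream(b):
        if a.has_camera and b.has_camera:
            return VideoLevel.TWO_WAY_VIDEO
        if a.has_camera or b.has_camera:
            return VideoLevel.ONE_WAY_VIDEO
    if a.supports_auto_refresh and b.supports_auto_refresh:
        return VideoLevel.AUTO_REFRESH_STILLS
    return VideoLevel.MANUAL_STILLS
```

For example, a fully equipped terminal paired with a PC that has a decoder but no camera yields `VideoLevel.ONE_WAY_VIDEO`, matching the one-way case described in the text.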
In all cases, it is preferred if images can no longer be downloaded when the telephone call is ended. This provides a degree of security against unauthorised downloading of such images.
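One way to enforce this is to tie image access to the state of the call, as in the following sketch. The class `CallImageStore` and its method names are hypothetical, and deleting the stored images when the call ends is an assumption of the sketch; the text only requires that downloading is no longer possible.

```python
from typing import Dict, List, Set


class CallImageStore:
    """Hypothetical image store that only serves images while a call is active."""

    def __init__(self) -> None:
        self._active: Set[str] = set()
        self._images: Dict[str, List[bytes]] = {}

    def call_started(self, call_id: str) -> None:
        self._active.add(call_id)
        self._images.setdefault(call_id, [])

    def upload(self, call_id: str, image: bytes) -> None:
        if call_id not in self._active:
            raise PermissionError("no active call for this upload")
        self._images[call_id].append(image)

    def download(self, call_id: str) -> List[bytes]:
        if call_id not in self._active:
            raise PermissionError("call has ended or was never started")
        return list(self._images[call_id])

    def call_ended(self, call_id: str) -> None:
        # Assumption: images are discarded, not merely hidden, at call end.
        self._active.discard(call_id)
        self._images.pop(call_id, None)
```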
In the above example, the caller-id data at all stages is the same. This need not be the case, however, and caller-id data at each stage can be different as long as the service provider is able to make the necessary correlations.
In the above description, the initiation of the video element of the call is instigated by the first user (i.e. the user who has initiated the telephone call). This need not be the case. The second user or the user receiving the telephone call may instead instigate the uploading and downloading of video images, particularly if the second user has video input equipment. The system and methods described above apply equally in this case.
In one embodiment of the invention, the connection for the first computer and/or the second computer is made through the Internet. In this case, the communication means includes web-browser software, the downloading and/or the uploading of said image being from/to a web page.
Alternatively, the connection for the first computer and/or the second computer may be made directly to the service provider through the public switched telephone network or other suitable networks, for example via an access router either connected directly to the computer, or via a local area network. Such an access router can therefore bypass any firewalls on the local network, but may be programmed only to dial into the service provider which can then be provided with its own firewall in order to safeguard the network. Such a direct connection can provide the benefit of a higher data rate than may be achievable over the Internet, in order to provide quick transmission of images.
In a more sophisticated embodiment of the invention, the connection for the first computer and/or the second computer may use both the Internet and the PSTN: the Internet for data communication and the PSTN for video communication.
The invention is also applicable to future networks that provide high bandwidth Internet connection, for example those using Cable Modem or DSL/ATM technologies. Provided mechanisms exist for passing such traffic through firewalls (if they are present), these Internet connections may be used for data communication, video communication and even audio communication.
Alternatively, if both parties in a call belong to the same corporate organisation, the enterprise's Intranet and/or LAN may be used for the video communication paths.
The telephony communications network will in general comprise the Public Switched Telephone Network. The telephone network may also or alternatively comprise a local private telephone network, for example a PBX network, or the mobile network. Future telephony communications networks may also use the data network such as an Intranet or LAN, or future Internet networks.
In one embodiment of the invention, the caller-id data between the first user and service provider is communicated over both the first communication path and the third communication path to the service provider, rather than from the service provider.
Alternatively, the caller-id data between the first user and service provider may be communicated over the third communication path from the service provider to the first user, and over the first communication path from the first user to the service provider.
In a preferred embodiment of the invention, the service provider receives via the telephone network caller-id data including the telephone number of the first telephone. Such caller-id data may be typed into a touch tone telephone by the first user, but most conveniently this caller-id data is generated automatically by the telephone network when the first user makes the telephone call.
Alternatively, the service provider may call the first telephony device using information supplied by the first user over the third communication path or from previously supplied subscriber or log-in information. The service provider may additionally call the second user when the information supplied by the first user over the third communication path also includes the telephone number of the second user.
Similarly, the caller-id data communicated between the second user and the service provider may include the telephone number of the second telephone.
If the service provider comprises computerised voice recognition means, at least some of such caller-id data may be communicated to the service provider by voice. Alternatively or additionally, if a telephone has a touch tone keypad and the service provider has tone recognition means, this may be used to communicate at least some of the caller-id data to the service provider.
When the data input means comprises a keyboard or a computer mouse, at least some of such caller-id data may be communicated to the service provider with the keyboard or mouse.