This invention relates to an image communication apparatus having an automatic answering and recording function. More particularly, the invention relates to an image communication apparatus which, when an incoming call from an originating terminal arrives in the automatic answering and recording mode, transmits a reply message comprising video and audio to the originating terminal and stores a calling-party message sent from the originating terminal.
A video conference system is one which allows a conference to take place by making possible the mutual exchange of audio and video among video conference sets deployed at remote locations. FIG. 17 is a diagram showing the configuration of a video conference system in a case where individuals at two locations remote from each other carry on a conference. The system includes a network 1 and identically constructed video conference sets 2, 3 connected to the network 1 and provided at two locations remote from each other. The video conference sets 2, 3 have respective main units 2a, 3a, display monitors 2b, 3b, cameras 2c, 3c, microphones 2d, 3d and speakers 2e, 3e.
Though the details are not shown, the main units 2a, 3a of the video conference sets are each equipped with:
(1) a video A/D converter for converting an analog video signal, which has been acquired from the corresponding camera 2c or 3c, to digital video data;
(2) a video coder for compressing the digital video data obtained by the A/D conversion;
(3) an audio A/D converter for converting an analog audio signal, which has been acquired from the corresponding microphone 2d or 3d, to digital audio data;
(4) an audio coder for compressing the digital audio data obtained by the A/D conversion;
(5) a multiplexer for multiplexing the video data and audio data compressed by the video coder and audio coder, respectively, and outputting the multiplexed data on a line;
(6) a demultiplexer for demultiplexing multiplexed data, which has entered from the line, into video data and audio data and outputting the data to a video demodulator and audio demodulator;
(7) the video demodulator for demodulating the compressed video data to the original video data;
(8) a D/A converter for converting the demodulated digital video data to an analog signal and entering the analog signal into the corresponding monitor 2b or 3b;
(9) the audio demodulator for demodulating the compressed audio data to the original audio data;
(10) a D/A converter for converting the demodulated digital audio data to an analog signal and entering the analog signal into the corresponding speaker 2e or 3e; and
(11) a line interface for sending the output signal of the multiplexer to the line and entering multiplexed data, which has entered from the line, into the demultiplexer.
The video conference sets 2, 3 thus AD-convert analog video and audio signals acquired from the cameras 2c, 3c and microphones 2d, 3d, apply coding processing (compression) to the video data and audio data obtained by the A/D conversions, multiplex the compressed video data and audio data, and send the multiplexed data out on the line. The video conference sets 2, 3 further demultiplex signals (multiplexed data), which have entered from the line, into video data and audio data, then apply decoding processing (decompression) to the compressed video data and audio data, DA-convert the digital video data and audio data obtained and enter the resulting analog signals into the monitors 2b, 3b and speakers 2e, 3e, thereby outputting video and audio.
Video conference sets having the above-described functions have come into widespread use in recent years and advances have been made in size and cost reduction and in greater functionality. Video conference sets for which there is particularly great demand are those of the type having an automatic answering and recording function that allows a calling party to leave a message if the called party is absent. Such a video conference set is particularly advantageous if there is a time difference between the calling and called parties, as when a connection is made to a party overseas.
FIG. 18 is a block diagram illustrating a video conference set having an automatic answering and recording function according to the prior art. Components identical with those shown in FIG. 17 are designated by like reference characters. The video conference set has the main body 2a, monitor 2b, camera 2c, microphone 2d and speaker 2e. The main body 2a of the video conference set includes a coder 4 having a video coder 4a and an audio coder 4b, a decoder 5 having a video decoder 5a and an audio decoder 5b, a line control unit 6 having a multiplexer, demultiplexer and line interface, etc., and a memory 7 such as a hard disk for storing a reply message comprising video and audio, and a calling-party message sent from the terminal of another (the calling) party.
The reply message RMG is created and stored in the memory 7 beforehand. The reply message RMG will amount to a very large quantity of data if the analog signals from the camera 2c and microphone 2d are merely AD-converted by a preprocessor (not shown). For this reason the signals are compressed (coded) by the coder 4 before they are stored in the memory 7 (route C).
If the video conference set, having been set to the automatic answering and recording mode, receives an incoming call from another terminal under these conditions, the reply message RMG is first read out of the memory 7 and transmitted to the calling terminal via the line control unit 6 (route A). Then, after the reply message RMG has been transmitted, the calling-party message MMG sent from the calling terminal is received via the line control unit 6 and stored in the memory 7 (route B).
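The answering sequence along routes A and B can be sketched as follows. This is only an illustrative model, not the circuitry of FIG. 18: the names Memory, answer_call, send and receive are hypothetical, and the line control unit 6 is reduced to two callables.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Memory:
    """Hypothetical stand-in for memory 7: one stored reply plus received messages."""
    reply_message: bytes
    calling_party_messages: List[bytes] = field(default_factory=list)


def answer_call(memory: Memory,
                send: Callable[[bytes], None],
                receive: Callable[[], bytes]) -> bytes:
    """Answer one incoming call in automatic answering and recording mode."""
    # Route A: the pre-coded reply message is read out and sent as stored.
    send(memory.reply_message)
    # Route B: once the reply has gone out, whatever the caller sends is stored.
    message = receive()
    memory.calling_party_messages.append(message)
    return message
```

Note that the reply message is sent exactly as it was stored, without being re-coded for the call in progress.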
The playback of the calling-party message MMG sent from the calling party is performed by reading the calling-party message MMG out of the memory 7 in response to a playback command, entering this message into the decoder 5, whereby the message is decoded, then converting the message to analog signals by a post-processor (not shown) and entering the analog signals into the monitor 2b and speaker 2e.
A technique known as Recommendation H.261 is used in the coding and decoding of video. FIGS. 19 and 20 are diagrams illustrating the constructions of the video coder 4a and video decoder 5a in accordance with Recommendation H.261. The video coder 4a shown in FIG. 19 includes an information source coder 4a-11 which executes processing (DCT processing, quantization processing and motion compensation processing) for compressing the information of a video signal input CIF/QCIF, a video signal multiplexing coder 4a-12 for executing data-format generation processing (hierarchical structuring processing and variable-length coding processing such as Huffman coding) after compression, a transmission buffer 4a-13 for obtaining a constant transmission data rate, a transmission coder 4a-14 for executing dummy-bit insertion processing when the transmission buffer is empty as well as processing for adding on an error correction code, and a coding controller 4a-15, to which the available capacity of the transmission buffer 4a-13 is applied as an input, for instructing the information source coder 4a-11 and video signal multiplexing coder 4a-12 to increase or decrease the amount of generated information based upon the available capacity of the buffer 4a-13, thereby controlling the amount of data that flows into the transmission buffer 4a-13.
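The feedback performed by the coding controller 4a-15 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the amount of generated information is steered through the quantizer step alone, that the step moves by one per update, and that the thresholds are 30% and 70% of buffer capacity; none of these values come from the description above. The only borrowed fact is that H.261 quantizer values range from 1 to 31.

```python
Q_MIN, Q_MAX = 1, 31  # quantizer range used by H.261


def next_quantizer(q: int, available: int, capacity: int,
                   low: float = 0.3, high: float = 0.7) -> int:
    """Return the quantizer step for the next update (hypothetical policy).

    available is the free space in the transmission buffer and capacity is
    its total size, both in the same unit (for example bits).
    """
    if capacity <= 0 or not 0 <= available <= capacity:
        raise ValueError("available must lie between 0 and capacity")
    occupancy = 1.0 - available / capacity
    if occupancy > high:
        q += 1  # buffer filling up: quantize more coarsely, generate fewer bits
    elif occupancy < low:
        q -= 1  # buffer draining: quantize more finely, generate more bits
    return max(Q_MIN, min(Q_MAX, q))
```

A practical controller would typically also skip frames or adjust other coding parameters; the sketch shows only the direction of the feedback.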
The video decoder 5a shown in FIG. 20 includes a transmission coder 5a-11 for executing dummy-bit removal processing and error correction processing, a transmission buffer 5a-12 which assures enough time for processing to decode arriving reception data, a video signal multiplexing coder 5a-13 for segmenting compressed data, and an information source coder which executes processing (inverse DCT processing, inverse quantization processing and motion compensation processing) for decompressing information that has been compressed.
Since the upper-limit value on the number of bits that can be transferred in one second is decided by the transfer rate, the video coder 4a executes coding (compression) processing maintaining such a quality that the number of bits after compression will fall within the upper-limit bit count decided by the actual transfer rate. In other words, the video coder 4a has a quality conforming to the transfer rate and executes compression processing in such a manner that the transfer can be performed at this rate.
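As a rough worked example of this upper limit, the following sketch computes the average bit budget per coded frame. The frame rate and the optional audio share are assumptions chosen for illustration; only the 128 kbps and 384 kbps line rates appear elsewhere in this description.

```python
def video_bits_per_frame(transfer_rate_bps: int,
                         frame_rate_fps: float,
                         audio_rate_bps: int = 0) -> float:
    """Average number of bits available for each coded video frame."""
    if frame_rate_fps <= 0:
        raise ValueError("frame rate must be positive")
    video_rate_bps = transfer_rate_bps - audio_rate_bps
    if video_rate_bps < 0:
        raise ValueError("audio rate exceeds the transfer rate")
    return video_rate_bps / frame_rate_fps


# At an assumed 10 frames per second and with audio ignored:
budget_128k = video_bits_per_frame(128_000, 10)  # 12800.0 bits per frame
budget_384k = video_bits_per_frame(384_000, 10)  # 38400.0 bits per frame
```

The same scene must therefore be coded with roughly one third of the bits per frame at 128 kbps compared with 384 kbps, which is why the quality conforms to the transfer rate.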
(a) First Problem
With the video conference set having an automatic answering and recording function according to the prior art, compression processing in accordance with a fixed transfer rate is applied to the video data to create the reply message RMG, which is stored in the memory 7 (FIG. 18). Consequently, if the actual transfer rate at which communication is performed with an originating terminal after the arrival of the incoming call differs from the fixed transfer rate mentioned above, the reply message will no longer be capable of being transmitted correctly. For example, if a reply message RMG that has been created by execution of compression processing conforming to a transfer rate of 128 kbps is transmitted to a line at a higher transfer rate of 384 kbps, the display on the terminal that originated the call will reproduce video that appears to be fast-forwarded. Conversely, if a reply message RMG that has been created by execution of compression processing conforming to the transfer rate of 384 kbps is transmitted to a line at the lower transfer rate of 128 kbps, the display on the terminal that originated the call will reproduce video on a slow frame-by-frame basis.
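The size of the mismatch can be estimated with a simplified model. The sketch below assumes that the stored message is coded at a constant bit rate and that the whole line rate carries the message, so that the apparent playback speed is simply the ratio of the two rates; the function names are hypothetical.

```python
def playback_speed_factor(coded_rate_bps: float, line_rate_bps: float) -> float:
    """Apparent playback speed at the calling terminal (1.0 means real time)."""
    if coded_rate_bps <= 0 or line_rate_bps <= 0:
        raise ValueError("rates must be positive")
    return line_rate_bps / coded_rate_bps


def transmit_time_s(duration_s: float, coded_rate_bps: float,
                    line_rate_bps: float) -> float:
    """Time needed to push a stored message of the given duration onto the line."""
    return duration_s / playback_speed_factor(coded_rate_bps, line_rate_bps)


fast = playback_speed_factor(128_000, 384_000)  # 3.0: appears fast-forwarded
slow = playback_speed_factor(384_000, 128_000)  # about 0.33: slow, frame by frame
```

Under these assumptions, a 30-second reply coded for 128 kbps would be delivered in about 10 seconds on a 384 kbps line.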
In an effort to solve this problem, the conventional practice is to create a reply message for each transfer rate in advance so as to establish correspondence between the reply messages and various transfer rates. Then, when a connection is made, the reply message corresponding to the actual transfer rate is selected and transmitted. With this method, however, creating the reply messages takes considerable time and a memory having a large storage capacity is required to store the reply messages.
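The storage cost of this conventional practice can be estimated with a short calculation. The set of supported rates and the 30-second message length below are assumptions for illustration only (128 kbps and 384 kbps are the rates mentioned above), and each copy is assumed to be coded at a constant bit rate equal to its transfer rate.

```python
from typing import Iterable


def reply_storage_bytes(rates_bps: Iterable[int], duration_s: int) -> int:
    """Bytes needed to keep one reply message per transfer rate."""
    return sum(rate * duration_s // 8 for rate in rates_bps)


# One 30-second reply stored for each of three assumed rates.
per_rate_total = reply_storage_bytes([64_000, 128_000, 384_000], 30)
# A single copy at the highest rate, for comparison.
single_copy = reply_storage_bytes([384_000], 30)
```

With these figures the per-rate approach needs 2,160,000 bytes against 1,440,000 bytes for a single copy, and the user must also record the message once for every rate.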
Further, with the prior art, there are instances where the user abandons the inclusion of video in a reply message and usually sends only audio as the reply message. However, a method which sends only audio as the reply message in spite of the fact that video can also be sent and received is without merit and detracts from the value of the product.
(b) Second Problem
If, during the reception of a calling-party message from the calling terminal following the transmission of the reply message from the called terminal to the calling terminal, video acquired from the camera of the called terminal is transmitted, the circumstances prevailing in the absence of the user of the called terminal will be transmitted to the calling terminal in the form of an image. This is a problem in terms of security. Conventional approaches for solving this problem include lowering camera brightness and sending a dark image without sound after transmission of the reply message is completed, preserving the final video and sending this video without sound, or sending nothing at all. However, each of these methods of presenting a display is very unnatural and gives the observer at the calling terminal an odd impression.
(c) Third Problem
In the prior art, the calling-party message sent from a videophone or video conference set is separated into audio and video, after which separate files are created and stored. With this method of separating and storing the message, it is necessary to synchronize the audio and video when the calling-party message is played back. Establishing this synchronization is troublesome.
(d) Fourth Problem
According to the prior art, there is no limitation upon the file sizes of the reply message file and calling-party message file and upon the number of calling-party messages stored. This necessitates large-capacity memory means for storing these messages. The result is an increase in the size and cost of the apparatus.