The present application relates to an audio communication system that transmits and receives, i.e., communicates, audio signals having high sound quality.
FIG. 3 shows one example of a previously proposed or existing audio communication system, such as a video conference or telephone unit or system (unless otherwise specifically stated, the term “audio” as used hereinbelow also covers “voice” and “speech”). An auxiliary input terminal 101 (AUX) is a terminal for connecting a sound source, such as a CD (compact disc). A signal input from the auxiliary input terminal 101 (AUX) is sampled at, for example, a sampling frequency of 48 kHz, and converted by an AD convertor 201 (ADC) into a digital signal.
A microphone input terminal 102 (MIC) is connected to, for example, a microphone for collecting audio. A signal input from the microphone input terminal 102 (MIC) is sampled at, for example, a sampling frequency of 48 kHz, and converted by an AD convertor 202 (ADC) into a digital signal. Because echo of the remote terminal's signal, output from the speaker, is picked up in this signal, an echo cancellation process is carried out by an echo canceller 402.
In the echo canceller 402, the signal component (echo) that enters the microphone input signal from the speaker output of the remote terminal's signal is cancelled. To cancel this echo, the echo canceller refers to the actual signal component received from the remote terminal.
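The reference-based cancellation described above can be illustrated with a minimal sketch. The source does not say which algorithm the echo canceller 402 uses; the normalized LMS (NLMS) adaptive filter below is an assumption chosen only because it is a common textbook approach. The function names, tap count, step size, and the three-tap "room response" are all hypothetical, and the sketch ignores near-end speech (double talk).

```python
import random

def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the echo of `ref` from `mic`.

    mic: microphone samples (here: only the echo of the reference)
    ref: reference samples (the far-end signal sent to the speaker)
    Returns the residual (echo-cancelled) samples.
    NLMS is an assumed algorithm, not one named by the source.
    """
    w = [0.0] * taps   # adaptive estimate of the echo path
    x = [0.0] * taps   # most recent reference samples, newest first
    out = []
    for d, r in zip(mic, ref):
        x = [r] + x[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x))   # estimated echo
        e = d - y                                   # residual after cancellation
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        out.append(e)
    return out

def demo():
    """Return (echo power at the microphone, residual power after cancellation)."""
    rng = random.Random(0)
    ref = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
    echo_path = [0.6, 0.3, -0.1]   # hypothetical speaker-to-microphone response
    mic = [sum(h * ref[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
           for n in range(len(ref))]
    out = nlms_echo_cancel(mic, ref)
    power = lambda s: sum(v * v for v in s) / len(s)
    # Compare only the tail, after the filter has had time to adapt.
    return power(mic[-500:]), power(out[-500:])
```

Calling `demo()` compares the echo power before and after cancellation once the filter has adapted. In a real terminal the microphone signal would also contain local speech, which this sketch does not model.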
The microphone input signal from which the echo has been cancelled is mixed, as necessary, with the output of the AD convertor 201 (ADC). Thereby, a transmission signal at a sampling frequency of, for example, 48 kHz is produced for transmission to the remote terminal.
The output of an adder (summing device) 602 is compressed by an encoder 702 to form a bitstream, which is transmitted to the remote terminal across, for example, an IP network.
The bitstream transmitted from the remote terminal is decoded by a decoder 703 into a received signal at a sampling frequency of, for example, 48 kHz. The received signal from the remote terminal is mixed, as necessary, in a summing device 603 with the output of the AD convertor 201 (ADC) (the signal from the auxiliary input terminal 101 (AUX)) to form a source signal for the speaker output.
If the source signal for the speaker output were DA converted and output directly, distortion could occur depending on the sound volume of the signal, or, when the signal is a multichannel signal such as a stereo signal, the echo canceller 402 might be unable to cancel the echo properly. Therefore, before the signal is input as the reference signal into the echo canceller 402, it is subjected to the necessary preprocessing in a preprocessor 503. The preprocessing in the preprocessor 503 includes, for example, compression (rapid compression of large-amplitude portions of the signal) and elimination of inter-channel correlated components. Further, when the sound volume has to be adjusted, volume adjustment is also carried out by the preprocessor 503.
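As a rough illustration of the compression and volume steps of such a preprocessor, the following sketch applies a gain and then reduces whatever part of each sample exceeds a threshold. The source does not give the compression law, so the instantaneous threshold-and-ratio curve, the parameter values, and the function names are assumptions. Elimination of inter-channel correlated components is not modelled here.

```python
def compress(sample, threshold=0.5, ratio=4.0):
    """Scale down the part of |sample| above `threshold` by `ratio`.

    The threshold/ratio curve is an assumed, illustrative law.
    """
    mag = abs(sample)
    if mag <= threshold:
        return sample                      # small amplitudes pass unchanged
    squeezed = threshold + (mag - threshold) / ratio
    return squeezed if sample > 0 else -squeezed

def preprocess(samples, volume=1.0, threshold=0.5, ratio=4.0):
    """Apply volume adjustment, then compression, to each sample.

    The single output list stands in for the signal that feeds both
    the DA convertor and the echo canceller's reference input.
    """
    return [compress(volume * s, threshold, ratio) for s in samples]
```

Because the speaker path and the reference input share the same preprocessed output in this arrangement, the echo canceller sees the signal that is actually played, including any gain change or compression.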
The output of the preprocessor 503 is supplied to a DA convertor 203 (DAC), and is also used as the reference signal in the echo canceller 402. In the DA convertor 203 (DAC), the output signal is converted into an analog signal at a sampling frequency of, for example, 48 kHz. Thereby, a signal within a signal band range of up to 22 kHz is output to a speaker output terminal 103 (SPK).
Recording signals can be selectively provided in the system. The recording signals include a signal (output of the echo canceller 402) used for transmitting the microphone input of the system's own terminal to the remote terminal, a signal (output of the decoder 703) from the remote terminal, and, as necessary, a signal (output of the AD convertor 201 (ADC)) from the auxiliary input terminal 101 (AUX). These recording signals are mixed in a summing device 604 and converted by a DA convertor 204 (DAC), at a sampling frequency of, for example, 48 kHz, into an analog signal. Thereby, a signal within the signal band range of up to 22 kHz is produced and output to a recording terminal 104 (REC).
In the above-described example, the sampling frequencies of the AD and DA convertors are at most 48 kHz. An audio communication system of the above-described type is disclosed in Patent Publication 1 (Japanese Unexamined Patent Application Publication No. 2002-262251).
In the case of a sampling frequency of 48 kHz, signals within the range of up to 22 kHz can be represented. The range of sound perceptible to human beings is said to extend up to about 20 kHz, which can be reproduced by CD audio or the like. However, the highest level of sound quality presently demanded exceeds what the above example of previously proposed or existing audio communication systems can achieve. Actual sounds of musical instruments and of the natural world include components at frequencies significantly higher than 22 kHz. It is said that sound quality is noticeably improved when sounds up to such high frequencies are reproduced. To reproduce such high-frequency audio, systems and content such as super-audio CD and DVD audio are distributed in commercial markets that seek high sound quality.
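The band limit mentioned above follows from the sampling theorem: a sampled signal can represent only frequencies below half the sampling frequency, and a higher tone appears at a lower, aliased frequency unless it is filtered out first. The sketch below shows that arithmetic. The function names and the 96 kHz comparison rate are illustrative assumptions, and reading the 22 kHz figure as the 24 kHz theoretical limit less a filter margin is also an assumption.

```python
def nyquist_hz(sample_rate_hz):
    """Upper bound on the representable frequency for a given sampling rate."""
    return sample_rate_hz / 2

def apparent_frequency_hz(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after ideal sampling.

    Tones above the Nyquist limit fold back (alias) to a lower frequency.
    """
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)

# A 30 kHz overtone cannot be represented at 48 kHz sampling:
#   nyquist_hz(48000)                    -> 24000.0
#   apparent_frequency_hz(30000, 48000)  -> 18000
# At a hypothetical 96 kHz sampling rate the same tone is preserved:
#   apparent_frequency_hz(30000, 96000)  -> 30000
```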
The example of the previously proposed audio communication systems described above has the following problems with respect to the sound quality presently demanded and to hardware cost.
(1) The sound sources contemplated in the example of previously proposed or existing audio communication systems are limited to CD-level audio. For communication audio, this sound quality is sufficient for human speech to be understood clearly. However, it is not sufficient for faithful reproduction of a musical audio signal whose reproduction frequency band exceeds that of CD audio, such as super-audio CD or DVD audio.
(2) If a signal whose reproduction frequency band exceeds that of CD audio is processed simply by the previously proposed method or an extension of it, the hardware scale is significantly enlarged. In particular, the echo canceller, which is indispensable for two-way audio communication, requires a very large number of calculations to provide high performance. Although a CD-level frequency band is sufficient for reproducing communication audio, the echo canceller would have to cover a wide reproduction frequency band to handle musical audio signals whose band exceeds the CD level. This leads to a significant cost increase.
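The cost concern in item (2) can be made concrete with a back-of-the-envelope estimate. Assume a time-domain adaptive FIR echo canceller (for example NLMS), which the source does not specify. For a fixed echo tail duration, the number of filter taps grows in proportion to the sampling frequency, and every tap is processed once for filtering and once for adaptation at every sample. The 100 ms echo tail, the 96 kHz comparison rate, and the function name below are hypothetical.

```python
def nlms_macs_per_second(sample_rate_hz, echo_tail_ms=100):
    """Approximate multiply-accumulates per second for one channel.

    Assumes a time-domain NLMS-style canceller: one MAC per tap for
    filtering and one per tap for the coefficient update, every sample.
    """
    taps = sample_rate_hz * echo_tail_ms // 1000   # taps needed to span the echo tail
    return 2 * taps * sample_rate_hz

# nlms_macs_per_second(48000) -> 460800000
# nlms_macs_per_second(96000) -> 1843200000  (4x the cost for 2x the rate)
```

Under these assumptions the load grows with the square of the sampling frequency, which is one way to see why simply widening the band of the whole system, echo canceller included, enlarges the hardware.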