The present invention relates to a real-time audio transmission apparatus for transmitting audio data in real time on a network of asynchronous communication such as Ethernet.
Recently, in bus-format Local Area Networks (LANs) and asynchronous communication networks such as Ethernet and Asynchronous Transfer Mode (ATM), improved quality has been demanded of the real-time audio transmission apparatus for transmitting audio data in real time.
FIG. 10 is an explanatory diagram showing a first example of a communication system using a conventional audio transmission apparatus; it shows an example of audio transmission by packet data over a communication network with a constant delay time.
The communication system in FIG. 10 comprises a transmission side audio transmission apparatus 1001a, a reception side audio transmission apparatus 1001b, a communication network 1011 with a constant delay time, and a receiving buffer 1003. Generally, in the case of the communication network 1011 with a constant delay time, audio packets transmitted at a fixed interval are also received at a fixed interval at the reception side, so that continuous audio reproduction is realized.
FIG. 11 is an explanatory diagram showing a second example of a communication system using a conventional audio transmission apparatus. Referring now to this diagram, a communication system using a communication network involving delay time fluctuation is explained. A communication network 1111 is an asynchronous communication network such as Ethernet. In the asynchronous communication network 1111, the delay time changes irregularly, that is, a delay time fluctuation occurs. If the delay is significant, the receiving buffer 1103 runs out of audio data, the sound drops out, and the audio quality deteriorates.
As a countermeasure to this problem, a maximum delay time fluctuation of the communication network is assumed, and the audio data for this time duration is stored in the receiving buffer 1103 in advance. Then, in the event of a delay time fluctuation, the audio data stored in the receiving buffer 1103 is reproduced, so that continuous audio reproduction without pause is realized.
This countermeasure, however, requires the maximum delay time fluctuation of the communication network to be determined. At present, since there is no standard for the tolerance level of delay time fluctuation in a communication network, it is not clear how large the maximum delay time fluctuation is, that is, it is not clear how much audio data should be stored in the receiving buffer 1103. The capacity of the receiving buffer 1103 (the maximum capacity for storing the audio data) has therefore been determined uniformly on the basis of an assumed maximum delay time fluctuation of the communication network.
FIG. 12 is a block diagram showing a conventional audio transmission apparatus, in which an audio packet is received by using an asynchronous communication network such as Ethernet. In FIG. 12, a real-time audio transmission apparatus 1201 comprises a network-interface (communication network I/F unit) 1202, a receiving buffer 1203, an audio decoder 1204, a D/A converter 1206, a buffer controller 1208, an asynchronous communication network 1211 such as Ethernet, and an audio reproduction switch 1213.
The operation of the audio transmission apparatus thus composed is outlined below. Initially, the buffer controller 1208 keeps the audio reproduction switch 1213 turned off until a specific amount of audio data is stored in the receiving buffer 1203, and no sound is reproduced. When storage of the specific amount of audio data in the receiving buffer 1203 is detected, the buffer controller 1208 turns on the audio reproduction switch 1213. Then, the audio decoder 1204 and D/A converter 1206 start their operation, and sound reproduction begins. As long as the delay time fluctuation in the communication network 1211 is within the reproduction time of the audio data stored in the receiving buffer 1203, continuous reproduction is possible by reproducing the stored audio data until the next audio packet arrives in the receiving buffer 1203.
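The conventional start-threshold control described above can be illustrated with a minimal sketch. The class name, the method names, and the sample-based interface are hypothetical and are not taken from the specification. The sketch also assumes that the reproduction switch is turned off again when the buffer runs empty, so that storage up to the threshold is repeated, as the transition in FIG. 13 suggests.

```python
from collections import deque


class FixedThresholdPlayout:
    """Hypothetical model of the conventional buffer controller 1208."""

    def __init__(self, start_threshold):
        self.start_threshold = start_threshold  # samples required before reproduction starts
        self.buffer = deque()                   # models the receiving buffer 1203
        self.switch_on = False                  # models the audio reproduction switch 1213
        self.underruns = 0                      # number of times the buffer ran empty

    def on_packet(self, samples):
        """Store a received audio packet; turn the switch on at the threshold."""
        self.buffer.extend(samples)
        if not self.switch_on and len(self.buffer) >= self.start_threshold:
            self.switch_on = True

    def read(self, n):
        """Return n samples for the decoder, or silence (zeros) when none are available."""
        if not self.switch_on:
            return [0] * n
        if len(self.buffer) < n:
            # Under-run: output what is left, pad with silence, and
            # (assumption) turn the switch off to store data again.
            self.underruns += 1
            self.switch_on = False
            out = list(self.buffer) + [0] * (n - len(self.buffer))
            self.buffer.clear()
            return out
        return [self.buffer.popleft() for _ in range(n)]
```

With a threshold of four samples, for example, the first two-sample packet produces only silence, reproduction starts once the second packet arrives, and a late third packet causes an under-run.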
In this countermeasure, the capacity of the receiving buffer (the maximum capacity for storing audio data) has been fixed uniformly on the assumption of a maximum delay time fluctuation of the communication network. This is, however, merely a prediction. If a delay time fluctuation longer than the reproducing time of the stored audio data actually occurs, the stored audio data is exhausted before the next audio packet is received. An audio data under-run then occurs, the reproduced sound drops out, and the reproduced audio quality deteriorates.
The audio data stored in the receiving buffer itself adds to the audio delay time. Therefore, storing too much data should be avoided so as to keep the delay time low. Accordingly, the storage amount of audio data has been determined by actually investigating the communication network or by an empirical method. If the storage amount is determined in this way, audio quality deteriorates when the operating condition of the communication network worsens. Therefore, in order to maintain audio reproduction of high quality, the quality of the communication network must be kept above a specific level, but this means higher cost and is difficult to realize.
Even if the delay time fluctuation of the communication network is constant, when clock synchronism is not achieved between the communication devices, the receiving buffer may become empty or overflow if the reception state lasts for a long time, and the audio quality may deteriorate.
Devices performing real-time audio data communication generally achieve clock synchronization by synchronizing the clocks on both sides with the communication network. However, in an asynchronous communication network such as Ethernet, which has no clock synchronizing means in the network itself, the portion of the audio data that contains sound is detected and only this portion is sent out in packets, and the timing is adjusted at the receiving side during the portions of the audio data stream that contain no sound, so that real-time reproduction is maintained.
Referring now to FIG. 13, the problem caused by a difference between the transmitting clock frequency and the receiving clock frequency in communication through an asynchronous communication network is explained. FIG. 13 shows the transition of the buffer storage amount in the receiving audio transmission apparatus 1201 when the transmitting coding clock frequency is higher than the receiving decoding clock frequency in the conventional audio transmission apparatus. Reference numerals 1310 and 1312 at the upper side of the diagram show data blocks in the time durations in which the audio data received by the network-interface 1202 in FIG. 12 is written into the receiving buffer 1203, and reference numerals 1311 and 1313 show data blocks in the time durations in which audio data is read out from the receiving buffer 1203 into the audio decoder 1204. First, the time durations 1301 and 1305 in the diagram represent the time from the start of reception of audio data from the communication network 1211, in a state in which the receiving buffer holds no data to be read out, until reading from the receiving buffer 1203 starts as the buffer amount exceeds a certain threshold (START) 1308.
Time duration 1302 is the time duration of simultaneous writing into and reading from the receiving buffer 1203. Because the transmitting coding clock frequency is higher than the receiving side decoding clock frequency, the storage amount of the receiving buffer increases slightly with the lapse of time. During the time duration 1303, writing has ended and only reading is performed. In the time duration 1304, the receiving buffer is empty, and no sound is reproduced.
As shown in FIG. 13, if the data block is long, as in the case of reception data 1312, writing and reading are performed simultaneously in the receiving buffer 1203 in time duration 1306, and the storage amount then exceeds FULL 1309 in time duration 1307. The receiving buffer therefore overflows in the portion of reception data 1314. Thus, if the transmitting coding clock frequency is even slightly higher than the receiving decoding clock frequency, the storage amount of the receiving buffer gradually increases with the lapse of time until the receiving buffer overflows. This overflow duration corresponds to the time duration 1307 in the diagram; the audio data 1315 is lost in this period, and the audio quality deteriorates.
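The effect of the clock frequency difference can be estimated with simple arithmetic. The following is a rough sketch only: the function name and parameters are hypothetical, buffer levels are expressed in samples and clock frequencies in samples per second, and reception is assumed to continue without interruption, as in one long data block.

```python
import math


def drift_outcome(start_level, full_level, tx_hz, rx_hz):
    """Return (event, seconds) for a buffer that starts reading at start_level.

    tx_hz is the transmitting coding clock (write rate) and rx_hz is the
    receiving decoding clock (read rate). The net fill rate is tx_hz - rx_hz.
    """
    if tx_hz > rx_hz:
        # Storage amount rises from START toward FULL.
        return "overflow", (full_level - start_level) / (tx_hz - rx_hz)
    if tx_hz < rx_hz:
        # Storage amount falls from START toward empty.
        return "underrun", start_level / (rx_hz - tx_hz)
    return "stable", math.inf
```

For instance, with START at 800 samples, FULL at 1600 samples, a transmitting clock of 8001 Hz and a receiving clock of 8000 Hz, the sketch gives an overflow after 800 seconds of continuous reception, which shows how even a one-sample-per-second mismatch eventually exhausts a fixed buffer.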
It is hence an object of the invention to present an audio transmission apparatus capable of reproducing audio continuously, regardless of the quality of the communication network, by avoiding troubles such as exhaustion of the audio data to be reproduced or loss of audio due to overflow of the receiving buffer.
To solve the problems, the real-time audio transmission apparatus of the invention comprises a receiving buffer for storing the data block received from the communication network temporarily, and a buffer storage amount controller for monitoring the data amount stored in the receiving buffer, in which the readout speed of reading audio data from the receiving buffer is varied depending on the monitoring result of the buffer storage amount controller. Therefore, the storage amount of the receiving buffer is controlled appropriately, and the audio data in the receiving buffer is prevented from being empty or overflowing, so that audio data can be transmitted continuously in real time.
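One way to picture readout speed control based on the monitored storage amount is the proportional rule sketched below. This is an illustrative assumption, not the control law of the invention: the function names, the target level, the gain, and the limit on the speed deviation are all hypothetical, and the simulation advances in coarse one-second steps.

```python
def readout_rate(fill, target, nominal_hz, gain=0.05, max_dev=0.01):
    """Read faster when the buffer is above target, slower when below.

    The relative speed change is proportional to the fill error and is
    limited to +/- max_dev (assumed limit to keep pitch change small).
    """
    error = (fill - target) / target
    dev = max(-max_dev, min(max_dev, gain * error))
    return nominal_hz * (1.0 + dev)


def simulate(tx_hz, nominal_hz, target, full, seconds, adaptive):
    """Return (survived, final_fill) after stepping the buffer once per second."""
    fill = float(target)
    for _ in range(seconds):
        rate = readout_rate(fill, target, nominal_hz) if adaptive else nominal_hz
        fill += tx_hz - rate  # net samples written minus samples read in one second
        if fill >= full or fill <= 0:
            return False, fill  # overflow or empty buffer
    return True, fill
```

Under these assumptions, a fixed readout speed with an 8001 Hz transmitter and an 8000 Hz receiver overflows a 1600-sample buffer within an hour, whereas the adaptive readout settles slightly above the 800-sample target and neither overflows nor empties.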
The audio transmission apparatus of the invention also includes a delay time fluctuation measuring section for measuring the delay time fluctuation, which is the variation width of the irregular delay time, from the reception times of the audio packets being received. On the basis of the delay time fluctuation measured in the delay time fluctuation measuring section, the storage amount of data in the receiving buffer is controlled. Therefore, even if the delay time of the communication network varies, overflow of the receiving buffer and loss of audio data due to an empty receiving buffer during audio data transfer can be avoided, and the audio data can be transmitted continuously in real time.
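As a minimal sketch of how such a measurement might feed the storage amount, the code below takes packet arrival times in milliseconds, compares them with an ideal schedule anchored at the first packet, and uses the variation width (maximum minus minimum relative delay) to size the buffer. The function names, the use of a known fixed packet interval, and the extra safety margin are assumptions made for illustration and are not specified in the text.

```python
def delay_fluctuation_ms(arrival_ms, packet_interval_ms):
    """Variation width of the relative delay, in milliseconds."""
    if len(arrival_ms) < 2:
        return 0
    first = arrival_ms[0]
    # Relative delay of packet i against the ideal arrival time first + i * interval.
    relative = [t - (first + i * packet_interval_ms) for i, t in enumerate(arrival_ms)]
    return max(relative) - min(relative)


def target_storage_samples(fluctuation_ms, sample_rate_hz, margin_samples=0):
    """Samples to hold so that playback can ride out the measured fluctuation."""
    # Ceiling division keeps the arithmetic in integers.
    needed = -(-fluctuation_ms * sample_rate_hz // 1000)
    return needed + margin_samples
```

For packets sent every 20 ms that arrive at 0, 20, 45, 60 and 90 ms, the measured fluctuation is 10 ms; at 8000 samples per second and with an assumed margin of 40 samples, the target storage amount becomes 120 samples.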