1. Field of the Invention
The present invention relates to an audio playback apparatus and a method for controlling the pausing and resuming of audio. In particular, the present invention relates to an audio playback apparatus used for conversation on an IP (Internet Protocol) phone and in IP video telephony.
2. Description of the Related Art
FIG. 1 illustrates a functional configuration of an earlier digital audio playback apparatus that is not publicly known.
The earlier digital audio playback apparatus 1 shown in FIG. 1 is configured to receive packets containing compressed audio data and to play back the audio in the packets. The apparatus 1 has an audio packet receiver section 11, an audio decoder 12, a buffer 13, a switch 14, a D/A converter section 15, an amplifier 16 and an initial buffering judgment section 17.
The audio packet receiver section 11 receives packets containing compressed audio data from the network, and transmits the compressed audio data to the audio decoder 12. The audio decoder 12 decodes the compressed audio data into non-compressed PCM (pulse code modulation) data and outputs the PCM data to the buffer 13. The buffer 13 temporarily stores the PCM data and outputs it to the D/A converter section 15 through the switch 14. The initial buffering judgment section 17 monitors the amount of data stored or buffered in the buffer 13 and turns the switch 14 on or off based upon the buffered data amount. The D/A converter section 15 converts the PCM data input through the switch 14 into an analog signal and outputs the converted analog signal to the amplifier 16. The analog audio signal output from the amplifier 16 is provided to a speaker 2 to play back the audio.
The initial buffering judgment section 17 is provided with a NOR gate 171, a comparator 172, an RS flip-flop 173 and an initial buffering value storage 174. The NOR gate 171 outputs an “H” level signal to the flip-flop 173 when the buffered data amount becomes zero. The comparator 172 compares the buffered data amount in the buffer 13 with the initial buffering value stored in the storage 174 and outputs an “H” level signal to the flip-flop 173 when the buffered data amount becomes equal to or greater than the initial buffering value. The flip-flop 173 turns the switch 14 on or off depending upon whether it is set or reset.
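The decision logic of the judgment section 17 can be modeled in software. The following is a minimal sketch, assuming the buffered data amount is available as an integer and the initial buffering value is at least 1; the class name `InitialBufferingJudgment` and its method are hypothetical and do not appear in FIG. 1.

```python
from dataclasses import dataclass


@dataclass
class InitialBufferingJudgment:
    """Software model of the judgment section 17 (hypothetical name)."""

    initial_buffering_value: int  # contents of the storage 174 (assumed >= 1)
    flip_flop_set: bool = True    # RS flip-flop 173; set means switch 14 is off

    def update(self, buffered_amount: int) -> bool:
        """Feed the current buffered amount; return True if switch 14 is on."""
        # NOR gate 171: "H" only when the buffered amount is zero.
        nor_output = buffered_amount == 0
        # Comparator 172: "H" when the amount reaches the initial buffering value.
        comparator_output = buffered_amount >= self.initial_buffering_value

        if nor_output:
            self.flip_flop_set = True    # set: switch 14 turns off
        elif comparator_output:
            self.flip_flop_set = False   # reset: switch 14 turns on
        # Otherwise the flip-flop holds its previous state.
        return not self.flip_flop_set


judgment = InitialBufferingJudgment(initial_buffering_value=4)
states = [judgment.update(amount) for amount in (0, 2, 4, 1, 0, 3)]
```

In this sketch the switch stays on while the amount lies between zero and the initial buffering value (the amount 1 after 4), because the flip-flop holds its state, and it stays off at the amount 3 after an underflow until the threshold is reached again.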
The operation of this audio playback apparatus is described below.
Under initial conditions, no PCM data is stored in the buffer 13. Therefore, “0” is input to the NOR gate 171 and its output becomes an “H” level signal. As a result, the flip-flop 173 is set and turns the switch 14 off, so that the PCM data output from the audio decoder 12 is stored in the buffer 13 without being output to the D/A converter section 15.
When the buffered amount of the PCM data in the buffer 13 becomes equal to or greater than the initial buffering value, the comparator 172 outputs an “H” level signal. Thus, the flip-flop 173 is reset and the switch 14 turns on, so that the PCM data stored in the buffer 13 is provided to the D/A converter section 15 and the analog audio signal is provided to the speaker 2 through the amplifier 16 to play back the audio.
In such an audio playback apparatus, where packets containing audio data are received and the audio in the packets is played back while other packets are being received, the arrival timing of the packets may vary due to changes in the transmission rate through the network, and it may happen that no packet is received for a period longer than the playback time equivalent to the buffered amount. In such a case, all the buffered data is drained from the buffer, which falls into an underflow state.
When the buffered amount of the PCM data in the buffer 13 becomes zero, that is, when the buffer underflows, the output of the NOR gate 171 becomes the “H” level signal, the flip-flop 173 is set and thus the switch 14 turns off. Thus, the PCM data output from the audio decoder 12 is not fed to the D/A converter section 15 but is stored in the buffer 13. Then, when the buffered amount of the PCM data in the buffer 13 becomes equal to or greater than the initial buffering value, the switch 14 turns on.
During a period in which the switch 14 is in the off state, playback of audio is paused and therefore a break or interruption of voice occurs. This operation of the buffer during the off state of the switch is called a re-buffering operation.
If the audio packets arrive without delay, the buffered amount of the data will not fall below the initial buffering value. However, if the arrival of the audio packets is delayed, the buffered level decreases. If the delay continues, it will cause an underflow, and a break or interruption of voice will occur due to the re-buffering operation. Then, when the delay in arrival of the audio packets is over and the delayed packets arrive all at once, the buffered data amount will abruptly increase.
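The behavior described above (steady playback, a falling buffer level during a delay, underflow, re-buffering, and an abrupt rise when delayed packets arrive together) can be illustrated with a small discrete-time simulation. This is a sketch under stated assumptions: time advances in ticks, one PCM frame is played per tick while the switch 14 is on, and `arrivals[t]` is the number of decoded frames written to the buffer 13 at tick `t`. The function name and the example arrival pattern are hypothetical.

```python
def simulate_buffer(arrivals, initial_buffering_value):
    """Return (played, levels): per-tick playback flags and buffer levels.

    Assumes initial_buffering_value >= 1 and one frame consumed per tick
    while the switch is on.
    """
    buffered = 0
    switch_on = False  # flip-flop 173 starts set, so switch 14 starts off
    played, levels = [], []

    for frames_in in arrivals:
        buffered += frames_in
        if buffered >= initial_buffering_value:
            switch_on = True           # comparator 172 resets the flip-flop
        play = switch_on
        if play:
            buffered -= 1              # one frame goes to the D/A converter
            if buffered == 0:
                switch_on = False      # NOR gate 171 sets the flip-flop: underflow
        played.append(play)
        levels.append(buffered)
    return played, levels


# Five on-time frames, two ticks with no packets, one frame, then a burst of four.
arrivals = [1, 1, 1, 1, 1, 0, 0, 0, 1, 4, 1, 1]
played, levels = simulate_buffer(arrivals, initial_buffering_value=3)
print(played)
print(levels)
```

With this arrival pattern, playback starts at tick 2, the level falls to zero at tick 6, playback is paused during ticks 7 and 8 even though a frame arrives at tick 8, and the level jumps to 4 when the burst arrives at tick 9.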
The initial buffering operation and the re-buffering operation should be carried out for a sufficiently long time that the buffered data amount does not underflow again. In real-time applications such as Voice over IP (VoIP) or IP video telephony, it is necessary to perform the re-buffering operation for a period of one hundred milliseconds to several hundred milliseconds in consideration of the tradeoff between resiliency against delay variation and real-time responsiveness. In non-real-time applications such as video streaming, by contrast, the period of the re-buffering operation is generally set to several seconds because stability is given particular importance.
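To relate these buffering periods to the initial buffering value held in the storage 174, the period can be converted into an amount of PCM data. The sketch below assumes 8 kHz, 16-bit, single-channel PCM; this format is an assumption chosen for illustration and is not stated for the apparatus. The function name is hypothetical.

```python
def initial_buffering_value_bytes(period_ms, sample_rate_hz=8000,
                                  bytes_per_sample=2, channels=1):
    """Return the number of PCM bytes that hold `period_ms` of audio.

    The defaults (8 kHz, 16-bit, mono) are illustrative assumptions.
    """
    return period_ms * sample_rate_hz * bytes_per_sample * channels // 1000


# 100 ms and 300 ms for real-time use, 3 s for a streaming-style setting.
for period_ms in (100, 300, 3000):
    print(period_ms, "ms ->", initial_buffering_value_bytes(period_ms), "bytes")
```

Under these assumptions a 100 ms period corresponds to 1600 bytes, and a 3 second period corresponds to 48000 bytes.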
However, in applications for voice communication such as VoIP or IP video telephony, a break or interruption of voice lasting longer than one hundred milliseconds is clearly perceptible and degrades the quality of audio communications. Therefore, in order to improve the audio quality in an audio playback system that receives packets containing audio data and plays back the audio in the packets, it is necessary to shorten the break or interruption of voice caused by the re-buffering.
As a known technique for voice buffering in voice information communication, International Publication No. WO 01/01614 A1 discloses a system that changes the delay on a communication link by adjusting the relative positions of the read and write pointers of a buffer during silent periods. However, this known technique cannot shorten the break or interruption of voice caused by the re-buffering.