1. Field of the Invention
The present invention relates to an IP (Internet Protocol) telephony system, a VoIP (Voice over Internet Protocol) terminal, and a method and a program for reproducing a hold sound or an audible sound used in the IP telephony system and the VoIP terminal. The present invention relates in particular to an improvement in the method for reproducing a hold sound or an audible sound, in which the hold sound or audible sound is accumulated in the VoIP terminal in the payload format of an RTP (Real-time Transport Protocol) packet and is reproduced. Examples of the VoIP terminal include a media gateway, a media converter, an IP telephone and the like, which are call controlled by a multimedia gateway controller of the IP telephony system through the Internet, an intranet, and a LAN (Local Area Network), respectively.
2. Description of the Related Art
A conventional VoIP terminal such as a media gateway, a media converter, or an IP telephone in an IP telephony system has a DSP (Digital Signal Processor) for converting an RTP packet from an IP network into a PCM (Pulse Code Modulation) signal, a hold sound source and an audible sound source, a sound source data selection function for selecting data to be reproduced from the data stored in each sound source, a CODEC for converting the selected data into a PCM signal, and a selector function for selecting either the PCM output from the DSP or the PCM output from the CODEC, depending on whether voice or a hold sound or audible sound is to be reproduced. When the hold sound and the audible sound are stored in a PCM signal format, the CODEC may be unnecessary (Japanese Patent Laid-Open Publication No. 2000-59471 and the like).
A method for reproducing a hold sound or an audible sound in the VoIP terminal of the IP telephony system will be described with reference to FIGS. 1 and 2. FIG. 1 shows the structure of the conventional VoIP terminal, and FIG. 2 shows the operation of the conventional VoIP terminal.
Referring to FIG. 1, a multimedia gateway controller (MGC) 1 at least comprises a main processor 11, a memory 13, and a LAN (local area network, such as Ethernet (R)) IF (interface) 12.
A VoIP terminal 8 at least comprises a LAN-IF 81, a CPU 82, a memory 83, a call control function 84, a jitter buffer control function 85, a DSP control function 86, a selector control function 87, a hold sound or audible sound source data selection function 88, a jitter buffer 89, a DSP 90, a hold sound source and audible sound source 91, a CODEC 92, and a selector 93. The multimedia gateway controller 1 is connected to the VoIP terminal 8 through a LAN 100.
Next, the operation of the method for reproducing a hold sound or an audible sound in the conventional VoIP terminal 8 will be described with reference to FIG. 2. By way of example, a flow will be described for a case where the VoIP terminal 8, which is already engaged in voice communication, sends out the hold sound.
The CPU 82 of the VoIP terminal 8 inputs voice RTP packets, which are received through the LAN-IF 81, into the jitter buffer 89 by use of the call control function 84, the jitter buffer control function 85, and the DSP control function 86 (step S81 of FIG. 2). The jitter buffer 89 absorbs variation in network delay (jitter), and then writes the RTP packets into the DSP 90 at regular intervals (for example, intervals of 10 ms). A voice PCM signal outputted from the DSP 90 is inputted into the selector 93 (a state of voice communication) (step S82 of FIG. 2).
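The buffering behavior described above can be illustrated with a minimal sketch. It is not taken from the conventional terminal itself: the names RtpPacket, JitterBuffer, push, and pop_for_tick, as well as the start-up depth of two packets, are assumptions introduced only for illustration, and sequence-number wraparound and packet loss handling are deliberately ignored.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(order=True)
class RtpPacket:
    """Hypothetical, simplified RTP packet: only a sequence number and a payload."""
    seq: int
    payload: bytes = field(compare=False)


class JitterBuffer:
    """Illustrative jitter buffer: packets may arrive in any order and are
    released in sequence order, one per regular tick (e.g. every 10 ms)."""

    def __init__(self, depth: int = 2) -> None:
        self._heap: List[RtpPacket] = []  # min-heap ordered by sequence number
        self._depth = depth               # packets to accumulate before playout starts
        self._started = False

    def push(self, packet: RtpPacket) -> None:
        """Called whenever a packet arrives from the network."""
        heapq.heappush(self._heap, packet)

    def pop_for_tick(self) -> Optional[RtpPacket]:
        """Called once per tick; returns the next packet for the DSP, or None."""
        if not self._started:
            if len(self._heap) < self._depth:
                return None               # still pre-filling the buffer
            self._started = True
        if not self._heap:
            return None                   # underrun: nothing to hand to the DSP
        return heapq.heappop(self._heap)
```

In an actual terminal, pop_for_tick would be driven by a periodic timer and its result written to the DSP; here the caller is assumed to invoke it once per interval.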
When the call control function 84 operates, the CPU 82 determines to perform voice communication, and controls the selector 93 so as to select the voice PCM signal from the DSP 90 as PCM signal output from the selector 93 by using the selector control function 87 (step S83 of FIG. 2).
To send out the hold sound from this state, the CPU 82 selects designated hold sound data from the hold sound source and audible sound source 91 by use of the hold sound or audible sound source data selection function 88, and inputs the hold sound data into the CODEC 92. The CODEC 92 converts the selected hold sound data into a hold sound PCM signal and inputs the signal into the selector 93 (steps S84 and S85 in FIG. 2).
When the call control function 84 operates and the CPU 82 determines to send out the hold sound, the CPU 82 controls the selector 93 so as to select the hold sound PCM signal as the PCM signal output from the selector 93 by using the selector control function 87 (a state of hold sound reproduction and sending) (steps S86 to S88 of FIG. 2).
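The switching between the two PCM paths described above can be summarized in a short sketch. The names Source and Selector and the byte-string representation of PCM data are hypothetical and serve only to model the behavior of the selector 93 under control of the selector control function 87; they do not describe the actual hardware.

```python
from enum import Enum


class Source(Enum):
    """Which PCM input the selector passes through (hypothetical names)."""
    VOICE = "voice"  # PCM signal from the DSP (decoded RTP packets)
    HOLD = "hold"    # PCM signal from the CODEC (hold sound or audible sound)


class Selector:
    """Illustrative model of a two-input PCM selector."""

    def __init__(self) -> None:
        # The terminal is assumed to start in the voice communication state.
        self.source = Source.VOICE

    def select(self, source: Source) -> None:
        """Models the CPU switching the selector via a control function."""
        self.source = source

    def output(self, dsp_pcm: bytes, codec_pcm: bytes) -> bytes:
        """Returns whichever PCM input is currently selected."""
        return dsp_pcm if self.source is Source.VOICE else codec_pcm


selector = Selector()
voice_out = selector.output(b"voice-pcm", b"hold-pcm")  # voice communication state
selector.select(Source.HOLD)                            # call control decides to hold
hold_out = selector.output(b"voice-pcm", b"hold-pcm")   # hold sound sending state
```

The sketch makes visible that both PCM paths must exist at the same time, which is why the conventional structure needs the CODEC and the selector in addition to the DSP.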
In the conventional method for reproducing the hold sound or audible sound, the hold sound is reproduced in the VoIP terminal 8 by the structure and the operation described above. The audible sound is reproduced in a similar manner.
According to the foregoing method for reproducing the hold sound or audible sound in the conventional VoIP terminal, two components are necessary in addition to the DSP for converting the RTP packet into the PCM signal: the CODEC for converting a signal from the sound source of the hold sound and audible sound into the PCM signal, and the RTP/hold sound or audible sound selector for selecting between the PCM signal, such as voice, outputted from the DSP and the PCM signal outputted from the CODEC or from the sound source of the hold sound and audible sound. The CODEC and the selector increase cost.
To hold the sound source of the hold sound and the audible sound, a non-volatile memory or a specialized LSI (Large-Scale Integration) is often used. In this case, however, it is difficult to copy an arbitrary hold sound or audible sound, the requirements for which differ from user to user or from country to country, from a download server to the VoIP terminal for use. Therefore, there is a problem that the conventional method cannot flexibly meet users' needs.
When the sound source of the hold sound or the audible sound is held in a volatile memory, on the other hand, it is necessary to provide a memory for storing the sound source of the hold sound and the audible sound separately from the memory used for executing the program of the VoIP terminal, and hence there is a problem of increased cost.