The present invention relates to the field of transcoding of voice data. Transcoding is the process of converting voice data from one format into another.
Internet telephony, known as voice over IP (VOIP), is becoming a realistic, cost-effective alternative to the traditional public switched telephone network (PSTN).
In general, most VOIP applications use a voice encoding format that is different from the voice encoding format used by the PSTN. Because of the different voice formats used, many of the functionalities that exist for the PSTN are not available to VOIP applications unless the functionality is built directly into the VOIP application.
Voice converters and transcoders (VCs) that convert voice data from one format to another are known and can be used to convert data supplied by a VOIP application to PSTN format, allowing the VOIP application to use PSTN functionalities such as automatic speech recognition (ASR) and text-to-speech conversion (TTS). The VC can also convert the output of a PSTN functionality to VOIP format. Existing VCs provide such a service by using dedicated digital signal processor (DSP) resources. A dedicated DSP resource is an entity that is allocated to the voice channel at the very beginning of a process and remains allocated as long as the channel is in use. The DSP resources used to perform the transcode operation are full duplex. Both PSTN and VOIP networks are also full duplex in nature. Hence, to handle a full duplex network, a full duplex DSP resource has been created, dedicated, and used.
Although almost all DSP resources are full duplex in nature, most human interaction is half-duplex in nature, and most applications operate based on this half-duplex interaction. For example, almost all users of telecommunication applications such as voice mail and informational services do not talk and listen at the same time. Accordingly, it is not necessary for an application to dedicate and allocate full duplex DSP resources for the entire duration of the application.
Based on the above observations, all of these existing applications use the DSP resource at less than 50% of its capability. From this it is evident that an improved voice converter that can utilize the DSP resources more effectively and efficiently is required. Furthermore, such a system must be capable of handling a very large number of subscribers.
By virtue of this invention it is now possible to economically mix and match various functional components from VOIP and PSTN networks.
According to one aspect of the present invention, a method and a mechanism allow transcoding and scheduling of two independent voice data streams from two distinct subscribers onto the same full duplex DSP resource.
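As an illustration only, the sharing of one full duplex DSP resource between two subscribers might be sketched as follows. The sketch assumes that each full duplex resource can be modeled as two independently assignable half-duplex halves; the names DspResource, HalfDuplexPool, allocate, and release are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DspResource:
    """One full duplex DSP resource, modeled (as an assumption) as two halves."""
    dsp_id: int
    # halves[i] holds the subscriber currently using half i, or None if free.
    halves: List[Optional[str]] = field(default_factory=lambda: [None, None])


class HalfDuplexPool:
    """Hands out half-duplex halves so two subscribers can share one DSP."""

    def __init__(self, num_dsps: int) -> None:
        self.dsps = [DspResource(i) for i in range(num_dsps)]

    def allocate(self, subscriber: str) -> Optional[Tuple[int, int]]:
        """Return (dsp_id, half) for the first free half, or None if all are busy."""
        for dsp in self.dsps:
            for half, owner in enumerate(dsp.halves):
                if owner is None:
                    dsp.halves[half] = subscriber
                    return (dsp.dsp_id, half)
        return None

    def release(self, dsp_id: int, half: int) -> None:
        """Free a half as soon as its transcode operation is finished."""
        self.dsps[dsp_id].halves[half] = None


pool = HalfDuplexPool(num_dsps=1)
slot_a = pool.allocate("subscriber-A")  # first half of DSP 0
slot_b = pool.allocate("subscriber-B")  # second half of the same DSP
slot_c = pool.allocate("subscriber-C")  # None: both halves are in use
```

In this model a half is held only for the duration of a transcode operation rather than for the lifetime of the channel, which is what lets a single resource serve two independent streams.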
In one embodiment, the voice converter waits for a request for a conversion resource over TCP/IP. Based on the type of transcoding requested, it allocates a half-duplex resource, performs the transcode operation, and sends the output data over a UDP interface.
According to another aspect of the invention, look-ahead buffers are utilized to mask network latency and provide a continuous stream of data to the DSP resources.
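By way of a hedged example, a look-ahead buffer of this kind might be sketched as a queue that withholds data until a minimum number of frames has arrived, so that short network delays do not interrupt the stream fed to the DSP. The class name LookAheadBuffer, the prefill parameter, and the refill-on-underrun policy are assumptions made for illustration; the specification does not fix them.

```python
from collections import deque
from typing import Deque, Optional


class LookAheadBuffer:
    """Queues incoming frames and releases them only once enough are buffered."""

    def __init__(self, prefill: int) -> None:
        self.prefill = prefill          # frames to accumulate before playout starts
        self.frames: Deque[bytes] = deque()
        self.primed = False             # True once the prefill level has been reached

    def push(self, frame: bytes) -> None:
        """Called when a frame arrives from the network."""
        self.frames.append(frame)
        if len(self.frames) >= self.prefill:
            self.primed = True

    def pop(self) -> Optional[bytes]:
        """Called when the DSP is ready for the next frame."""
        if not self.primed:
            return None                 # still filling the look-ahead window
        if not self.frames:
            self.primed = False         # underrun: wait for the window to refill
            return None
        return self.frames.popleft()


buf = LookAheadBuffer(prefill=2)
buf.push(b"frame-0")
first_try = buf.pop()    # None, because only one frame has been buffered
buf.push(b"frame-1")
second_try = buf.pop()   # b"frame-0", because the window is now primed
```

A larger prefill value tolerates longer network delays at the cost of added start-up latency.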
According to another aspect of the invention, data is transferred in packets having session numbers. The session numbers are utilized to identify different data streams using a single DSP resource.
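The use of session numbers can be illustrated with a minimal sketch, assuming a hypothetical packet layout in which a 16-bit session number in network byte order precedes the voice payload. The layout and the names pack_packet, unpack_packet, and demultiplex are illustrative assumptions, not the format defined by the specification.

```python
import struct
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical header: one unsigned 16-bit session number, network byte order.
HEADER = struct.Struct("!H")


def pack_packet(session: int, payload: bytes) -> bytes:
    """Prefix the payload with its session number."""
    return HEADER.pack(session) + payload


def unpack_packet(packet: bytes) -> Tuple[int, bytes]:
    """Split a packet into its session number and payload."""
    (session,) = HEADER.unpack_from(packet)
    return session, packet[HEADER.size:]


def demultiplex(packets: Iterable[bytes]) -> Dict[int, bytes]:
    """Reassemble each stream from interleaved packets, keyed by session number."""
    streams: Dict[int, bytearray] = defaultdict(bytearray)
    for packet in packets:
        session, payload = unpack_packet(packet)
        streams[session] += payload
    return {session: bytes(data) for session, data in streams.items()}


# Two streams sharing one DSP resource arrive interleaved.
interleaved = [
    pack_packet(7, b"aa"),
    pack_packet(9, b"xx"),
    pack_packet(7, b"bb"),
]
streams = demultiplex(interleaved)
```

Because every packet carries its own session number, the data of the two streams can be separated again after passing through the shared resource, regardless of the order in which the packets were interleaved.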
A further understanding of the nature and advantages of the invention herein may be realized by reference to the remaining portions of the specification and the attached drawings.