During a voice call it is often difficult to understand the other party due to a noisy environment, especially when using mobile equipment on public transport or in public places. Often the only option is to repeat a phrase, increase the voice volume, or move to a quieter location. A user who cannot understand the speaking partner may no longer be able to follow the conversation or to contribute further during the call. Misunderstanding the speaking partner may have severe consequences.
Currently, it is possible to translate voice into text in real time and to trigger a defined action. Examples of this are the Siri (Speech Interpretation and Recognition Interface) application on the Apple iPhone and the built-in voice control application of a Windows computer.
These applications allow sampling a voice command, translating it to text, deriving the meaning of the text, and finally triggering an action on the device. However, it is not yet possible to additionally follow a phone conversation as a written dialog on the device display by means of a network service.
A problem with existing solutions for speech-to-text translation is thus that the service depends on the application or operating system, the device, and the manufacturer. An integrated IMS (IP Multimedia Subsystem) service for displaying any form of transcription is missing from current telephony service offerings.
In telecommunications networks, e.g., in cellular networks as specified by 3GPP (3rd Generation Partnership Project), communication services may be provided to a user equipment (UE) on the basis of Internet Protocol (IP) transport channels. One example of such communication services is a voice call established through infrastructure of the network referred to as IP Multimedia Subsystem (IMS). In this case, an IMS node referred to as Proxy Call Session Control Function (P-CSCF) may interact with IP based transport infrastructure of the network, e.g., referred to as Evolved Packet Core (EPC), so as to provide IP based bearers for carrying user plane traffic of the voice call to or from the UE. As for example defined in 3GPP Technical Report 21.905, such a bearer may be regarded as an information transmission path having defined characteristics, such as capacity, delay, bit error rate, or the like. Other IP based communication services which may be provided through the IMS are video call services, chat services, and mobile TV services.
Accordingly, there is a need for a network based technique which allows for transcribing a communication session in a communication network.