1. Technical Field of the Invention
The present invention relates, in general, to improved IP-based communication and, particularly, to a system and method for providing speech-generated text to IP-based telephone communication.
2. Background and Objects of the Invention
There is currently a movement towards enhancing the capabilities of computer networks, such as the Internet, to support traditional telephony operations. The goal is to provide quality voice communications over the packetized Internet. This capability is often referred to as voice over Internet Protocol (VoIP). With the current Internet, for example, VoIP can provide acceptable voice communications at a greatly reduced cost to the user, when compared to traditional telephone tolls.
Recently, IP-based telephone systems have become capable of providing video along with voice communication. Standard personal computer (PC) plug-in hardware, such as Picturephone by 3com, and various software applications provided for Internet telephony, such as CU-SeeMe software by White Pine and software by Vocaltech and Microsoft, allow transport of both voice and image data across the Internet. In particular, such systems typically include a microphone, a speaker and a PC plug-in sound card for providing audio data to a user's PC, and a video camera and a PC plug-in video capture card for providing video data thereto. Upon the establishment of a connection between two or more PCs over the Internet, the audio and video data generated in one PC is packetized and transported over the Internet for display on the other PC, such as within a browser framework. In this way, PC users may view each other while simultaneously speaking to each other.
Existing IP-based telephone systems having video communication capability typically allow a PC user to communicate text to the other PC user in communication therewith. Text entered via the keyboard of a first PC may be transported with the audio and video data over the Internet and displayed as text in a browser window on another PC in communication with the first PC.
These IP-based telephone systems are not without shortcomings. For instance, VoIP usually provides a markedly reduced quality of service relative to conventional long distance telephone services. Poorer voice quality, intermittent fading, and other interruptions are commonly encountered, especially during international calls.
In response to periods of reduced quality of service, callers utilizing IP-based telephone systems often resort, out of frustration, to communicating with typed text instead of attempting to communicate with voice. To further compound the problem, typed text data is transported at a slower rate than voice data. As a result, it is oftentimes quite difficult to engage in communication when one caller is providing voice data and the other caller is providing typed text data, because the transmissions of voice and typed text are out of sync.
In the context of IP-based telephone systems having video communication capabilities, having to communicate with typed text during periods of reduced quality of voice communication may result in a relatively inexperienced typist being forced to look away from the video display when entering text, thereby reducing the value in being provided a real-time video display of the other caller. As a result, there is a need for an IP-based telephone system which addresses the inherent problems associated with VoIP telephone communication.
It is an object of the present invention to provide an IP-based telephone system which selectively and automatically provides text during periods of substandard voice communication.
Another object of the present invention is to provide an IP-based telephone system which allows a caller to view received video data while concurrently transmitting video data and speech-generated text data.
The present invention overcomes the shortcomings in the above-identified systems and satisfies a significant need for an IP-based telephone system having enhanced and easily utilized communication features.
According to a first embodiment of the present invention, there is provided an improved IP-based telephone system. The telephone system includes a combination of hardware, software and/or firmware employed in association with a conventional PC in order to perform VoIP communication. A video capture card communicatively connected to a PC preferably receives video input data from a video source as well as from a VoIP transmission. A sound card communicatively connected to the PC preferably receives audio input from a microphone as well as from a VoIP transmission. A speech recognition device operatively associated with the PC preferably receives the microphone audio data, recognizes speech patterns therein and generates text data representing the recognized speech patterns. The generated text data is included with the microphone-provided audio data and the video camera-generated video data for transport over the Internet to another PC. In this way, a PC user may communicate video, voice and voice-generated text data to another PC user.
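By way of illustration only, the combining of microphone audio data, camera video data and speech-generated text data into a single unit for transport may be sketched as follows. The sketch is not the disclosed implementation: the `MediaPacket` type, the length-prefixed byte layout, the `build_packet` helper and the `recognize` callable (standing in for the speech recognition device) are all assumptions introduced here for clarity.

```python
import struct
from dataclasses import dataclass
from typing import Callable

# Hypothetical header: sequence number plus three payload lengths,
# each an unsigned 32-bit integer in network byte order.
_HEADER = "!IIII"


@dataclass
class MediaPacket:
    """One transport unit carrying audio, video and speech-generated text."""
    seq: int
    audio: bytes
    video: bytes
    text: str

    def to_bytes(self) -> bytes:
        # Encode the text, then prefix all payloads with their lengths.
        text_bytes = self.text.encode("utf-8")
        header = struct.pack(
            _HEADER, self.seq, len(self.audio), len(self.video), len(text_bytes)
        )
        return header + self.audio + self.video + text_bytes

    @classmethod
    def from_bytes(cls, data: bytes) -> "MediaPacket":
        # Read the header, then slice each payload out by its length.
        seq, a_len, v_len, t_len = struct.unpack_from(_HEADER, data, 0)
        offset = struct.calcsize(_HEADER)
        audio = data[offset:offset + a_len]
        offset += a_len
        video = data[offset:offset + v_len]
        offset += v_len
        text = data[offset:offset + t_len].decode("utf-8")
        return cls(seq, audio, video, text)


def build_packet(
    seq: int, audio: bytes, video: bytes, recognize: Callable[[bytes], str]
) -> MediaPacket:
    """Run the (assumed) recognizer on the audio and bundle all three streams."""
    return MediaPacket(seq, audio, video, recognize(audio))
```

In this sketch the text travels in the same unit as the audio from which it was generated, so a receiving PC can present the two together without separate synchronization.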
The present system preferably further includes an application which receives VoIP video data and speech-generated text data which were transmitted by another PC. In the event the other PC transmits a signal or set of signals having video data, audio/voice data and speech-generated text data, the application preferably displays the video data and produces audible signals from the audio/voice data using a speaker. In addition, the application preferably presents the speech-generated text concurrently with the displayed video data. In this way, the PC user is able to read the speech-generated text while viewing the video data during periods when the VoIP audio signal transmission falters.
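The receiving behavior described above may be sketched, again purely as an illustration, as follows. The `ReceivedFrame` and `Presentation` types and the `present` function are hypothetical names; the lists stand in for a video display, a speaker and a text area, and a missing audio payload is assumed to model a period in which the VoIP audio transmission falters.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReceivedFrame:
    """Data received from the other PC for one interval of the call."""
    video: bytes
    audio: Optional[bytes]  # None models audio lost in transit (assumption)
    text: str               # speech-generated text for the same interval


@dataclass
class Presentation:
    """Stand-ins for the display, the speaker and the text area."""
    shown_video: List[bytes] = field(default_factory=list)
    played_audio: List[bytes] = field(default_factory=list)
    captions: List[str] = field(default_factory=list)


def present(frame: ReceivedFrame, out: Presentation) -> None:
    # The video is always displayed.
    out.shown_video.append(frame.video)
    # Audio is played only when it actually arrived.
    if frame.audio:
        out.played_audio.append(frame.audio)
    # The text is presented concurrently with the video, so it remains
    # readable even when the audio for this interval is missing.
    if frame.text:
        out.captions.append(frame.text)
```

For example, presenting one frame with audio and a second frame without audio leaves both captions available to the user while only the first audio payload is played.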