1. Field of the Invention
The present invention relates to the design of digital recording and playback systems. More specifically, the present invention pertains to the processing of voice and concurrent generation of corresponding text in a portable digital appliance.
2. Related Art
Portable digital recording and playback devices are quickly gaining popularity in business and among individual users. In particular, one attractive feature of digital recording is the possibility of converting voice messages into text, which can then be reviewed, revised and incorporated into documents or otherwise retrieved for subsequent use. Today, there are several models of portable digital recorders in the marketplace. These prior art recorders typically record voice messages as compressed digital data. In order to convert the compressed digital data to text data, a separate computer program is generally required. Thus, in the prior art, subsequent to a recording session, the user needs to post-process the compressed digital data to perform the voice-to-text conversion. This requires additional processing time, and in some cases even requires the user to transfer the compressed digital data from the portable device to a personal computer (PC) having the necessary software program before the conversion can be performed. It is desirable to eliminate the extra step of post-recording conversion from compressed digital data to text data in a portable digital recording and playback system.
These prior art devices are not well-suited for generating text data from the recorded voice data for an additional reason. In order to achieve good conversion from voice to text, a high quality voice input to the voice-to-text conversion engine is needed. In prior art portable systems, the voice data is subject to high compression because portable systems typically have limited memory capacity, and high compression allows more voice data to be stored in the limited memory resources. Since voice data is stored in a highly compressed format in these portable prior art devices, the text data generated directly from the compressed voice data by a conversion program is usually unsatisfactory. As such, it is highly advantageous to have a portable digital recording and playback system which provides high quality conversion from voice to text.
Furthermore, portable devices are typically battery-powered. Thus, the need to conserve power is a major design consideration. As such, while a high capacity storage device can potentially be used in a large, non-portable device deriving its power from a power outlet to improve the quality of the conversion from compressed voice data to text data, it is not a viable option in a portable device. Therefore, there exists a need for a portable digital recording and playback system which provides high quality conversion from voice to text and yet does not require a high rate of power consumption.
In implementing a viable portable digital recording and playback system, it is highly desirable that components that are well known in the art and are compatible with existing computer systems and other appliances be used so that the cost of realizing the portable digital recording and playback system is low. By so doing, the need to incur costly expenditures for retrofitting existing computer systems and other appliances or for building custom components is advantageously eliminated.
Thus, a need exists for a portable digital recording and playback system which does not require post-recording conversion to generate text data from compressed digital data. A further need exists for a portable digital recording and playback system which meets the above need and which provides high quality conversion from voice to text. Still another need exists for a portable digital recording and playback system which meets both of the above needs and which does not require a high level of power consumption. Yet another need exists for a portable digital recording and playback system which meets all of the above needs and which is conducive to use with existing computer systems and other appliances.
Accordingly, the present invention provides a portable digital recording and playback system which generates text data from voice without requiring post-recording conversion from compressed digital data to text data. The present invention further provides a portable digital recording and playback system in which the voice-to-text conversion not only requires no post-processing but is also of high quality. Embodiments of the present invention perform voice-to-text conversion using the high quality audio input signal rather than highly compressed voice data so that high quality conversion is achieved. Moreover, the present invention provides a portable digital recording and playback system which includes the above features and which conserves power for full battery operation. Furthermore, embodiments of the present invention utilize components that are well known in the art and are compatible with existing computer systems and other appliances, so that the present invention is conducive to use with existing computer systems and other appliances. These and other advantages of the present invention not specifically mentioned above will become clear within discussions of the present invention presented herein.
More specifically, in one embodiment of the present invention, a digital recording and playback system is provided. In this embodiment, the system comprises an audio capturing device configured to receive a voice input. The system also comprises a high compression encoder (HCE) coupled to the audio capturing device and configured to generate digital wave data corresponding to the voice input. The system further comprises a voice recognition engine (VRE) coupled to the audio capturing device and configured to generate text data corresponding to the voice input. Moreover, in this embodiment, the HCE and VRE are selectively coupled to a memory sub-system which is configured to store the digital wave data and the text data. In particular, in this embodiment, the HCE and the VRE are operable to concurrently generate the digital wave data and the text data in response to the voice input such that the digital wave data and the text data can be stored in the memory sub-system in a synchronized manner. Thus, in this embodiment, the present invention provides recording capability wherein text data is generated from a voice input without requiring post-recording conversion. In a specific embodiment, the present invention includes the above and wherein the system is battery-powered.
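By way of illustration only, and not as a limitation, the concurrent recording path described above can be sketched in software. The sketch assumes that the voice input arrives as a sequence of uncompressed frames and that each frame carries an index used to keep the two outputs synchronized. The names `hce_encode`, `vre_recognize`, `MemorySubsystem` and `record` are hypothetical and do not appear in this disclosure; `zlib` merely stands in for a speech codec, and the recognizer returns placeholder text rather than performing recognition.

```python
import queue
import threading
import zlib


def hce_encode(frame: bytes) -> bytes:
    # Hypothetical stand-in for the HCE. zlib is NOT a speech codec;
    # it is used here only so the sketch runs.
    return zlib.compress(frame, 9)


def vre_recognize(frame: bytes) -> str:
    # Hypothetical stand-in for the VRE. A real engine would return words;
    # this placeholder only reports how much raw audio it was given.
    return f"<text for {len(frame)} bytes>"


class MemorySubsystem:
    """Stores wave data and text data under a shared frame index."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.wave = {}
        self.text = {}

    def store_wave(self, index: int, data: bytes) -> None:
        with self._lock:
            self.wave[index] = data

    def store_text(self, index: int, data: str) -> None:
        with self._lock:
            self.text[index] = data


def _worker(inbox: queue.Queue, process, store) -> None:
    # Consume (index, frame) pairs until the None sentinel arrives.
    while True:
        item = inbox.get()
        if item is None:
            break
        index, frame = item
        store(index, process(frame))


def record(frames, memory: MemorySubsystem) -> MemorySubsystem:
    hce_q: queue.Queue = queue.Queue()
    vre_q: queue.Queue = queue.Queue()
    workers = [
        threading.Thread(target=_worker, args=(hce_q, hce_encode, memory.store_wave)),
        threading.Thread(target=_worker, args=(vre_q, vre_recognize, memory.store_text)),
    ]
    for w in workers:
        w.start()
    # Both paths receive the same uncompressed frame, so the recognizer
    # never operates on compressed data.
    for index, frame in enumerate(frames):
        hce_q.put((index, frame))
        vre_q.put((index, frame))
    hce_q.put(None)
    vre_q.put(None)
    for w in workers:
        w.join()
    return memory
```

In this sketch the shared frame index plays the role of the synchronization between the digital wave data and the text data; an actual embodiment may synchronize the two in a different manner.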
Additional embodiments of the present invention include the above and further comprise a decoder selectively coupled to the memory sub-system and configured to decode the digital wave data into decoded audio data, a digital-to-analog (D/A) converter coupled to the decoder and configured to convert the decoded audio data into an analog signal, and an audio output device coupled to the D/A converter and configured to generate a voice output corresponding to the voice input from the analog signal. Moreover, these embodiments also comprise a display sub-system selectively coupled to the memory sub-system and configured to display the text data. Thus, in these embodiments, the present invention provides simultaneous voice playback and text display.
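Again by way of illustration only, the playback path can be sketched as a loop that decodes each stored segment of wave data and pairs it with the text stored for the same segment. The sketch assumes that wave data and text data are held in two mappings keyed by a common segment index; that layout, and the names `decode` and `playback`, are hypothetical. `zlib` stands in for the decoder, and the D/A converter, audio output device and display sub-system are not modeled; the caller would pass each yielded pair on to them.

```python
import zlib
from typing import Dict, Iterator, Tuple


def decode(wave_data: bytes) -> bytes:
    # Hypothetical stand-in for the decoder; zlib is not a speech codec.
    return zlib.decompress(wave_data)


def playback(
    wave: Dict[int, bytes], text: Dict[int, str]
) -> Iterator[Tuple[bytes, str]]:
    # Walk the segments in recording order and yield the decoded audio
    # together with the text for the same segment, so that voice output
    # and text display can proceed in step.
    for index in sorted(wave):
        yield decode(wave[index]), text.get(index, "")
```

A segment with no stored text yields an empty string, so playback of the audio is not interrupted when recognition produced nothing for that segment.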