(1) Field of the Invention
The present invention relates generally to systems and methods for transcribing speech and, more particularly, to an automatic system that may be utilized to phonetically transcribe speech in one or more languages.
(2) Description of the Prior Art
The use of the stenograph or other shorthand systems for transcribing the discussions of important conferences, legal hearings, or governmental meetings presently requires the services of highly trained and experienced personnel. In addition, audio recording devices are often used as a backup to prevent loss of data. If more than one language is being spoken, transcribing all of the languages requires additional personnel trained in the transcription of each language used. In the multi-cultural, multi-lingual environment of the United States, transcribing the conversations of important events or meetings can therefore require a large number of highly skilled, stenographically trained people. As more and more countries become international in their business, legal, military, and other matters, the situation will not improve.
While automatic speech recognition systems are well known in the prior art, such systems require a fast computer and a highly complex program for every different language being transcribed. Speech recognition software and systems must solve the extremely difficult and complex problem of speech recognition to provide the transcription of speech to typewritten text. For the English language alone, a speech recognition program must be able to recognize hundreds of thousands of words, many of which are pronounced identically but spelled differently and have different meanings. Due to the complexity of language, speech recognition programs tend to produce many errors. Using such computer-based speech recognition programs for automatic transcription becomes even more complex if multiple languages need transcription, because each speech recognition program must be specifically tailored to a particular language. Even then, the error rates of transcription can be quite high. Therefore, it would be desirable to provide a system that does not depend on the solution to the difficult and complex problem of speech recognition to provide the transcription of speech to typewritten text.
Various inventors have attempted to solve problems related to those discussed above, as evidenced by the following patents:
U.S. Pat. No. 6,219,646 B1, issued Apr. 17, 2001, to Julius Cherny, discloses methods and apparatus for performing translations between different languages. The invention includes a translation system that performs a translation having increased accuracy by providing a three-dimensional topical dual-language database. The topical database includes a set of source-to-target language translations for each topic for which the database is used. In one embodiment, a user first selects the topic of conversation, and then words spoken into a telephone are translated and produced as synthesized voice signals from another telephone so that a near real-time conversation may be had between two people speaking different languages. An additional feature of that invention is a computer terminal that displays the input and output phrases so that either user may edit the input phrases, or indicate that the translation was ambiguous and request a rephrasing of the material.
U.S. Pat. No. 6,212,497 B1, issued Apr. 3, 2001, to Araki et al., discloses a word processor which comprises: a voice inputting device for inputting a spoken word and converting the spoken word into voice data; a voice storage device for storing the voice data; a speech recognition device for recognizing a word in the voice data output from the voice inputting device or the voice data stored by the voice storage device; a display for displaying a result obtained by the speech recognition device; an instruction inputting device for inputting an instruction to select a portion in the result; and a correction device for correcting the portion in the result according to the instruction from the instruction inputting device.
U.S. Pat. No. 6,148,105, issued Nov. 14, 2000, to Wakisaka et al., discloses a study system of a voice recognizing and translating system with a sound data base for storing data from which noise is removed; a sound analysis unit for extracting the features of the voice corresponding to the voice data stored in the sound data base; and a model learning unit for creating an acoustic model on the basis of the analysis result of the sound analysis unit. A recognition system of the voice recognizing and translating system is provided with: an acoustic model storing unit for storing acoustic models; a second sound analysis unit for extracting the feature of the voice corresponding to the data concerned on the basis of the data obtained by removing the data representing noise from the voice data of a newly input voice; and a voice collating unit for collating the voice data obtained by the second sound analysis unit with the data of the acoustic models so as to recognize the voice.
U.S. Pat. No. 6,125,341, issued Sep. 26, 2000, to Raud et al., discloses a speech recognition system having multiple recognition vocabularies, and a method of selecting an optimal working vocabulary used by the system. Each vocabulary is particularly suited for recognizing speech in a particular language, or with a particular accent or dialect. The system prompts a speaker for an initial spoken response, receives the initial spoken response, and compares the response to each of a set of possible responses in an initial speech recognition vocabulary to determine a response best matched in the initial vocabulary. A working speech recognition vocabulary is then selected from a plurality of speech recognition vocabularies, based on the best matched response.
U.S. Pat. No. 6,122,614, issued Sep. 19, 2000, to Kahn et al., discloses a system for substantially automating transcription services for multiple voice users including a manual transcription station, a speech recognition program and a routing program. The system establishes a profile for each of the voice users containing a training status which is selected from the group of enrollment, training, automated and stop automation. When the system receives a voice dictation file from a current voice user, the system routes the voice dictation file, based on the training status, to a manual transcription station and the speech recognition program. A human transcriptionist creates a transcribed file for each received voice dictation file. The speech recognition program automatically creates a written text for each received voice dictation file if the training status of the current user is training or automated. A verbatim file is manually established if the training status of the current user is enrollment or training, and the speech recognition program is trained with an acoustic model for the current user using the verbatim file and the voice dictation file if the training status of the current user is enrollment or training. The transcribed file is returned to the current user if the training status of the current user is enrollment or training, or the written text is returned if the training status of the current user is automated. An apparatus and method are also disclosed for simplifying the manual establishment of the verbatim file. A method for substantially automating transcription services is also disclosed.
U.S. Pat. No. 5,917,944, issued Jun. 29, 1999, to Wakisaka et al., discloses a study system of a character recognizing and translating system with a character data base for storing character data representing characters contained in a sensed image; a character shape analysis unit for analyzing the shape of a character to extract the features of the character-constituting elements that make up the character; and a mask learning unit for generating sample mask data of the character-constituting elements on the basis of the analysis result of the character shape analysis unit. A recognition system of the character recognizing and translating system is provided with a collating unit for collating the character data of a character to be recognized with the sample mask data so as to recognize the character.
U.S. Pat. No. 5,835,854, issued Nov. 10, 1998, to Palisson et al., discloses an RDS/TMC receiver or a traffic guidance system including a unit for indicating on a display or by speech synthesis proper names or place names, for example, alternately in the language of the user and in the language of the country the user travels through, while the other words of the message are indicated only in the user's language. The translations are found in a memory. The guidance system may be used, for example, as a guiding and/or information system for the motorist.
U.S. Pat. No. 5,751,957, issued May 12, 1998, to Hiroya et al., discloses a multi-language compatible service offering/receiving system. A service server and a service client are connected to a translation rule managing server that is connected for managing translation rules for translating information expressing forms by way of an intermediate expression form. Upon sending of information from the service server to the service client, the service server translates a specific language contained in the data to be sent out into a language of the intermediate expression by referencing the translation rules. The service client translates the intermediate expression into a specific expression by using the translation rules for displaying the data resulting from the translation. When the translation rules are unavailable in the service server and the service client, the translation rule is acquired from the translation rule managing server.
The above-described patents do not solve the problem of providing an automatic system capable of accurately transcribing and providing a written record of speech in one or more languages. Consequently, there remains a long-felt but unsolved need for an improved automatic transcription system and method. Those skilled in the art will appreciate the present invention, which addresses the above and other problems.