1. Field of the Invention
The present invention relates to a method and system for recognizing speech for searching a database, where a search request can be entered by voice.
2. Related Art
Conventional speech recognition systems are used in a range of applications. Speech recognition allows a user to enter a voice command for controlling an operation of a device, such as a telephone or a navigation system. Conventional speech recognition systems also include speech recognition software that can be run on a computer system, such as Dragon® NaturallySpeaking (which is a registered trademark of Nuance Communications, Inc.), which is software used to dictate text or to control functions of the computer. Such a program can be used to navigate a web browser by speech, and accordingly, a search query may be entered into the web browser by speech. In this manner, a database of a search engine may be searched by speech input. Yet, there are several problems with using conventional speech recognition systems to search databases by voice.
Conventional speech recognition systems only recognize words that are included in their vocabulary. The vocabulary is generally rather limited, e.g., some 10,000 words on conventional systems, and up to 500,000 words on advanced systems. Yet, larger vocabularies take longer to search and require more resources in the form of memory and processing power. Besides being limited in size, vocabularies often lack specialized expressions, names of places and persons, and other words not used in colloquial language. As a result, many entries that may be found in a database cannot be accessed by voice using a conventional speech recognition system, as these entries are not included in the vocabulary of such a system. Examples of such databases are a collection of music files whose artists or song titles are not included in the vocabulary, points of interest stored in a navigation system, addresses and names of persons in a phone book or in an Internet database, and the like. If the user wants to search the database for a word not included in the vocabulary, the speech recognition system will either select the word from its vocabulary that most closely resembles the spoken word, or simply ignore the word. As a result, conventional speech recognition systems provide very limited possibilities for searching databases.
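The closed-vocabulary behavior described above can be illustrated with a minimal sketch. It is not taken from any particular recognizer: the function name `recognize`, the sample vocabulary, and the similarity cutoff are hypothetical, and spelling similarity (via Python's standard `difflib` module) merely stands in for the acoustic similarity a real system would use.

```python
import difflib
from typing import List, Optional

# Hypothetical, deliberately tiny vocabulary; real systems hold
# thousands of words but are closed in the same way.
VOCABULARY: List[str] = ["madonna", "metallic", "music", "play", "song"]


def recognize(spoken_word: str,
              vocabulary: List[str],
              cutoff: float = 0.6) -> Optional[str]:
    """Map a spoken word onto a closed vocabulary.

    Returns the most closely resembling vocabulary word, or None when
    nothing is similar enough, in which case the word is ignored.
    Assumption: spelling similarity approximates acoustic similarity.
    """
    matches = difflib.get_close_matches(
        spoken_word.lower(), vocabulary, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


if __name__ == "__main__":
    # An in-vocabulary word, an out-of-vocabulary artist name that
    # resembles a vocabulary word, and a word that resembles nothing.
    for word in ["play", "metallica", "xyzzy"]:
        print(word, "->", recognize(word, VOCABULARY))
```

In this sketch the artist name "metallica" is absent from the vocabulary and is replaced by the similar word "metallic", while "xyzzy" has no sufficiently close counterpart and is dropped. In either case a database entry stored under the original word could not be reached.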
If the structure of the database is known, e.g., in a music collection, where song titles and artists are known to the system, this data may be prepared for speech recognition. Yet, this process is often very time-consuming and inefficient for large databases. The structure of other databases, such as a database accessed over the Internet, is unknown, and accordingly, their entries cannot be added to the vocabulary of a speech recognition system.
Further problems may arise from multilingual search requests, e.g., when searching for song titles and artists or when searching for web pages. Furthermore, the database may contain orthographic alternatives of a word of the search request, or several alternative pronunciations may exist for a word of a database entry. As a result of these problems, the database entry requested by the user will not be found using a conventional speech recognition system. These problems may also occur in combination. A user may, for example, enter the German word “die” as a spoken search request; yet, the system may only deliver search results for the English word “die”. Thus, conventional speech recognition systems do not enable a user to search a database extensively, particularly if multilingual entries are included in the database.
Accordingly, a need exists to provide an improved method and system for searching a database by speech input.