1. Field of the Invention
The present invention relates to a question answering method, system, and program for answering a question which a user inputs by speech.
2. Description of the Related Art
A document retrieval technique which retrieves and presents a document matching a user's query is in widespread use.
Although document retrieval can satisfy a query such as “Tell me about hospital A”, it may not directly answer a question such as “What are the consultation hours of hospital A?” or “Where is hospital A?”. Document retrieval merely outputs a whole document or a passage in a document, but does not answer a question. A user as a questioner has to find an answer from the output result by himself or herself.
On the other hand, as a system which directly answers questions, a question answering system as described in, e.g., Jpn. Pat. Appln. KOKAI Publication No. 2002-132811 is known.
Demand has arisen for a speech-based question answering system that includes a knowledge source for obtaining answers to questions. A practical example is a question answering system in which a question can be input by speech from a speech input device such as a microphone or cell phone, and an answer to the question can be generated and output by searching a speech database constructed on the basis of, e.g., voice memos stored in a recording device.
Conventional question answering systems exclusively use text databases, and retrieve answers to questions expressed as text. Some question answering systems allow questions to be input by speech via speech input devices, but the databases they search are still text databases. When a question is given by speech, the question speech data is converted into text by a speech recognition system, and an answer is then retrieved. The answer can be output directly as text, or output as speech from a speech synthesizing device.
Present speech recognition techniques cannot convert speech data into text data with 100% accuracy. Therefore, it is highly likely that no right answer can be obtained from the conventional system if the speech recognition result is wrong.
For example, when a question “What are the consultation hours of hospital A?” is input by speech, a speech recognition unit must correctly recognize words such as “hospital A”, “consultation”, and “hours” for the right answer to be retrieved. These words are equivalent to query terms.
A specific proper noun such as “hospital A” is generally not registered in a speech recognition dictionary, so the possibility of this portion being misrecognized is high. On the other hand, general nouns such as “consultation” and “hours” are very likely to be registered in the dictionary, so these nouns can probably be recognized with high accuracy.
If the speech recognition accuracy is high, highly accurate answers can be obtained by retrieval of a text database. However, if the speech recognition accuracy is low, it is preferable to search a speech database instead of the text database, directly using speech feature parameter time-series data obtained from the speech waveform, because an answer may then be obtained even when retrieval of the text database fails to yield one. High retrieval accuracy can be expected especially when a speech database in which speech data input by the user is registered is searched by a question that the same user inputs by speech.
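The choice between the two databases described above can be illustrated with a minimal Python sketch. It is not taken from the specification and rests on several assumptions: the recognizer is assumed to report a confidence score between 0 and 1, the threshold of 0.8 is arbitrary, and dynamic time warping (DTW) is used here only as one well-known way to compare feature parameter time series. The names `choose_database`, `dtw_distance`, and `best_speech_match` are hypothetical.

```python
from typing import Dict, List, Sequence

# One feature vector per analysis frame (hypothetical representation).
Frame = Sequence[float]


def frame_distance(a: Frame, b: Frame) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def dtw_distance(query: List[Frame], stored: List[Frame]) -> float:
    """Dynamic time warping cost between two feature time series.

    DTW is an assumed matching technique, chosen for illustration only.
    """
    inf = float("inf")
    n, m = len(query), len(stored)
    # cost[i][j] is the best alignment cost of query[:i] against stored[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(query[i - 1], stored[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in the query only
                cost[i][j - 1],      # advance in the stored entry only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def choose_database(confidence: float, threshold: float = 0.8) -> str:
    """Pick the text database when recognition looks reliable, else speech.

    Both the confidence score and the threshold are assumptions.
    """
    return "text" if confidence >= threshold else "speech"


def best_speech_match(query: List[Frame], entries: Dict[str, List[Frame]]) -> str:
    """Return the name of the stored speech entry closest to the query."""
    return min(entries, key=lambda name: dtw_distance(query, entries[name]))


if __name__ == "__main__":
    # Toy one-dimensional features standing in for real speech parameters.
    memos = {
        "memo_a": [[0.0], [0.0], [1.0]],
        "memo_b": [[5.0], [5.0], [6.0]],
    }
    question = [[0.0], [1.0]]
    if choose_database(confidence=0.3) == "speech":
        print(best_speech_match(question, memos))
```

Because DTW tolerates differences in speaking rate, the two-frame question in the usage example aligns with the three-frame entry `memo_a` even though their lengths differ.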