In recent years, systems in which a user purchases goods, operates a device, or the like have used a speech (hereinafter “voice”) recognition device controller that recognizes a voice input from the user and acquires the information necessary for purchasing the goods, operating the device, or the like. This kind of voice recognition device controller interacts with the user by recognizing a voice (speech) input from the user and responding to the user (outputting a voice guide) based on the recognized result to prompt the user for the next speech. The voice recognition device controller then acquires the information necessary for the goods purchase, device operation, or the like from the recognized result of the interaction with the user.
If the voice guide or interactive flow in this kind of voice recognition device controller is fixed regardless of the user or the user's situation, an efficient interaction cannot be achieved in some cases. For example, a voice guide tailored to a user who is unfamiliar with speaking to the voice recognition device controller is redundant for a user who is familiar with it. A system whose voice recognition device controller is fixed to this type of voice guide is therefore inconvenient for the user familiar with speech. Accordingly, there has already been proposed a voice recognition device controller that determines a user's learning level in speech and changes the response to the user based on the determination result (refer to, for example, Japanese Patent Laid-Open No. 2000-194386 (hereinafter referred to as Patent Document 1)).
A voice recognition/response system, which is the voice recognition device controller in Patent Document 1, recognizes speech input from a user via a telephone and responds to the user. If the voice recognition/response system is applied to, for example, a telephone-based airline reservation system, it acquires the names of the departure and destination airports, the date and time of departure, and the like from a telephone interaction with the user by voice recognition.
In this case, the voice recognition/response system includes a learning level determination unit for determining a user's learning level in speech and a speech control unit for controlling an interactive flow (the content of a voice guide and the rate of speech of the voice guide) with the user based on the determination result of the learning level determination unit. Here, A is the time period from the start of outputting the voice guide to the start of the user's speech, T is the time period of the user's speech, and N is the number of the user's speech sounds (the number of words spoken by the user). The learning level determination unit determines that the learning level is higher as the time A and the time T become shorter and the number of speech sounds N becomes smaller, and that the learning level is lower as the time A and the time T become longer and the number of speech sounds N becomes greater. The speech control unit then uses the content of the voice guide and the rate of speech determined from this result, giving a brief voice guide at a high speed if the user's learning level is relatively high and a detailed voice guide at a low speed if the user's learning level is relatively low.
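For illustration only, the determination described above may be sketched as follows. Patent Document 1 is described here only qualitatively, so the reference values `A_REF`, `T_REF`, and `N_REF`, the equal weighting of the three terms, the threshold, and all function names are assumptions introduced for this sketch and are not taken from that document.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    a_seconds: float  # A: start of voice guide output to start of user's speech
    t_seconds: float  # T: duration of the user's speech
    n_words: int      # N: number of words spoken by the user


# Hypothetical reference values and threshold (assumptions, not from the source).
A_REF = 5.0
T_REF = 10.0
N_REF = 20.0
HIGH_LEVEL_THRESHOLD = 0.5


def closeness(value: float, ref: float) -> float:
    """Map a measurement to [0, 1]; smaller measurements give larger values."""
    return min(1.0, max(0.0, 1.0 - value / ref))


def learning_level(turn: Turn) -> float:
    """Score in [0, 1] that never increases as A, T, or N grows."""
    return (
        closeness(turn.a_seconds, A_REF)
        + closeness(turn.t_seconds, T_REF)
        + closeness(turn.n_words, N_REF)
    ) / 3.0


def choose_guide(level: float) -> dict:
    """Select voice guide content and rate of speech from the learning level."""
    if level >= HIGH_LEVEL_THRESHOLD:
        return {"content": "brief", "rate": "fast"}
    return {"content": "detailed", "rate": "slow"}


quick_user = Turn(a_seconds=1.0, t_seconds=2.0, n_words=4)
slow_user = Turn(a_seconds=4.0, t_seconds=9.0, n_words=18)
print(choose_guide(learning_level(quick_user)))
print(choose_guide(learning_level(slow_user)))
```

In this sketch the user who starts speaking quickly and speaks briefly receives the brief, fast guide, and the user with longer times and more words receives the detailed, slow guide.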
However, the speech tendency, such as the speech time and the number of speech sounds, also depends upon the user's individual preferences. Therefore, a long speech time and a great number of speech sounds do not necessarily imply a low learning level in speech. For example, even if the rate of speech is low, the user's learning level in speech can be considered high in the case where all of the necessary information is input without fail. Therefore, if the user's learning level in speech is determined based on the speech time or the number of speech sounds, as in the voice recognition/response system disclosed in Patent Document 1, the learning level cannot be properly determined, which may lead to an inefficient interaction in some cases.
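The problem may be illustrated with a small hypothetical example based on the airline reservation case. The limits on speech time and word count, the slot names, and the sample utterance values are all assumptions made up for this sketch. The sketch only shows that a judgment based on speech time and word count can rate a user low even though that user supplied every necessary item in a single utterance.

```python
# Items the airline reservation example needs (slot names are hypothetical).
REQUIRED_SLOTS = ("departure_airport", "destination_airport", "departure_datetime")


def duration_based_level(speech_seconds, word_count, max_seconds=10.0, max_words=20):
    """Hypothetical judgment from speech time and word count alone."""
    if speech_seconds <= max_seconds and word_count <= max_words:
        return "high"
    return "low"


def all_slots_filled(recognized):
    """True if every necessary item was obtained from the utterance."""
    return all(recognized.get(slot) for slot in REQUIRED_SLOTS)


# A user who speaks slowly and at length but omits nothing (made-up values).
utterance = {
    "speech_seconds": 14.0,
    "word_count": 26,
    "recognized": {
        "departure_airport": "Airport A",
        "destination_airport": "Airport B",
        "departure_datetime": "tomorrow 9:00",
    },
}

level = duration_based_level(utterance["speech_seconds"], utterance["word_count"])
complete = all_slots_filled(utterance["recognized"])
print(level, complete)
```

Here the judgment based on time and word count yields a low level, so the user would be given the detailed, slow voice guide despite having input all of the necessary information without fail.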