1. Field of the Invention
The invention relates to a method for automatically responding to an inquiry from a user; the method comprising:
executing a machine-controlled human-machine dialogue to determine a plurality of pre-determined query items specifying information to be verbally presented to the user;
retrieving a plurality of information items from a storage in dependence on the query items;
generating at least one natural language phrase to present the obtained information items according to a presentation scenario; and
verbally presenting the generated phrase(s) to the user.
The invention further relates to a system for automatically responding to an inquiry from a user; the system comprising:
means for executing a machine-controlled human-machine dialogue to determine a plurality of pre-determined query items specifying information to be verbally presented to the user;
means for retrieving a plurality of information items from a storage in dependence on the query items;
means for generating at least one natural language phrase to present the obtained information items according to a presentation scenario; and
means for verbally presenting the generated phrase(s) to the user.
2. Description of the Related Art
Automatic inquiry systems, for instance for obtaining traveling information, increasingly use automatic human-machine dialogues. Typically, a user establishes a connection with the inquiry system using a telephone. In a free-speech dialogue between the machine and the user, the machine tries to establish a number of query items required to obtain the desired information from a storage, such as a database. During the dialogue, the system typically issues an initial question. For a travelling information system, such an initial question might be “From which station to which station would you like to travel?”. During the following dialogue, the system uses speech recognition techniques to extract query items from the user's utterances. For example, an automatic public transport information system needs to obtain the departure place, the destination place and a desired time/date of traveling to perform a query. The system may issue verifying/confirming statements to verify that it correctly recognized a query item. To establish query items for which no text has been recognized yet, the system may issue explicit questions, like “When would you like to leave?”. The question may also be combined with a verifying statement. For example, in a situation where Amsterdam has been recognized as the departure place, but a departure time or specific departure station in Amsterdam is still unknown, the system could ask: “When would you like to leave from Amsterdam Central Station?”. Once all its essential query items have been recognized, the answers to the query are obtained from a storage. This results in a collection of information items, which are to be presented to the user in spoken form. Normally, textual presentations of the information items are inserted in a sequence of preformatted phrase or sentence templates, referred to as a presentation scenario. Usually, the templates are prosodically enriched.
If no suitable speech presentation is available, for instance in the form of sampled phrases/words, the prosodically enriched text may be converted to speech using a speech synthesis technique. The speech output is fed back to the user. Depending on the nature of information items to be presented, a suitable presentation scenario may be chosen from a predetermined collection of scenarios. For instance, a different scenario may be used for any of the following situations:
no suitable connection was found,
one connection was found,
more than one connection was found,
the connection involves no changing over,
the connection involves one change,
the connection involves multiple changes.
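By way of illustration only, the conventional selection of a scenario from the nature of the query result could be sketched as follows. The `Connection` type, the function name and the scenario labels are hypothetical, and treating the six situations as a single decision chain is an assumption made for brevity.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    departure: str
    arrival: str
    changes: int  # number of change-overs in this connection


def result_scenario(connections: list[Connection]) -> str:
    """Pick a scenario label from the shape of the query result alone."""
    if not connections:
        return "no_connection"
    if len(connections) > 1:
        return "multiple_connections"
    # Exactly one connection: distinguish by the number of change-overs.
    changes = connections[0].changes
    if changes == 0:
        return "direct"
    if changes == 1:
        return "one_change"
    return "multiple_changes"
```

Note that such a selection depends only on the retrieved information items and takes no account of what the user said during the dialogue.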
Traditionally, inquiry systems were operated by human operators. The operator performed a dialogue with a user and entered query items into the system. The operator then performed a database query, and the results were displayed on a screen. The operator then read out the requested information to the user. The traditional inquiry systems tended to be self-centered or oriented towards the operator, and lacked orientation towards the end-user. Automatic inquiry systems were built around these systems by adding a dialogue function and a presentation function. This sometimes exposed the lack of user-orientation of the system in the form of rigid or menu-driven dialogue schemes.
From “Dialog in the RAILTEL Telephone-Based System”, Proceedings ICSLP 1996, Vol.1, pp.55-553, it is known that it is desired to take the user's intention into consideration during the human-machine dialogue. This may improve the recognition rate as well as the confirmation process during the dialogue. It may also improve the selection process during the dialogue for systems wherein the user may choose between several sets of information items (e.g. information relating to a train or to a bus; main time table information, information relating to delays, etc.) or may choose between several services, like presenting information or acting as an automated telephone switchboard.
It is an object of the invention to provide a method and system of the kind set forth, which is more user oriented, enhancing the acceptance of an automatically operating inquiry system.
To achieve the object, the method is characterized in that the method comprises, based on utterance(s) of the user, determining an intention of the user from a predetermined set of intentions; the intention reflecting a preferred way of presenting the information items; and selecting the presentation scenario from a predetermined set of presentation scenarios in dependence on the determined intention.
The inventor has realized that in many human-machine dialogues information is present, either explicitly or implicitly, that enables the system to determine or to infer certain intentions of the user that influence the way the information is to be presented most effectively. For instance, in many situations it is possible to derive from the dialogue, which of the information items retrieved during the query, are important to the user. In train travel information inquiries in general, the departure time will usually be of primary interest, allowing the user to determine when he should arrive at the departure station. However, if a user indicates that he wishes to travel from A to B and arrive around 9 A.M., it may be assumed that he is also very much interested in the exact arrival time at the station of destination. At least more so, when compared to users who ask for a connection from A to B that leaves around 9 A.M. Mentioning the arrival time early in the presentation will allow the user to check whether the arrival time of the connection indeed corresponds with his request, and to discard the incoming information if it does not. By choosing a presentation scenario which is tailored to the intention of the user, instead of selecting a scenario solely based on the information obtained as a result of the query, the user will feel more appreciated by the system. Consequently, the acceptance of the system will increase.
In systems which allow the user access to different sets of information or offer several services, the dialogue will normally involve determining in which set of information or in which service the user is interested. Such determining of the intention and acting upon it is not the subject of the invention. The invention relates to determining the intentions of the user that influence the way in which certain information items are presented effectively, and not to determining which set of information items or service the user wants to have access to.
In many cases it is not desired to pose additional questions to the user during the dialogue phase in order to determine intentions of the user that affect the way the information is presented most effectively. This would prolong the dialogue phase, whereas at that moment it may not even be certain that the desired information is available at all. For instance, explicitly asking whether the user is most interested in the arrival time or departure time of a train/bus is probably annoying if the outcome of the query is that no train or bus is available regardless of the arrival/departure time.
In an embodiment as defined in the dependent claim 2, it is determined which of the information items is/are relatively important to the user. The user may explicitly express which item(s) is/are important. For instance, a user may say “I would like to know the arrival time of a train to Y leaving X around 8 P.M.” From this utterance, it can be concluded that the arrival time is of most importance. The user may also implicitly indicate which item is important. For instance, from the utterance “I have a meeting at 10 A.M. in Y. What is the most suitable train leaving from X?”, it may be concluded that the exact arrival time is more important than the departure time. A presentation scenario is chosen which reflects the relative importance of the information items. In a preferred embodiment as defined in the dependent claim 4, more important items are presented relatively early in the presentation. For instance, if the departure time is most important, the output sequence could be “There's a train (leaving X) at 7.15 P.M. It will arrive in Y at 7.55”. If the arrival time is most important, the sentence could be “There's a train arriving in Y at 7.55 P.M. It will leave X at 7.15”. It will be appreciated that the most important item need not be presented as the first information item in absolute terms. With the exception of discourse-initial sentences, in coherent natural language discourse, sentence-initial phrases often present information that is given. They serve to link the upcoming, new information with an information element that has been established in the preceding discourse. Other techniques for emphasizing an item may also be employed, for example certain marked constructs or phrases, possibly in combination with sentence accent. E.g., when a user inquires whether he can get from A to B after 0.30 A.M., the system could answer: “There's only a stopping train leaving at 0.35 AM” to indicate the contrast with the intercity trains that are available earlier during the day.
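As a minimal sketch, and assuming that a presentation scenario can be represented as a sentence template, the two orderings mentioned above could be expressed as follows. The scenario names, the `render` function and the placeholder names are hypothetical.

```python
# Each scenario presents the same information items, but places the item
# assumed to be most important to the user earliest in the output.
SCENARIOS = {
    "departure_first": (
        "There's a train leaving {origin} at {dep}. "
        "It will arrive in {dest} at {arr}."
    ),
    "arrival_first": (
        "There's a train arriving in {dest} at {arr}. "
        "It will leave {origin} at {dep}."
    ),
}


def render(scenario: str, **items: str) -> str:
    """Insert the retrieved information items into the chosen template."""
    return SCENARIOS[scenario].format(**items)


print(render("arrival_first", origin="X", dest="Y", dep="7.15", arr="7.55"))
# prints: There's a train arriving in Y at 7.55. It will leave X at 7.15.
```

A real system would additionally carry prosodic markup in the templates; that aspect is omitted here.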
According to an embodiment as defined in the dependent claim 3, an information item is regarded important if a corresponding query item occurred relatively early in the utterances of the user. For instance if the query item specifying the desired departure-time was recognized relatively early, the corresponding information item with the exact departure time is regarded as important.
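One possible reading of "relatively early" is sketched below, assuming that the recognizer delivers the query items in the order in which they were recognized. The item names, the mapping and the `early_cutoff` parameter are hypothetical choices made for illustration.

```python
# Hypothetical mapping from a query item to the information item it makes important.
QUERY_TO_INFO = {
    "departure_time": "exact_departure_time",
    "arrival_time": "exact_arrival_time",
}


def important_items(recognized_in_order: list[str], early_cutoff: int = 2) -> list[str]:
    """Return information items whose query item was among the first recognized."""
    # dict.fromkeys keeps the first occurrence of each item and preserves order.
    first_mentions = list(dict.fromkeys(recognized_in_order))
    early = first_mentions[:early_cutoff]
    return [QUERY_TO_INFO[q] for q in early if q in QUERY_TO_INFO]
```

With the default cutoff, a user who starts with the desired departure time yields `exact_departure_time`, whereas a user who first names both stations and mentions a time only afterwards yields no important item.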
In an embodiment as defined in the dependent claim 5, the intention is determined from a response of the user to question and/or verifier statements of the machine. The intention may be derived from the way in which, or the moment at which, the user specifies query items. The intention may also be recognized independently from the query items. For instance, the intention may be derived from items recognized in the response but not required for the query.
In an embodiment as defined in the dependent claim 6, the intention of the user is recognized analogously to recognizing the query items. Both the query items and the intentions (also referred to as intention items) are established by recognizing the associated keywords/phrases in the utterance(s) of the user. The recognized query items are used to retrieve the desired information, e.g. by performing a database query. The recognized intention items are used to select the most appropriate presentation scenario. It will be understood that an overlap may exist between the query items and the intention items. This may be particularly the case if a query can be specified in various forms, e.g. for a valid query it may be required to recognize items for the departure station and the destination station, where a time may be specified by either a departure time or an arrival time. Which of the time items has been filled can be used to indicate the intention of the user. As such, the query items can contain information which expresses an intention of the user that is relevant for the way the information is presented most effectively. If two or more intentions have been recognized, the most important one may be selected, for instance on the basis of a fixed priority scheme, where the intentions as reflected by the intention items are ranked with respect to importance.
In an embodiment as defined in the dependent claim 7, a default presentation scenario is selected if no intention has been recognized (i.e. no intention specifier has been filled).
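The keyword-based recognition of intention items, the fixed priority scheme and the fall-back to a default scenario could, purely as an illustration, be combined as follows. The patterns operate on already recognized text, and the intention names, patterns, priority order and scenario labels are all assumptions rather than part of the described system.

```python
import re

# Hypothetical keyword/phrase patterns, one per intention item.
INTENTION_PATTERNS = {
    "arrive_by": re.compile(r"\b(arriv\w*|meeting at)\b", re.IGNORECASE),
    "depart_at": re.compile(r"\b(leav\w*|depart\w*)\b", re.IGNORECASE),
    "write_down": re.compile(r"\b(write|note) (it|this|that) down\b", re.IGNORECASE),
}

# Fixed priority scheme: earlier entries win when several intentions are found.
PRIORITY = ["write_down", "arrive_by", "depart_at"]

SCENARIO_FOR = {
    "write_down": "dictation",
    "arrive_by": "arrival_first",
    "depart_at": "departure_first",
}

DEFAULT_SCENARIO = "departure_first"


def recognize_intentions(utterance: str) -> set[str]:
    """Return every intention item whose pattern occurs in the utterance."""
    return {name for name, pattern in INTENTION_PATTERNS.items() if pattern.search(utterance)}


def select_scenario(utterance: str) -> str:
    """Select a scenario by priority, or the default if no intention was found."""
    found = recognize_intentions(utterance)
    for intention in PRIORITY:
        if intention in found:
            return SCENARIO_FOR[intention]
    return DEFAULT_SCENARIO
```

For the utterance "I have a meeting at 10 A.M. in Y. What is the most suitable train leaving from X?" both `arrive_by` and `depart_at` are found, and the priority list resolves the conflict in favour of `arrival_first`.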
In an embodiment as defined in the dependent claim 8, the user is allowed to barge-in during the presentation. From a barging-in utterance, like “when does it arrive?” it becomes clear that the user is particularly interested in the arrival time. In response, a scenario emphasizing the arrival time may be chosen. The intention may also be derived from the moment of barging-in, possibly in combination with the barging-in utterance. Particularly, since it is difficult to recognize the beginning of a barging-in utterance, the moment of barging-in can help in determining the intention. For instance, from a reply “No, not that” in combination with the fact that just at that moment for the first time a change-over has been presented, it can be concluded that it is the intention of the user not to change over at all or not to receive detailed changing-over information. Similarly, if the moment of barging-in was after presenting a second or third change-over, it can be concluded that the user does not like changing frequently. It will be appreciated that the original presentation, which started at the end of the dialogue, may be based on a default presentation scenario. This may happen, for instance, if no intention or more suitable scenario could be determined from the initial responses of the user. If the user barges in during the presentation, the intention of the user for barging-in is determined. In many cases, a more suitable presentation scenario can be located to overcome any objections which may have been expressed. This may involve withholding some of the information which is available (e.g. only give detailed departure and arrival information and limited or no changing-over information). It will be appreciated that information which has been presented before the moment of barging-in need not, but may be, repeated according to the new scenario.
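A rough sketch of combining the barging-in utterance with the moment of barging-in is given below. It assumes that the presentation component keeps count of the change-overs presented so far; the function name, the substring tests and the scenario labels are hypothetical.

```python
def scenario_after_barge_in(utterance: str, changes_presented: int, current: str) -> str:
    """Choose a new scenario from what was said and how far the presentation had got."""
    text = utterance.lower()
    # An explicit question about the arrival reveals the item of interest.
    if "arrive" in text:
        return "arrival_first"
    # A rejection is interpreted through the moment at which it occurred.
    if text.startswith("no"):
        if changes_presented == 1:
            # The first change-over was just presented: withhold change-over details.
            return "no_change_details"
        if changes_presented >= 2:
            # Several change-overs were presented: prefer connections with fewer changes.
            return "fewer_changes"
    # No usable intention was found: keep the scenario already in use.
    return current
```

Whether the information presented before the interruption is repeated under the new scenario is left open here, as in the description.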
In an embodiment as defined in the dependent claim 9, based on the information items to be presented, a judgement is made whether a presentation scenario might give too complex/long output sentence(s). If so, the user is contacted in order to choose a more appropriate scenario. For instance, if the database query produced a trip with two change-overs, a combined response/question may be given to the user, like “A train connection exists leaving A at 8.00 AM and arriving at B at 10.30 AM. The connection involves a change-over in C and D. Would you like the change-over times?”. In dependence on the response, like “Yes, please”, or “Yes, full details please”, a choice may be made between presenting only the change-over times, presenting only departure times at the intermediate stations, or presenting details of each trajectory in full (departure/arrival station and time).
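The complexity judgement and the subsequent choice could be approximated as in the following sketch. The threshold of one change-over and the mapping from responses to detail levels are assumptions; in particular, the description does not fix which response selects which of the three levels.

```python
def needs_follow_up(change_overs: int, max_change_overs: int = 1) -> bool:
    """Judge whether presenting every detail at once would be too long or complex."""
    return change_overs > max_change_overs


def detail_level(response: str) -> str:
    """Map the user's answer to the follow-up question onto a level of detail."""
    text = response.lower()
    if "full" in text:
        return "full_details"
    if text.startswith("yes"):
        return "change_over_times"
    # Any other answer: keep to the summary that was already given.
    return "summary_only"
```

For the trip with two change-overs used as an example above, `needs_follow_up(2)` holds, so the system would first give the summary and then ask before presenting more.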
In an embodiment as defined in the dependent claim 10, a distinction is made between the intention of the user to write down or not to write down the information which is presented. Particularly when the user intends to write the output down, it is preferred to present the information in small sentences/phrases, at a relatively low pace and with sufficient time in between the sentences/phrases. Particularly, if the information to be presented is of a repetitive nature, it is preferred to use phrases/sentences which allow a user to easily write the information down in table form. For example, if a trip involves changing-over, the trajectories are preferably presented in a short form, which is similar for each trajectory, like “The train leaves A at 8.00 P.M., and arrives in B at 8.30. The connecting train leaves B at 8.45 and arrives in C at 9.30”. For certain queries it can safely be assumed that the user wants to write down the information, for example if the user inquires after a journey which lies relatively far in the future (e.g. two weeks or more) or after an international journey.
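The two ideas in this embodiment, assuming a writing-down intention for certain queries and presenting the trajectories in a uniform short form, might be sketched as follows. The function names, the tuple layout of a trajectory and the 14-day horizon (taken from the "two weeks or more" example) are illustrative assumptions; pace and pauses between phrases are not modelled.

```python
from datetime import date, timedelta


def likely_to_write_down(travel_date: date, today: date,
                         international: bool = False, horizon_days: int = 14) -> bool:
    """Assume the user will take notes for far-ahead or international journeys."""
    return international or (travel_date - today) >= timedelta(days=horizon_days)


def dictation_phrases(legs: list[tuple[str, str, str, str]]) -> list[str]:
    """Render each trajectory (origin, departure, destination, arrival) identically."""
    phrases = []
    for index, (origin, dep, dest, arr) in enumerate(legs):
        subject = "The train" if index == 0 else "The connecting train"
        phrases.append(f"{subject} leaves {origin} at {dep} and arrives in {dest} at {arr}.")
    return phrases
```

Returning one phrase per trajectory allows the output component to insert a pause after each phrase, which suits a user who is filling in a table.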
In an embodiment as defined in the dependent claim 11, it is preferred for a travelling information system to distinguish between the intention of the user to arrive at a certain time, to depart at a certain time, or to have ample time for changing-over.
To meet the object of the invention, the system is characterized in that the system comprises means for determining an intention of the user from a predetermined set of intentions based on utterance(s) of the user; the intention reflecting a preferred way of presenting the information items; and means for selecting the presentation scenario from a predetermined set of presentation scenarios in dependence on the determined intention.