Spoken language is the most natural and convenient communication tool for people, yet more and more tasks are done using some type of machine or computing device. Dialog systems are systems in which a person speaks to a computer or machine in order to accomplish a result. With the rise of microprocessor-controlled appliances and equipment, dialog systems are increasingly used to facilitate the man-machine interface in many applications such as computers, automobiles, industrial machinery, home appliances, automated telephone services, and so on. Natural language dialog systems allow a user to enter a request or query in an expression or manner familiar to the user, without requiring the user to learn a particular language (or structure) dictated by the computer. Dialog systems process the query and access one or more databases to retrieve responses to the query or cause the computer or machine to perform a particular task.
In many spoken language interface applications, proper names, such as address names, city names, point of interest (POI) names, song names, person names, company names, file names, and so on, are widely used. With data storage capacities in such systems increasing rapidly, people tend to put more and more names into their storage in the form of databases. Accessing the data with spoken language offers people convenience and efficiency if the spoken interface is reliable. It is often the case that the number of proper names used in these applications is very large, and can include many foreign or hard-to-recognize names, such as street names in a navigation domain, or restaurant names in a restaurant selection domain. Moreover, in high-stress environments, such as driving a car or operating machinery, as well as simply for convenience, people tend to use shorthand terms, such as partial proper names and their slight variations. Even in normal conversational contexts, once the identity of an object has been established, later references to that object are typically based on partial names or descriptors, rather than the full name. However, the use of partial names, nicknames, and so on to refer to named objects and people, i.e., any named entity, generally limits the accuracy or effectiveness of natural language processing systems.
In general, present recognition methods on large name lists focus strictly on the static aspect of the names. Recognition on large name lists is a very challenging problem, especially for speech recognition, because of the confusability among the names in such lists. To add to this problem, in typical speech patterns found in both stressful and non-stressful contexts, name usage for an object often changes through the course of a dialog. In this case, present recognition methods do not adequately recognize different name references to the same object, or they must re-learn different versions of the same name. What is needed, therefore, is a name model generation and name recognition process for spoken interface applications that improves the speech recognition accuracy of names and partial names. What is further needed is a name recognition system that accommodates the dynamic nature of name references in normal speech patterns. Such a generated name model is expected to yield better accuracy for large name lists in speech recognition, language understanding, and other components in dialog systems, because its weight values put the focus on the names that are more likely to be discussed in the course of a dialog.