Telematics systems bring human-computer interfaces to vehicular environments. Conventional computer interfaces use some combination of keyboards, keypads, point-and-click techniques and touch screen displays. These conventional interface techniques are generally not suitable for a vehicular environment, where interaction must be fast and any distraction of the driver is inherently dangerous. Therefore, speech interfaces are being adopted in many telematics applications.
However, creating a natural language speech interface that is suitable for use in the vehicular environment has proved difficult. A general-purpose telematics system must accommodate commands and queries from a wide range of domains and from many users with diverse preferences and needs. Further, multiple vehicle occupants may want to use such systems, often simultaneously. Finally, most vehicle environments are relatively noisy, making accurate speech recognition inherently difficult.
Retrieval of both local and network-hosted online information by a human user, and the processing of that user's commands in a natural manner, remains a difficult problem in any environment, and especially onboard vehicles. Cognitive research on human interaction shows that a person asking a question or giving a command typically relies heavily on context and on the domain knowledge of the person answering. Machine-based queries of documents and databases and machine execution of commands, on the other hand, must be highly structured and are not inherently natural to the human user. Thus, human questions and commands and machine processing of queries are fundamentally incompatible. Yet the ability to allow a person to make natural language speech-based queries remains a desirable goal.
Much work covering multiple methods has been done in the fields of natural language processing and speech recognition. Speech recognition has steadily improved in accuracy and today is successfully used in a wide range of applications. Natural language processing has previously been applied to the parsing of speech queries. Yet, no system developed provides a complete environment for users to make natural language speech queries or commands and receive natural sounding responses in a vehicular environment. There remain a number of significant barriers to creation of a complete natural language speech-based query and response environment.
The fact that most natural language queries and commands are incompletely defined is a significant barrier to natural human query-response interaction. Further, some questions can only be interpreted in the context of previous questions, knowledge of the domain, or the user's history of interests and preferences. Thus, some natural language questions and commands may not be easily transformed to machine-processable form. Compounding this problem, many natural language questions may be ambiguous or subjective. In these cases, forming a machine-processable query and returning a natural language response is difficult at best.
Even once a question is asked, parsed, and interpreted, machine-processable queries and commands must be formulated. Depending on the nature of the question, there may not be a simple set of queries that returns an adequate response. Several queries may need to be initiated, and even these queries may need to be chained or concatenated to achieve a complete result. Further, no single available source may include the entire set of results required. Thus multiple queries, perhaps with several parts, need to be made to multiple data sources, which can be either local or on a network. Not all of these sources and queries will return useful results, or any results at all. In a mobile or vehicular environment, the use of wireless communications increases the chances that queries will not complete or will not return useful results. Useful results that are returned are often embedded in other information, from which they may need to be extracted. For example, a few key words or numbers often need to be “scraped” from a larger amount of other information in a text string, table, list, page or other source. At the same time, extraneous information such as graphics or pictures needs to be removed so that the response can be rendered in speech. In any case, the multiple results must be evaluated and combined to form the best possible answer, even in the case where some queries do not return useful results or fail entirely. In cases where the question is ambiguous or the result inherently subjective, determining the best result to present is a complex process. Finally, to maintain a natural interaction, responses need to be returned rapidly to the user. Managing and evaluating complex and uncertain queries while maintaining real-time performance is a significant challenge.
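By way of illustration only, the difficulty described above can be sketched in a few lines of Python. The sketch is not drawn from any existing system. The source names, trust weights, timeout value, sample text, and the helper functions extract_high_temp, query_one and best_answer are all hypothetical, and the data sources are simulated rather than real. It issues one query to several local and network sources in parallel, treats a slow or dropped wireless link as an empty result, strips markup such as image tags, scrapes a single number from the surrounding text, and selects the answer from the most trusted source that responded in time.

```python
import asyncio
import re
from typing import Awaitable, Callable, List, Optional, Tuple

# Hypothetical source description: (name, trust weight, coroutine returning raw text).
Source = Tuple[str, float, Callable[[], Awaitable[str]]]


def extract_high_temp(raw: str) -> Optional[int]:
    """Remove markup (e.g. image tags), then scrape the one number needed."""
    text = re.sub(r"<[^>]+>", " ", raw)
    match = re.search(r"high\s+(?:of\s+)?(-?\d+)", text, re.IGNORECASE)
    return int(match.group(1)) if match else None


async def query_one(source: Source, timeout_s: float) -> Optional[Tuple[float, int]]:
    """Query a single source; a timeout or lost link counts as no result."""
    _name, weight, fetch = source
    try:
        raw = await asyncio.wait_for(fetch(), timeout=timeout_s)
    except (asyncio.TimeoutError, ConnectionError):
        return None
    value = extract_high_temp(raw)
    return None if value is None else (weight, value)


async def best_answer(sources: List[Source], timeout_s: float = 0.2) -> Optional[int]:
    """Run all queries concurrently and keep the most trusted usable result."""
    results = await asyncio.gather(*(query_one(s, timeout_s) for s in sources))
    usable = [r for r in results if r is not None]
    if not usable:
        return None
    return max(usable, key=lambda r: r[0])[1]


# Simulated sources (assumptions for illustration, not real services).
async def local_cache() -> str:
    return "Cached forecast: high 52, low 40."


async def web_page() -> str:
    await asyncio.sleep(0.01)
    return '<div><img src="sun.png"><p>Today: High of 54, low 41</p></div>'


async def slow_feed() -> str:
    await asyncio.sleep(5)  # exceeds the timeout, so this result is discarded
    return "high 99"


async def dropped_link() -> str:
    raise ConnectionError("wireless link lost")


SOURCES: List[Source] = [
    ("cache", 0.5, local_cache),
    ("web", 0.9, web_page),
    ("feed", 1.0, slow_feed),
    ("link", 1.0, dropped_link),
]

if __name__ == "__main__":
    print(asyncio.run(best_answer(SOURCES)))
```

Selecting the single highest-weighted result is the simplest possible combination rule and is used here only to keep the sketch short. It does not address ambiguous or subjective questions, for which, as noted above, determining the best result is considerably more complex.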
These and other drawbacks exist in existing systems.