A primary objective of the present invention is to provide call center enterprises with highly effective automation to reduce costs without sacrificing the quality of service for the customer. Interactive automation should be a preferred means of interaction for the customer, or motorist, to achieve tasks that could otherwise be handled through human/agent interaction through a call center. In the present invention, a service oriented architecture (SOA) is utilized to selectively leverage specialized speech recognizers in a uniquely adaptive fashion. The benefits of such an approach are to provide a safe and enjoyable user interface and to improve a call center's efficiency, as described herein.
The advent of telematics services, which were introduced over a decade ago, brought with it a trend to incorporate the ability of a vehicle to communicate with remote data centers and transmit location data and vehicle information related to safety, security, and emergency breakdown. “Telematics,” as it is referred to in the art, includes the integration of wireless communications, vehicle monitoring systems and location devices. Such technologies in automotive communications combine wireless voice and data capability for management of information and safety applications.
Most of the early telematics communication was achieved through wireless voice channels that were analog in nature. By law in 2008, all analog connectivity became digital and, consequently, data connectivity, such as “3G” technology, became a readily available means for mobile devices to “connect” to the Internet. As a result of these advances, the vehicle is also being adapted to leverage data connectivity in combination with voice channel connectivity in what is referred to as the “connected car” concept.
The “connected car” concept has continued to evolve over the past few years and commercial launches of rather sophisticated vehicle services are becoming a reality. These services often rely on vehicle location and on “cloud computing,” defined as web services accessed over a data channel. Examples of these services include off-board routing, destination capture, remote-vehicle diagnostics, music downloads, traffic reporting, local searches, access to concierge services, connecting to a vehicle dealer, and roadside assistance. The term “off-board” as used herein refers to a location away from and outside the vehicle. The term “local search” as used herein refers to a point-of-interest (POI) search based on proximity to a specific location. The examples given above are regarded as being vehicle-centric in nature and many invoke some form of vocal communication with a live agent or an off-board interactive automation system.
Recently, a trend has emerged whereby motorists operate personal devices, such as mobile devices, while in a vehicle in a way that is unsafe while driving. Built-in user interfaces are now being added to the inside of vehicles to provide these mobile functionalities as a component of the vehicle itself. However, a number of concerns about the safety and practicality of these built-in components still exist. It is difficult to enable personal device functionality in a vehicle in a way that is safe while driving. The user interfaces are not at all practical for a vehicle driver to use while driving. Not only are the screens of the devices rather small, but, more significantly, the primary input modalities to operate and use a typical mobile device include some form of typing or mechanical interaction by the user with the device. Driver distraction can occur when a driver's cognitive processing is allocated to any task that is not focused on driving a vehicle safely. Making phone calls and entering data into mobile devices are examples of tasks that can be highly distracting while driving. Conventional typing while driving is extremely dangerous because both vision and touch are involved, making it impractical to drive safely. For example, while driving a car, it does not make sense to type a message by twisting and nudging a knob until each target letter is highlighted, followed by a push of the knob (“knobbing”). However, even though it is a very awkward experience, there are cases for which “knobbing” is the only way to enter a destination into a vehicle navigation system. To reduce safety problems, some existing built-in systems purposefully limit use of the interface to times when the vehicle is stationary. Unfortunately, this stationary requirement compromises the range of capabilities that may be possible with in-vehicle systems.
Accordingly, it would be beneficial to use effective speech interfaces that limit, or completely eliminate, the need for the motorist to use his or her hands to operate the interface. In addition to navigation and dialing of telephone numbers, other applications such as browsing and texting could also benefit from using speech-enabled typing. Thus, speech recognition can play a critical role in enabling personal device functionality inside a vehicle. As a result, effective multi-modal interfaces are needed that are simple and safe to use under driving conditions.
Still, implementing speech-enabled functionalities in an environment inside a vehicle presents a unique and difficult challenge. For example, the microphone must be hands-free and, therefore, may be at a distance from the speaker's mouth. Also, road noise can be harsh and non-stationary. Furthermore, there may be multiple people inside of the vehicle who are also talking, thereby making it difficult for the system to decipher the speech of one person among several different voices. Because the vehicle presents such a difficult speech recognition environment, a considerable amount of speech recognition optimization is required to achieve reasonable speech recognition performance.
A need exists to overcome the problems with the prior art as discussed above. In essence, what is needed is a speech recognition engine that is capable of complex speech tasks in a harsh environment. In addition, it would be beneficial to provide a practical system and method for an enterprise to design its speech-enabled applications, host the applications, and maintain the applications without the need for in-house expertise to support advanced speech recognition.
Furthermore, effective multi-modal interfaces are needed that are simple and safe to use under driving conditions. Unless effective speech interfaces are available, enabling personal device functionality in the vehicle will not be safe while driving. Accordingly, it would be beneficial to provide a human-to-machine, in-vehicle interface that enables safely completing a text input task while driving a vehicle.