Speech recognition systems utilize a machine to identify words or phrases in spoken language and convert them into machine-readable text or instructions. Early speech recognition applications included simple tasks such as voice dialing (e.g., “call home” for a phone), simple data entry (e.g., entering a credit card number or account number audibly), and speech-to-text processing (e.g., word processors or emails). As speech-to-text recognition systems have become more advanced, the applications amenable to these systems have also advanced. For example, U.S. Pat. No. 9,318,108 B2 to Gruber et al. is directed to an intelligent automated assistant system that engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. Further systems have been developed whereby a device with speech recognition capabilities can control the functions of a variety of secondary devices around the home or at a commercial enterprise. For example, U.S. Pat. No. 9,698,999 B2 to Mutagi teaches systems and techniques for controlling a secondary device by natural language input using a primary speech-responsive device. The secondary device can be a device not traditionally considered “smart”, such as a desk lamp, which can be turned on and off by natural language inputs to the primary speech-responsive device.
Many businesses, including banks, retail stores and restaurants, rely on verbal orders from customers. Slow, inaccurate or inefficient capture of verbal orders can frustrate customers and lead to lower sales. This is especially true in fast food restaurants. Fast food restaurants, or quick serve restaurants, are restaurants that specialize in food that can be prepared and served quickly. While many of these types of restaurants have placed an increased focus on the quality of the food served, a principal focus remains serving the customer quickly and accurately for the convenience of the customer. The process begins when the customer engages the restaurant's order processing system, be that an employee taking an order or, more recently, a touch screen system or other touch-activated physical interface with which the customer interacts to make order selections. Streamlining the order processing system can dramatically enhance the speed of the overall process and enhance customer satisfaction, while driving down labor-associated expenditures.
Depending on the level of customer traffic, a delay can often result when the restaurant employees are busy fulfilling other service tasks, such as collecting payment and delivering food. This delay can be significantly frustrating to customers wishing to place an order. In addition, restaurant employees can devote significant amounts of time to receiving orders, which can limit their productivity in other areas of their job function. Moreover, the intense time demands on restaurant workers can lead to less pleasant interactions with customers, which can critically impact the first impression that is created with the ordering process. The time constraints may also lead to missed opportunities for additional sales, such as through the recommendation of complementary or new products that are available for purchase.
Attempts have been made to develop speech-based natural language ordering systems. These systems have numerous limitations that have reduced their acceptance by customers. For example, these systems have had a limited vocabulary and, unlike a human listener, are poor at recognizing words spoken at different speeds or with accents. They are also poor at recognizing tone, such as tone that could indicate growing frustration with the process and a need for human intervention. Moreover, these systems often fail to make key associations between products ordered and miss the opportunity to sell or ‘up-sell’ additional items. This can lead to an unwillingness to adopt a system due to concerns over lost revenue opportunities. The present invention overcomes these shortcomings, as will become apparent in the following description of the artificially-intelligent, natural language order processing system as taught herein.