It is extremely costly and time-consuming to build spoken dialog systems for the same task in multiple languages, due to the target-language expertise and the data required to build such applications. In addition, user contact centers for large corporations are often distributed across several geographic locations to handle users who speak different languages, and a contact center in any single country typically is not able to handle many different languages. It is, however, very costly to maintain all of these contact centers, which provide essentially the same service in different languages and in different countries.
Current state-of-the-art spoken dialog systems operate along the following path. A user calling a help desk in the United States typically will first enter a spoken dialog system (human-machine dialog) in English. (In some cases there may be support for Spanish, but the user cannot talk to the machine in Chinese, Turkish, etc., in the U.S.) If the user chooses to talk to a human agent (human-human dialog) in the middle of the human-machine dialog, the user will talk to an English-speaking agent. If a user calls the same company's help desk in France, the user will reach a spoken dialog system built in French and, upon deciding to talk to a human agent at any point in the dialog, will speak to an agent who speaks French.
This process means that a separate dialog system is developed for each language for the same task, and that a separate user contact center is maintained for each language/country. There is a huge cost associated with building the same spoken dialog system (human-machine dialog) for each language and keeping separate contact centers for each country/language.