The processing power of modern computers has increased tremendously over the last thirty years. This has made it practical to implement and commercially market systems for a number of applications which, up to now, were merely experimental, ran only on very large computers, and were nevertheless unable to provide acceptable efficiency.
For instance, machines are already known that translate text in one language (the source) into text in another language (the target). Such machines enable operators who speak different languages to communicate with each other without having to go through the difficulty of studying foreign languages. Such a system has been described in European patent application 525470, "Method and System for Natural Language Translation", by Peter F. Brown et al., published on Feb. 3, 1993. Another example deals with translating source information in spoken form into target information in a printable, displayable, or otherwise visually representable form.
The above are but a few of a number of applications that enable not only easier communication between human beings, but also man-to-machine communication, and vice versa. For instance, a system for performing direct dictation and translating speech into text at an affordable price would be appreciated by a large public. The above applications involve automatic speech recognition and machine translation.
The initial approaches to both speech recognition and machine translation relied on "handwritten" rules. For instance, most speech recognition systems were built around a set of rules of syntax, semantics and acoustic phonetics. One had to discover a set of linguistic rules that could account for the vast complexity of language, and then construct a coherent framework in which these rules could be assembled to recognize speech. This approach ran into insurmountable problems: writing down by hand a set of rules covering the vast scope of natural language, and constructing by hand the appropriate priorities, weighting factors and logical conditions for performing the various selections leading to the target.
The approach fortunately switched to "statistical" techniques, whereby rules are extracted automatically from large databases of speech or text, and different types of linguistic operations are combined via probability theory. Basically, statistical approaches exploit the fact that not all word sequences occur naturally with equal probability. Probabilistic models may then be constructed in advance and used later on, under normal operating conditions of the system.
These models have been shown to be useful for a number of different tasks, such as language modeling to predict the next word from the previous ones, grammatical modeling to predict the next part of speech from the previous parts, spell-to-sound rules to predict how a letter is pronounced depending on the context in which it appears, allophone selection to predict how a phone is pronounced depending on the context in which it appears, and morphological analysis to predict the part of speech from the spelling of a word.
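By way of illustration only, language modeling of the kind mentioned above may be sketched as a bigram model, in which the probability of the next word is estimated from counts of adjacent word pairs in a training text. The sketch below is not taken from the cited application; the function names `train_bigram_model` and `predict_next`, the `<s>` sentence-start marker, and the three-sentence corpus are hypothetical and chosen purely for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Estimate P(next word | previous word) from raw pair counts.

    Hypothetical helper for illustration: plain maximum-likelihood
    estimates, with no smoothing for unseen word pairs.
    """
    pair_counts = defaultdict(Counter)
    for sentence in sentences:
        # "<s>" is an assumed marker for the start of a sentence.
        words = ["<s>"] + sentence.lower().split()
        for previous, current in zip(words, words[1:]):
            pair_counts[previous][current] += 1

    model = {}
    for previous, followers in pair_counts.items():
        total = sum(followers.values())
        # Normalize counts so the probabilities for each context sum to 1.
        model[previous] = {word: count / total for word, count in followers.items()}
    return model


def predict_next(model, previous_word):
    """Return the most probable next word, or None for an unseen context."""
    followers = model.get(previous_word.lower())
    if not followers:
        return None
    return max(followers, key=followers.get)


# Toy training text (assumed data, far smaller than any real corpus).
corpus = [
    "the court granted the motion",
    "the court denied the appeal",
    "the patient received the treatment",
]
model = train_bigram_model(corpus)
print(predict_next(model, "the"))
```

In this toy corpus "court" follows "the" in two of six occurrences, so it is the predicted word. A word that never appeared in the training text yields no prediction at all, which gives a small-scale view of why a model built from one kind of text transfers poorly to another.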
Obviously, the problems are still complex, even if they now arise from a more rational consideration and should therefore lead to a more implementable system. This complexity naturally affects the final cost of the system.
In addition, one may easily understand from the above that the probabilistic approach to the problem is essentially context-dependent. In other words, building a universal system is not feasible. A system made for business or commercial applications, for instance, would not be applicable to legal matters or to the medical field.
If one needs to switch from one field of application to another for any reason, including the need to resell the system or to expand its capabilities, converting the system at reasonable cost would not be feasible unless the architecture of the system was designed to accommodate the above requirements and goals.