1. Field of the Invention
Embodiments of the invention generally relate to automated recognition of the meanings of natural-language sentences and language translation.
2. Description of the Related Art
The acquired ability to understand, speak, and write one or more languages is an integral part of human development, enabling a person to interact and communicate within a society. Various language analysis approaches have been used to dissect a given language and analyze its linguistic structure in order to understand the meaning of a word or a sentence in the given language, extract information from the word or the sentence, and, if necessary, translate it into another language.
Prior language analysis systems with a semantic component usually are created for a very restricted area of application, for example, medical diagnostics or ticket sales/reservation. In these analysis systems, only simple sentence patterns with restricted syntax and semantics are used. In addition, syntactic descriptions in general are not linked with the semantic descriptions. Other machine translation systems, both rule-based and statistics-based, concentrate on proper transfer of language information and usually make no use of any full-fledged intermediary data structures which explicate the meaning of the sentence being translated.
Certain theoretical concepts, such as the Parallel Correspondence Model, propose uniting and linking syntactic information with semantic information. The most developed of these theoretical concepts are Generalized Phrase Structure Grammar (GPSG), Head-Driven Phrase Structure Grammar (HPSG), and Lexical Functional Grammar (LFG). However, most of them have not been put into usable algorithms for language analysis.
As a result, even though various models have been proposed, most of them perform poorly in experiments on analyzing complete sentences and do not have any noteworthy industrial application. In addition, complex sentences are often very long and contain various punctuation marks and symbols, such that prior art parsers, language analysis programs, or machine translation systems often have difficulty returning a complete parse or translation for sentences beyond a certain level of complexity. This is especially true for complex texts, such as those found in technical texts, documentation, internet articles, journals, and the like.
Further, the decision to remove ambiguous results or to defer such actions during different stages of the language analysis and/or machine translation often complicates the analysis and translation itself, leading to a very low percentage of successful cases. Attempts to successfully analyze a sentence in one language and synthesize it into another language all have the drawbacks of being very time-consuming and/or compatible with or applicable to only specific languages.
Thus, there exists a need to analyze a sentence of a given language and construct a language-independent structure/description so as to understand the meaning of the sentence and/or translate it into another language.