The present invention relates to mechanical translation between various languages and, more specifically, to a digital language semantic database for machines.
Currently, mechanical language translation is based on a character database (GB2312) and a word database (GB1375). Dictionaries are the main source of the contents of a word database. The phonetic sound codes, visual form codes and sound-form codes used during encoding are all characteristics of dictionaries. Dictionaries have always served humans for the purposes of reference, judgment and selective learning. When mechanical language translation is based on dictionaries, machines are therefore required to think and judge as humans do. Such requirements are currently addressed by setting up various technical parameters, identification modules and vector modules. However, language is a complicated discipline, one that experts describe as impossible to put fully in order, and it is very difficult to solve all of its problems by technical means such as semantic meaning trees, real parameter pruning and virtual parameter pruning. As a result, the quality of mechanically translated text remains poor.
Owing to the limited intelligence of machines, the main purpose of processing natural language signals is to enable machines to read and understand the natural languages of humans, in other words, to enable machines to simulate the language mechanisms of humans. At the present stage, it is unrealistic to expect machines to be as intelligent as humans.
The common technology at present comprises extracting a source text, segmenting it by comparison against a word database, spreading each identified word according to a word-formation semantic unit denotation database (tree) for semantic analysis and pruning, and finally selecting the semantic meaning so determined. This technology is called semantic translation. The description of patent application number 200310011433.X is quoted herein as follows: “Extract a sentence from the source text: analyze the sentence by using a semantic unit denotation database (tree) to obtain the semantic expression of the sentence, spread the semantic expression of the sentence according to the semantic unit denotation database using the expression of the target language and then output the spread sentence as the translated text.” (Lines 20 to 23 on page 1 of the description). The above description discloses the common method currently adopted in all kinds of language translation.
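By way of illustration only, the segment, spread, prune and select sequence described above may be sketched as follows. The word database, the sense labels, the cue words and the function names are all hypothetical and are not taken from the cited application. Longest-match lookup stands in for segmentation against a word database, and a simple cue-word overlap stands in for pruning. English tokens separated by spaces are assumed for readability.

```python
# Hypothetical word database: each entry maps a word to its candidate senses,
# and each sense to the context cue words that support it (illustrative only).
WORD_DB = {
    "the": {"DEFINITE_ARTICLE": set()},
    "river": {"RIVER": set()},
    "bank": {
        "FINANCIAL_INSTITUTION": {"money", "loan"},
        "RIVER_EDGE": {"river", "water"},
    },
    "river bank": {"RIVER_EDGE": set()},
}
MAX_WORD_LEN = 2  # longest entry in WORD_DB, counted in tokens


def segment(tokens):
    """Forward maximum matching: at each position keep the longest database entry."""
    words, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_WORD_LEN, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            # A single token is always accepted so that unknown words pass through.
            if candidate in WORD_DB or n == 1:
                words.append(candidate)
                i += n
                break
    return words


def select_sense(word, context):
    """Spread a word into its candidate senses, prune unsupported ones, pick one."""
    senses = WORD_DB.get(word)
    if not senses:
        return None  # word not in the database
    supported = [sense for sense, cues in senses.items() if cues & context]
    # If pruning removes every candidate, fall back to the first listed sense.
    return supported[0] if supported else next(iter(senses))


def analyze(sentence):
    """Return (word, selected sense) pairs for one sentence."""
    tokens = sentence.lower().split()
    context = set(tokens)
    return [(word, select_sense(word, context - {word})) for word in segment(tokens)]


if __name__ == "__main__":
    print(analyze("the bank approved the loan"))
    print(analyze("the river bank"))
```

In this sketch, "bank" resolves to FINANCIAL_INSTITUTION when "loan" appears in the sentence, while "river bank" is matched as a single entry. Every such decision depends on parameters set by hand in the database, which is the dependence on human-like judgment noted above.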
It is known that a language is formed of words and that a word is constituted by phonetic sound, visual form and semantic meaning. Different languages are characterized by their own phonetic sounds and visual forms, while semantic meanings are shared in common among languages. It is only because of semantic meanings that intercommunication between different languages is possible. If semantic meanings are stored in a machine, any language can be completely formed by integrating the semantic meanings with that language's own phonetic sounds and visual forms.
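The idea of storing shared semantic meanings and attaching each language's own sound and form may be pictured with a minimal sketch. The semantic codes (S001, S002), the two-language lexicon and the function names are assumptions made purely for illustration and do not represent the actual structure of the database of the invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Surface:
    form: str   # visual form (spelling or characters)
    sound: str  # phonetic sound (illustrative transcription)


# Hypothetical language-neutral semantic codes, each paired with the
# language-specific visual form and phonetic sound.
LEXICONS = {
    "en": {"S001": Surface("water", "ˈwɔːtə"), "S002": Surface("book", "bʊk")},
    "zh": {"S001": Surface("水", "shuǐ"), "S002": Surface("书", "shū")},
}


def lookup_code(form, lang):
    """Find the semantic code behind a visual form in the given language."""
    for code, surface in LEXICONS[lang].items():
        if surface.form == form:
            return code
    return None


def transfer(form, src, dst):
    """Carry a word across languages through its shared semantic code."""
    code = lookup_code(form, src)
    return None if code is None else LEXICONS[dst][code].form


if __name__ == "__main__":
    print(transfer("water", "en", "zh"))  # 水
    print(transfer("书", "zh", "en"))      # book
```

Only the semantic code crosses the language boundary in this sketch. The sound and form are supplied entirely by the target language's own entries.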
Even though semantic meanings are freely interchangeable between different languages, semantic meanings alone are far from sufficient for the purpose of translation, because the language habits of each language must also be accommodated. Accordingly, the syntactic relationships have to be adjusted between languages. To establish syntactic relationships, the part-of-speech characteristics, the semantic characteristics and the language context of each word are required. In their absence, syntactic relationships cannot be established.
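A small, deliberately simplified sketch shows why word-for-word exchange of meanings is insufficient. It uses only part-of-speech information to adjust the order of an adjective and a noun between English and Spanish. The semantic codes, tables and function names are hypothetical, and semantic characteristics, language context and agreement are left out entirely.

```python
# Hypothetical part of speech attached to each shared semantic code.
POS = {"S_THE": "DET", "S_RED": "ADJ", "S_CAR": "NOUN"}

# Hypothetical visual forms of each semantic code per language.
FORMS = {
    "en": {"S_THE": "the", "S_RED": "red", "S_CAR": "car"},
    "es": {"S_THE": "el", "S_RED": "rojo", "S_CAR": "coche"},
}

# Simplifying assumption: whether an attributive adjective follows its noun.
ADJ_AFTER_NOUN = {"en": False, "es": True}


def reorder(codes, src, dst):
    """Swap adjacent adjective and noun when the two languages order them differently."""
    out = list(codes)
    if ADJ_AFTER_NOUN[src] == ADJ_AFTER_NOUN[dst]:
        return out
    i = 0
    while i < len(out) - 1:
        pair = (POS[out[i]], POS[out[i + 1]])
        if pair in (("ADJ", "NOUN"), ("NOUN", "ADJ")):
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair just swapped
        else:
            i += 1
    return out


def translate(words, src, dst):
    """Map words to semantic codes, adjust their order, then render in the target language."""
    to_code = {form: code for code, form in FORMS[src].items()}
    codes = [to_code[word] for word in words]
    return [FORMS[dst][code] for code in reorder(codes, src, dst)]


if __name__ == "__main__":
    print(translate(["the", "red", "car"], "en", "es"))
    print(translate(["el", "coche", "rojo"], "es", "en"))
```

Without the part-of-speech table, the same semantic codes would be rendered in the source order, giving "el rojo coche" rather than "el coche rojo".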
In view of the above, the present invention provides an integrated solution.