In the context of the present disclosure, the term “neural network” designates a computer-implemented, artificial neural network. An overview of the theory, types and implementation details of neural networks is given e.g. in Bishop C. M., “Neural Networks for Pattern Recognition”, Oxford University Press, New York, 1995/2010; or Rey, G. D., Wender K. F., “Neuronale Netze”, 2nd edition, Hans Huber, Hogrefe AG, Bern, 2011.
The present invention particularly deals with the semantic processing of text by neural networks, i.e. analyzing the meaning of a text by focusing on the relation between its words and what they stand for in the real world and in their context. In the following, “words” (tokens) of a text comprise both words in the usual sense of language and any other units of a language which can be combined to form a text, such as symbols and signs. From these words, a set of all-too-ubiquitous words with little semantic relevance, such as “the”, “he”, “at” etc., is disregarded; the words that remain are called the “keywords” of the text.
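The reduction of a text to its keywords can be illustrated by a minimal sketch. The stop-word list, the function name extract_keywords and the tokenizing pattern are assumptions made for illustration only; the disclosure names just “the”, “he” and “at” as examples of disregarded words and does not prescribe a tokenizer.

```python
import re

# Hypothetical stop-word list; only "the", "he" and "at" are named in the text,
# the remaining entries are illustrative assumptions.
STOP_WORDS = {"the", "he", "at", "a", "an", "of", "and", "in", "to", "is"}


def extract_keywords(text, stop_words=STOP_WORDS):
    """Split a text into lowercase tokens and drop the stop words."""
    # \w+ matches ordinary words and numbers; [^\w\s] keeps any single
    # symbol or sign as a token of its own.
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    return [token for token in tokens if token not in stop_words]


keywords = extract_keywords("He sat at the piano")
print(keywords)  # ['sat', 'piano']
```

In this sketch symbols and signs are kept as tokens, in line with the broad definition of “words” given above; whether punctuation should also be filtered out is a design choice the text leaves open.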
Applications of semantic text processing are widespread and encompass e.g. classification of text under certain keywords for relevance sorting, archiving, data mining and information retrieval purposes. Understanding the meaning of keywords in a text and predicting “meaningful” further keywords to occur in the text is for example useful for semantic query expansion in search engines. Last but not least, semantic text processing enhances the quality of machine translations by resolving ambiguities of a source text when considering its words in a larger semantic context.
Hitherto existing methods of semantic text processing, in particular for query expansion in search engines, work with large statistical indexes of keywords, their lemmas (lexical roots) and the statistical relations between the keywords in order to build large thesaurus files, statistics and dictionaries for relational analysis. Statistical methods are, however, limited in the depth of semantic analysis they can achieve when longer and more complex word sequences are considered.
On the other hand, neural networks are primarily used for recognizing patterns in complex and diverse data, such as object recognition in images or signal recognition in speech, music or measurement data. Neural networks have to be correctly “trained” with massive amounts of training data in order to be able to fulfill their recognition task when fed with “live” samples to be analyzed. Training a neural network is equivalent to configuring the internal connections between its network nodes (“neurons”) and the weights of those connections. The result of the training is a specific configuration of usually weighted connections within the neural network.
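That training amounts to configuring connection weights can be illustrated with a deliberately tiny sketch: a single neuron trained with the classic perceptron rule to reproduce the logical AND of two inputs. The perceptron rule, the names train_perceptron and predict, the learning rate and the sample data are all illustrative assumptions; the disclosure does not specify a particular training algorithm here.

```python
def predict(weights, bias, x):
    """Step-activated neuron: fires (1) if the weighted input sum exceeds 0."""
    return 1 if weights[0] * x[0] + weights[1] * x[1] + bias > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Adjust the two connection weights and the bias with the perceptron rule."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            # Each weight is shifted in proportion to the error and its input.
            weights[0] += learning_rate * error * x[0]
            weights[1] += learning_rate * error * x[1]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training data: the truth table of the logical AND function.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_SAMPLES)
print([predict(weights, bias, x) for x, _ in AND_SAMPLES])  # [0, 0, 0, 1]
```

The only thing the training loop changes is the weight and bias values; the trained “network” is nothing more than that final configuration, which is the point made in the paragraph above.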
Training a neural network is a complex task in its own right and involves setting a multitude of parameters, e.g. by means of iterative or adaptive algorithms. Training algorithms for neural networks can therefore be considered a technical means for building a neural network for a specific application.
While neural networks are currently in widespread use for pattern recognition in large amounts of numerical data, their application to text processing is at present limited by the machine-readable form in which a text can be presented to a neural network.