The invention relates to the field of automatic analysis of documents and the use of the results of such analyses.
The term “document” here means a set of data representing known or recognisable characters. It may be, in particular, a text made up of an ordered sequence of verbal entities such as words, groups of words, figures or alpha-numeric groups.
Moreover, the term “analysis” here means any type of check intended to determine whether a document has a meaning, possibly taking into account its context.
Finally, the phrase “use of the results” here denotes any operation or process that can be applied to an analysed document, for example with a view to translation (optionally simultaneous), information filtering (for example within the framework of electronic messaging management), the correction of spelling and/or grammar, the transcription of voice dictation, the generation of texts (such as abstracts or summaries), or the searching, by means of a search engine, of textual information accessible on private or public network servers (such as internet servers).
Numerous applications can be used to process plain language. They are based on different techniques such as syntax analysers, semantic networks or Bayesian models, sometimes associated with neural networks or modal fuzzy logic.
These techniques have certain advantages over first-generation search engines, which were limited by their reliance on keywords.
However, in some fields, these techniques are inadequate or even useless for processing plain language, because they neglect some of the information contained in the documents that are to be analysed.