The present invention relates generally to a corpus pattern paraphrasing system, and more particularly, but not by way of limitation, to a system for generalizing over individual instances to obtain patterns of paraphrases without requiring data in a corpus of sentences.
Mapping verbal usage to regular expressions has been considered. Conventional techniques have shown that regular expressions extracted from corpora can be learned and that they are instrumental to a wide range of applications involving semantic processing. Such conventional techniques involve the use of ontological categories.
Other conventional techniques rely on bags of words (i.e., a fixed number of lexical features) in order to predict the meaning of input content.
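As a minimal illustrative sketch of the bag-of-words representation referred to above, the following Python code maps a sentence onto a fixed number of lexical features. The vocabulary and the function name bag_of_words are hypothetical and chosen only for illustration; they are not taken from any particular conventional system.

```python
from collections import Counter

# Hypothetical fixed vocabulary; a real system would derive it from training data.
VOCABULARY = ["acquire", "buy", "company", "purchase", "shares"]


def bag_of_words(sentence, vocabulary=VOCABULARY):
    """Return a fixed-length count vector for the sentence.

    Word order is discarded and words outside the vocabulary are ignored.
    """
    counts = Counter(sentence.lower().split())
    return [counts[word] for word in vocabulary]


# Two sentences with different word order yield the same feature vector.
vec_a = bag_of_words("company buy shares")
vec_b = bag_of_words("shares buy company")
```

Because the vector has one slot per vocabulary entry, sentences that differ only in word order, or only in out-of-vocabulary words, become indistinguishable under this representation.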
To paraphrase sentences, conventional techniques have relied entirely on the information in a stored corpus of sentences and are incapable of generating a new paraphrase unless the corresponding data is already stored in the corpus of sentences.
However, there is a technical problem with the conventional techniques in that paraphrase prediction cannot be performed unless the related paraphrased sentence is stored in the corpus of sentences, which limits the prediction capabilities to those of a database.
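The database-like behavior of a purely corpus-backed approach can be sketched as follows. This is a simplified illustration under stated assumptions: the stored corpus STORED_PARAPHRASES and the function lookup_paraphrases are hypothetical and do not reproduce any specific conventional system.

```python
# Hypothetical stored corpus mapping a sentence to its known paraphrases.
STORED_PARAPHRASES = {
    "the firm bought the shares": ["the firm purchased the shares"],
}


def lookup_paraphrases(sentence, corpus=STORED_PARAPHRASES):
    """Return the stored paraphrases, or an empty list for an unseen sentence."""
    return corpus.get(sentence.lower().strip(), [])


# A stored sentence is found; a sentence absent from the corpus yields nothing.
seen = lookup_paraphrases("The firm bought the shares")
unseen = lookup_paraphrases("The firm bought the bonds")
```

In this sketch, a sentence that differs from a stored entry by a single word returns no paraphrase, since the lookup does not generalize beyond the stored instances.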