The present invention relates to automated processing of text strings and, in particular, to techniques for identifying superphrases of text strings.
Automated extraction of the key concepts contained in a string of text is a challenging problem. Words present in such a string may provide clues as to what the string is about, but prior knowledge regarding the concepts represented by those words is typically required. This is an issue in a variety of contexts including, for example, the field of automated search in which text strings, i.e., search queries, are matched to documents using a wide variety of techniques. The problem arises because of the lack of constraints imposed on users generating queries. That is, different users looking for documents relating to the same subject matter may submit radically different queries which nevertheless represent the same underlying concept(s). And while the mapping to underlying concepts might be readily apparent to a human, conventional applications which employ an automated approach to parsing and responding to search queries are not capable of appreciating such connections.