An entity can typically be identified by several different linguistic forms. For example, the company “International Business Machines Corporation” is commonly referred to as “IBM Corporation” or simply “IBM.” Different forms of the same entity can occur in queries and in documents, and they pose significant challenges for search engines at both the search phase and the result presentation phase.
At the search phase, the query a user poses to an information retrieval system may name an entity differently from how that entity is described or identified in the underlying data. For example, when the user searches for the person “Fred Doe”, the underlying data may record that person as “Frederick Doe.” The search engine therefore needs to recognize that “Frederick Doe” is a good match for “Fred Doe” in order to return the right results. Otherwise, it may return less relevant results to the user.
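As a rough illustration of the matching problem, the sketch below treats two names as variants when they reduce to the same canonical tokens. The alias table `NICKNAMES` and the helper names `canonical_tokens` and `is_variant_match` are hypothetical and chosen only for this example. They are not taken from any particular search engine, and a practical system would need a far larger alias resource and a more tolerant comparison.

```python
# Hypothetical alias table mapping short name forms to long forms.
# Assumption for illustration only; not drawn from the source text.
NICKNAMES = {
    "fred": "frederick",
    "bill": "william",
    "bob": "robert",
}


def canonical_tokens(name: str) -> tuple:
    """Lowercase a name and replace each known short form with its long form."""
    return tuple(NICKNAMES.get(token, token) for token in name.lower().split())


def is_variant_match(query: str, candidate: str) -> bool:
    """Return True when both names reduce to the same canonical tokens."""
    return canonical_tokens(query) == canonical_tokens(candidate)


if __name__ == "__main__":
    print(is_variant_match("Fred Doe", "Frederick Doe"))
    print(is_variant_match("Fred Doe", "Frank Doe"))
```

Normalizing both the query and the stored name to one canonical form keeps the comparison symmetric, so the same check applies whether the short form appears in the query or in the underlying data.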