Searching for information about entities (e.g., people, locations, organizations) in a large number of documents, including sources such as the World Wide Web, may often yield ambiguous results, which may lead to imprecise text processing and thus imprecise data analysis. For example, a reference to “Paris” could refer to a city in the country of France, cities in the states of Texas, Tennessee, or Illinois, or even a person (e.g., “Paris Hilton”). Associating entities with co-occurring features may prove helpful in disambiguating different entities.
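By way of illustration only, the following minimal sketch shows one way in which co-occurring features may help to disambiguate a mention such as “Paris.” The candidate entities, their feature sets, and the function name `disambiguate` are hypothetical and chosen solely for this example; an actual system may derive such features from a document corpus rather than from a hand-written table.

```python
import re
from typing import Optional

# Hypothetical feature profiles: words assumed to co-occur with each candidate entity.
PROFILES = {
    "Paris, France": {"france", "seine", "eiffel", "louvre"},
    "Paris, Texas": {"texas", "lamar", "county"},
    "Paris Hilton": {"hilton", "hotel", "heiress", "celebrity"},
}


def disambiguate(context: str) -> Optional[str]:
    """Return the candidate whose features overlap most with the context words."""
    tokens = set(re.findall(r"[a-z]+", context.lower()))
    # Score each candidate by the number of its features found in the context.
    scores = {entity: len(features & tokens) for entity, features in PROFILES.items()}
    best, best_score = max(scores.items(), key=lambda item: item[1])
    # With no overlapping feature, the mention remains ambiguous.
    return best if best_score > 0 else None


if __name__ == "__main__":
    print(disambiguate("Paris lies on the Seine, not far from the Louvre."))
    print(disambiguate("Paris is the county seat of Lamar County, Texas."))
```

In this sketch, a mention with no co-occurring features in its context is left unresolved rather than assigned to an arbitrary candidate.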
Large companies or organizations may hold vast amounts of information in large electronic document repositories. Generally, information stored in document format may be written in an unstructured manner. Searching for or identifying specific information in these document repositories may be tedious and/or troublesome. Identifying co-occurrence of different features together with entities, topics, events, keywords, and the like in a document corpus may help to better identify specific information in that corpus. The need for intelligent electronic assistants to aid in locating and/or discovering useful or desired information amongst the vast amounts of data may be significant.
Thus, a need exists for an intelligent electronic system for detecting and recording co-occurring features in a corpus of documents.