In the online space, where content is widely distributed and turns over frequently, searching and analyzing that content is difficult. These circumstances have not, however, made accurate search and analysis of online content any less desirable. A microcosm of this problem occurs in the context of companies and online earned media.
Earned media (or free media) is publicity gained through methods or promotional efforts other than paid advertising. It may be especially important to companies or other entities because it may be a cost-effective way to market products or services while engendering some degree of trust in consumers. Assessing earned media with respect to a given entity may be difficult for a variety of reasons, not the least of which is the lexical complexity of languages. For example, when the name of an entity of interest is a common word or has many homonyms (e.g., the word “apple” may refer to the company “Apple”, a piece of fruit, etc.), it may be difficult to separate relevant earned media from other content. Accordingly, current methods for searching and analyzing content have proved woefully inadequate in meeting the desires of companies or other entities with respect to locating and assessing associated earned media.
As may be imagined, these issues are not confined to the earned media context. Thus, while earned media provides a relevant example for describing the inadequacies of current systems, these inadequacies are not limited to that example; they apply almost universally across any online, networked environment where the search and analysis of electronic content is of importance.
In the main, the problems discussed exist because current systems and methods for search or analysis of content rely on what is basically a brute-force keyword search to determine relevant content. The reliance on keyword searching means that returned search results contain a large number of false positives (e.g., content that is returned because it contains the search term(s) but is not relevant) and suffer from a number of false negatives (e.g., relevant content that exists but is not returned in response to the search). To again utilize the earned media space as an example, when the company Apple wants to find earned media, it does not want to see articles dealing with apples (the fruit) or other businesses that have the word “apple” in their names.
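By way of illustration only, the failure mode described above may be sketched in a few lines of Python. The documents, their relevance labels, and the function names `keyword_search` and `classify_errors` are all hypothetical and are introduced solely for this example; they are not drawn from any particular existing system.

```python
import re

# Hypothetical, hand-labeled documents: (text, relevant to Apple the company?)
DOCUMENTS = [
    ("Apple unveils a new laptop at its annual developer event", True),
    ("How to bake an apple pie with fresh orchard fruit", False),
    ("Big Apple Bagels opens a second downtown location", False),
    ("The iPhone maker reported record quarterly revenue", True),
]


def keyword_search(term, documents):
    """Return indices of documents containing `term` as a whole word (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [i for i, (text, _) in enumerate(documents) if pattern.search(text)]


def classify_errors(returned, documents):
    """Split the search errors into false positives and false negatives."""
    returned_set = set(returned)
    # Returned by the search but labeled not relevant.
    false_positives = [i for i in returned if not documents[i][1]]
    # Labeled relevant but never returned by the search.
    false_negatives = [
        i for i, (_, relevant) in enumerate(documents)
        if relevant and i not in returned_set
    ]
    return false_positives, false_negatives


hits = keyword_search("apple", DOCUMENTS)
fp, fn = classify_errors(hits, DOCUMENTS)
print("returned:", hits)
print("false positives:", fp)
print("false negatives:", fn)
```

In this sketch, the search for “apple” returns the fruit article and the unrelated business alongside the relevant article (false positives), while the relevant article that never mentions the word is missed entirely (a false negative).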
As a consequence, finding and analyzing desired content is currently a time-consuming and error-prone process. What is desired, therefore, are improved systems and methods for the search and analysis of online content.