The present disclosure relates in general to extracting, organizing and analyzing intelligence gathered from the World Wide Web. More specifically, the present disclosure relates to systems and methodologies for efficiently extracting, organizing and analyzing targeted intelligence from a variety of web locations, such as blogs, forums, news sites, Twitter, Facebook and others.
The World Wide Web is a system of interlinked hypertext documents that are accessed via the internet. With a web browser, an entity can view web data that may contain text, images, videos, and other multimedia, and can navigate among such data via hyperlinks. Entities can also create and post web data containing text, images, videos and other multimedia. Thus, the web contains a vast amount of public commentary data on a wide array of subjects, and this data has the potential to provide useful intelligence to a given entity. For example, the nature and scope of a complaint posted on an internet forum or social network about the wait times at hospitals is potentially useful competitive intelligence for a healthcare provider.
However, the high volume and diversity of raw, unstructured web data make it challenging to transform that data into structured, meaningful and useful intelligence. To address this challenge, so-called social media monitoring (SMM) tools have been developed to gather and analyze web data. Although the term “social media” implies a focus on “social” sites such as Facebook or channels such as Twitter, SMM search tools pull web data from a variety of location types such as blogs, forums, news sites, review sites, and others. A typical SMM search tool works by crawling web locations continuously and tagging them. Once tagged, the web locations are searched using some form of keyword-based query or search string that a user develops to find so-called “mentions” of specific words and phrases on the tagged pages. The SMM search tool then brings these “mentions” back into the tool's interface, where they can be read and organized in different ways.
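The keyword-based search described above can be illustrated with a minimal sketch. It is not the implementation of any particular SMM search tool; the `Page` record, the `find_mentions` function, the `location_type` tag, the example URLs and the sample text are all hypothetical and are introduced only for illustration. The sketch assumes that crawled pages have already been tagged with a location type, and it returns each whole-word, case-insensitive keyword match together with a short snippet of surrounding text.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    """A crawled web location (hypothetical record, for illustration only)."""
    url: str
    location_type: str  # tag assigned after crawling, e.g. "forum" or "blog"
    text: str


def find_mentions(pages, keywords, location_types=None, context=30):
    """Return (url, keyword, snippet) for every keyword match on the tagged pages.

    If location_types is given, only pages carrying one of those tags are searched.
    """
    mentions = []
    for page in pages:
        if location_types is not None and page.location_type not in location_types:
            continue
        for keyword in keywords:
            # Whole-word, case-insensitive match of the literal keyword or phrase.
            pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
            for match in pattern.finditer(page.text):
                start = max(0, match.start() - context)
                end = min(len(page.text), match.end() + context)
                mentions.append((page.url, keyword, page.text[start:end]))
    return mentions


# Assumed sample data; the URLs and text are invented for this sketch.
pages = [
    Page("https://example.com/forum/1", "forum",
         "The wait times at the hospital were over three hours."),
    Page("https://example.com/blog/2", "blog",
         "Our weekend trip was great; no wait at the restaurant."),
]

for url, keyword, snippet in find_mentions(pages, ["wait"]):
    print(url, keyword, snippet)
```

In this sketch, the query "wait" matches both the hospital complaint and the unrelated restaurant remark, while the narrower phrase "wait times" matches only the former. A user of such a tool would typically have to read the returned mentions to separate relevant results from irrelevant ones.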
One way to convey intelligence about web data is through social media indices. Examples of social media indices include The Nation's Restaurant News (NRN) Social 200 (available on the worldwide web) and The Wine Industry Social Media Index (available on the worldwide web). The NRN Social 200 index in particular is a daily ranking of the social media activities of the nation's largest restaurant chains. This index quantifies restaurant brand efforts and consumer engagement with a score from 0 to 1,000. A typical index is intended to summarize and aggregate disparate data into a simple and general format. Thus, indices are intentionally very broad, not nuanced, and not particularly actionable at an entity level. For example, when Red Lobster sees that its 7-day NRN Social 200 index has gone up or down by 4.34%, it is difficult for Red Lobster executives to understand what is driving the change and to take targeted, responsive action, if necessary.
If entities desire to move beyond broad social media indices and extract web intelligence targeted to the top-level inquiries that are important to their space, they typically must make a tradeoff between the complexity and sophistication of their inquiry and the efficiency, cost and complexity of the resources needed to provide reliable and useful web data responses at a more granular level. This is because such tasks rely heavily on keyword-based SMM search tools, and there is generally an inverse relationship between the complexity and sophistication of an initial inquiry and the reliability of the search results returned by keyword-based SMM search tools. To provide useful and reliable web data in response to complex and sophisticated inquiries, entities must typically apply ad hoc, labor-intensive and unsystematic analysis on top of the keyword-based SMM search results.