For over fifty years, sociologists have used Social Network Analysis, sometimes called Organizational Network Analysis, to map human relationships, typically through a questionnaire-based survey in which each subject describes his or her relationship with the other individuals in the group under examination. Once the data have been gathered and entered, network analysis and network plotting are used to visualize and characterize the network, for instance by isolating connected components or calculating the average distance between individuals.
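As an illustration of the two measures just mentioned, the following minimal Python sketch isolates connected components and computes the average shortest-path distance within a component using breadth-first search. The `survey` data and the function names are hypothetical and serve only to show the idea; relationships are assumed to be undirected and unweighted.

```python
from collections import deque


def connected_components(graph):
    """Return a list of node sets, one per connected component."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def average_distance(graph, component):
    """Mean shortest-path length over all ordered pairs in one component."""
    total, pairs = 0, 0
    for source in component:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Hypothetical survey result: each person maps to the people they named.
survey = {
    "Ann": {"Bob"},
    "Bob": {"Ann", "Cem"},
    "Cem": {"Bob"},
    "Dia": {"Eli"},
    "Eli": {"Dia"},
}

for component in connected_components(survey):
    print(sorted(component), round(average_distance(survey, component), 2))
```

In this toy network there are two components; a production analysis would more likely rely on a dedicated graph library than on hand-written traversal.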
A similar approach can be used to examine public debates. Instead of collecting the data through individual surveys, the researcher can extract it from media and other public records, for instance by taking from newspaper coverage the names of individuals quoted in the debate about the economics of wind power.
This approach, which we have named Influencer Network Analysis (INA), is a method for automatically discovering relationships in media coverage through text mining and information extraction, followed by analytical processes that produce network visualizations and reports. These can help corporations and other organizations to understand, measure and predict media coverage, and to plan and implement efficient communication strategies.
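One simple way to turn extracted names into a network can be sketched as follows. The sketch assumes, purely for illustration, that two individuals are related when they are quoted in the same article, and that the strength of the relationship is the number of articles they share. The text above does not specify how INA defines a relationship, so the co-mention rule, the function name `co_mention_edges` and the sample names are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_mention_edges(articles):
    """Count, for each pair of names, the articles in which both appear.

    `articles` is a list of name lists, one list per article.
    This co-mention rule is an illustrative assumption, not the INA definition.
    """
    edges = Counter()
    for names in articles:
        # Deduplicate and sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(names)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical extraction result: the people quoted in three articles.
articles = [
    ["A. Jensen", "B. Olsen"],
    ["A. Jensen", "B. Olsen", "C. Meyer"],
    ["C. Meyer"],
]

edges = co_mention_edges(articles)
for (a, b), weight in sorted(edges.items()):
    print(a, "-", b, weight)
```

The resulting weighted edge list is the kind of input that network plotting and the measures described earlier would operate on.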
The method is applied on a project basis, usually focusing on a particular topic, issue, company or brand. The high level of automation permits economical processing of hundreds or thousands of articles, from which both structured information (fields) and unstructured information (text, for instance a newspaper article) are extracted.
Given the high volume of media coverage generated on certain issues, it is of key operational importance that as much of the data preparation and data mining as possible be performed by a computer in an automated, unsupervised mode. To perform this task computationally, the system must be able to extract the core entities in media reports automatically.
Information extraction is applied to the texts to recognize and extract named entities automatically and to classify them into predefined categories such as persons, organizations, locations and brands. The named entity extraction system combines linguistic, grammar-based techniques with statistical methods.
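The grammar-based side of such a system can be illustrated with a deliberately small Python sketch. It uses two hand-written patterns (a title followed by capitalized words for persons, capitalized words followed by a company suffix for organizations) and a lookup list for locations. The patterns, the `LOCATIONS` list, the function name and the sample sentence are hypothetical, and the statistical component mentioned above is not shown.

```python
import re

# Hypothetical rules; a real grammar would be far richer.
PERSON = re.compile(r"\b(?:Mr|Ms|Mrs|Dr|Prof)\.? ([A-Z][a-z]+(?: [A-Z][a-z]+)*)")
ORGANIZATION = re.compile(r"\b((?:[A-Z][A-Za-z]+ )+(?:Ltd|Inc|Corp|Group))\b")
LOCATIONS = {"Aarhus", "Copenhagen"}  # illustrative gazetteer


def extract_entities(text):
    """Return (entity, category) pairs found by the rules above."""
    entities = []
    for match in PERSON.finditer(text):
        entities.append((match.group(1), "person"))
    for match in ORGANIZATION.finditer(text):
        entities.append((match.group(1), "organization"))
    for place in sorted(LOCATIONS):
        # Whole-word lookup against the gazetteer.
        if re.search(r"\b" + re.escape(place) + r"\b", text):
            entities.append((place, "location"))
    return entities


sentence = ("Dr. Jane Smith of Northwind Energy Ltd said turbines "
            "near Aarhus are cost-effective.")
for entity, category in extract_entities(sentence):
    print(category, ":", entity)
```

Rules of this kind are precise but brittle, which is one reason to pair them with statistical methods that can generalize to names and phrasings the rules do not anticipate.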