1. Field of Invention
Aspects of embodiments relate to information processing. More particularly, aspects of embodiments relate to determining information content displayed by users of information processing systems including diverse sources of information and diverse display systems and methods.
2. Discussion of Related Art
Users of large networks, such as the Internet, interact with large numbers of diverse programs for generating, storing, communicating and otherwise manipulating user-generated content. For example, a user of a social networking Internet site or component of a site may store contact information for their contacts in records on the site that include fields such as name, address, and multiple phone numbers. A user of a photo/video Internet site or component of a site may store photo or video records that include the image data, tags, descriptions and comments. Such user-generated content is of diverse types and varying formats. Moreover, the structure and mark-up of the content can change frequently. For various purposes, including further manipulation of the information or data collection regarding the information, determining the structure and content of the information may be desired.
Manually inducing the structure is both tedious and highly error-prone. Given the large number of data sources and the extreme diversity of structures they use, analyzing the data sources individually requires substantial computer and/or human time.
Normally, data transfer between programs is accomplished using data structures suited for automated processing by computers, not people. Such interchange formats and protocols are typically rigidly structured, well-documented, easily parsed, and keep ambiguity to a minimum. Very often, these transmissions are not human-readable at all.
In contrast, output intended to be human-readable is often the antithesis of this, with display formatting, redundant labels, superfluous commentary, hidden and embedded metadata and other information which is either irrelevant or inimical to automated processing. However, if the person, entity or computer program seeking to analyze data being transferred from one program to another or from an Internet site to a user can only intercept such human-oriented display data, screen scraping may be employed. In order to perform screen scraping, the structure of the display information must be known so that the content can be successfully parsed.
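By way of illustration only, the following minimal sketch contrasts a machine-oriented record with a human-oriented display of the same contact information, and shows why a screen scraper depends on prior knowledge of the display structure. The record, the mark-up, the class names `name` and `phone`, and the helper `scrape_contact` are all hypothetical and are not drawn from any particular site or embodiment.

```python
import json
from html.parser import HTMLParser

# Hypothetical machine-oriented record: rigidly structured and trivially parsed.
MACHINE_RECORD = '{"name": "Ada Lovelace", "phone": "555-0100"}'

# Hypothetical human-oriented display of the same contact, with display
# formatting, redundant labels, and commentary mixed in with the content.
DISPLAY_PAGE = """
<div class="contact">
  <h2>Contact details</h2>
  <p>Name: <span class="name">Ada Lovelace</span></p>
  <p>Phone: <span class="phone">555-0100</span> (mobile)</p>
</div>
"""


class ContactScraper(HTMLParser):
    """Collects the text inside <span class="..."> for the expected classes."""

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.current = None  # class of the span currently being read, if any
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in self.wanted:
            self.current = cls

    def handle_data(self, data):
        if self.current is not None:
            self.fields[self.current] = self.fields.get(self.current, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self.current = None


def scrape_contact(page, wanted_classes=("name", "phone")):
    """Extract contact fields, assuming the page structure is known in advance."""
    scraper = ContactScraper(wanted_classes)
    scraper.feed(page)
    scraper.close()
    return scraper.fields


if __name__ == "__main__":
    print(json.loads(MACHINE_RECORD))
    print(scrape_contact(DISPLAY_PAGE))
    # If the site renames its mark-up, the assumed structure no longer matches.
    print(scrape_contact(DISPLAY_PAGE.replace('class="phone"', 'class="tel"')))
```

In this sketch the scraper recovers the same fields as the structured record only while its assumed class names match the page. A single change to the mark-up causes the affected field to be silently missed, which is why the structure of the display information must be determined before the content can be parsed.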