The advent of modern computing and networking technologies has brought about an explosion of information that is increasingly available to the public. Widespread access to networks, such as the Internet and intranets, has fueled robust growth in demand for both media content and delivery channels which, in turn, has increased the desire for rapid access to news and information, local content such as emails and electronic documents, and metadata pertaining thereto. Metadata is generally defined as “data about data.” In a content management and information architecture, metadata generally refers to information about objects or entities. Thus, metadata can pertain to information about a document, an image, news stories, information on blogs, and so on.
A number of vendors, organizations, consortia, international standards bodies, and working groups are developing (or have developed) metadata recommendations and standards. For example, the IFLA (International Federation of Library Associations and Institutions) is an international body representing the interests of library and information services and their users. See IFLA website. The IETF (Internet Engineering Task Force) has a number of projects underway to define metadata usage on the Internet and Web, such as the Common Indexing Protocol (CIP) and the Uniform Resource Name (URN).
The Handle System® is a distributed computer system which stores names, or handles, of digital items and which can quickly resolve those names into the information necessary to locate and access the items. It was designed by CNRI (Corporation for National Research Initiatives®) as a general purpose global system for the reliable management of information on networks such as the Internet over long periods of time and is currently in use in a number of prototype projects, including efforts with the Library of Congress, the Defense Technical Information Center, the International DOI® (Digital Object Identifier) Foundation, and the National Music Publishers' Association.
In addition, the World Wide Web Consortium's (W3C) Metadata Activity Group is developing ways to model and encode metadata. The group has developed RDF (Resource Description Framework) and PICS (Platform for Internet Content Selection). See World Wide Web Consortium's website information pertaining to metadata. Finally, the Dublin Core is an attempt at standardizing a core set of metadata elements. RFC 2413 (Dublin Core Metadata for Resource Discovery, September 1998) describes the metadata elements. See Dublin Core website.
Descriptive metadata may identify resources so that they can be searched for and retrieved at the web level. For example, descriptive metadata may be used to facilitate searching the Web to find an image collection pertaining to major league baseball players, and/or to enable users to discover digitized collections of information pertaining to the Civil War. Structural metadata may be used to facilitate navigation and presentation of electronic resources, and to provide information about the internal structure of resources, including page, section, and chapter numbering, indexes, and tables of contents. Structural metadata may also be used, for example, to describe relationships among materials (e.g., photograph B was included in manuscript A) and/or to bind related files and scripts (e.g., File A is the JPEG format of the archival image File B).
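By way of illustration only, the distinction between descriptive and structural metadata can be sketched as a small data structure. The class names, field names, relation labels, titles, and identifiers below are hypothetical and are not drawn from any of the standards mentioned above; the sketch merely mirrors the photograph/manuscript and File A/File B examples.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DescriptiveMetadata:
    """Identifying information used for search and retrieval."""
    title: str
    subjects: List[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A structural link from one resource to another."""
    source: str
    relation: str  # hypothetical label, e.g. "included_in"
    target: str


@dataclass
class Resource:
    identifier: str
    descriptive: DescriptiveMetadata
    structural: List[Relationship] = field(default_factory=list)


def find_by_subject(resources: List[Resource], subject: str) -> List[str]:
    """Return identifiers of resources whose descriptive metadata lists the subject."""
    wanted = subject.lower()
    return [
        r.identifier
        for r in resources
        if wanted in (s.lower() for s in r.descriptive.subjects)
    ]


# Hypothetical sample records (titles and subjects are invented for illustration).
photo_b = Resource(
    identifier="photograph-B",
    descriptive=DescriptiveMetadata("Camp scene", ["Civil War", "photographs"]),
    structural=[Relationship("photograph-B", "included_in", "manuscript-A")],
)
file_a = Resource(
    identifier="file-A",
    descriptive=DescriptiveMetadata("Camp scene (access copy)", ["Civil War"]),
    structural=[Relationship("file-A", "jpeg_format_of", "file-B")],
)
roster = Resource(
    identifier="roster-1",
    descriptive=DescriptiveMetadata("Team roster", ["major league baseball"]),
)

collection = [photo_b, file_a, roster]
```

In this sketch, a call such as `find_by_subject(collection, "Civil War")` relies only on the descriptive metadata, whereas the `structural` entries record how one resource relates to another and play no part in the search.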
The ability to quickly gather large amounts of unstructured content, such as news information, emails, and locally stored electronic documents and content, and to distribute relevant information to end users may provide a competitive advantage to such end users. For example, rapidly providing metadata pertaining to financial news stories to end users may enable those end users to acquire and use this information before others can gain access to and react to the information.
Known systems have been utilized in efforts to rapidly provide metadata to end users. Typically, these systems complete the entire formation of metadata before transmitting any metadata to client computers or processing devices.
Aspects of the present invention are directed to formulating metadata pertaining to unstructured content, such as news information, emails, and locally stored electronic documents, and to providing staged delivery of that metadata, for example to client computers and/or end users, with each stage providing an increasing amount of metadata content, in a manner that overcomes certain limitations associated with known systems and methods.
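As a minimal sketch of what staged delivery might look like, and not a description of any particular embodiment, the following generator emits a cumulative metadata snapshot after each processing step, so that a client can act on early fields without waiting for the complete set. The function names `extract_stages` and `deliver`, the `send` callback, and the three stages chosen (headline, word count, toy keyword scan) are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterator, List


def extract_stages(text: str) -> Iterator[Dict[str, object]]:
    """Yield cumulative metadata snapshots, cheapest fields first."""
    metadata: Dict[str, object] = {}

    # Stage 1: a field that is available almost immediately.
    lines = text.strip().splitlines()
    metadata["headline"] = lines[0] if lines else ""
    yield dict(metadata)

    # Stage 2: a single pass over the whole text.
    words = text.split()
    metadata["word_count"] = len(words)
    yield dict(metadata)

    # Stage 3: the slowest step; a toy keyword scan stands in for
    # heavier analysis such as entity extraction.
    keywords: List[str] = []
    for word in words:
        token = word.strip(".,;:!?").lower()
        if len(token) > 6 and token not in keywords:
            keywords.append(token)
    metadata["keywords"] = keywords[:5]
    yield dict(metadata)


def deliver(text: str, send: Callable[[int, Dict[str, object]], None]) -> None:
    """Send each snapshot as soon as it is ready instead of waiting for all stages."""
    for stage, snapshot in enumerate(extract_stages(text), start=1):
        send(stage, snapshot)


if __name__ == "__main__":
    story = (
        "Acme shares rise\n"
        "Acme Corporation reported quarterly earnings above forecasts."
    )
    deliver(story, lambda stage, snapshot: print(stage, snapshot))
```

Because each snapshot is a copy that contains everything produced so far, a client that receives only the first stage still holds a usable, if partial, metadata record. By contrast, a system that completes the entire formation of metadata first would call `send` only once, after the third stage.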