1. Field of the Art
This invention relates generally to information extraction and distribution and more particularly to the extraction and distribution of customizable and measurable information feeds to users.
2. Description of the Related Arts
Many entities spend millions of dollars in communicating with their constituencies. These constituencies may represent current, past and potential customers, employees, shareholders, and business partners. However, it is a challenging task to effectively communicate with such constituencies. For example, it is difficult for a business to deliver the right marketing message to the right customer at the right time.
Typically, entities use telephone calls, face-to-face meetings, advertising, web sites, and e-mails to communicate with their constituencies. However, each of these methods has limitations. Telephone calls and face-to-face meetings cannot reach a large number of constituents in a finite amount of time. Advertising may be poorly targeted and not cost effective. Web sites may have difficulty attracting repeat visitors. E-mails face limitations due to viruses, spam, customer resentment and apathy, and the lack of personalization. In addition, recent legislation such as the Controlling the Assault of Non-Solicited Pornography and Marketing Act of 2003 imposes significant restrictions on entities in their use of e-mail correspondence.
Some entities try to overcome these limitations by using information syndication technologies such as rich site summary or really (or real) simple syndication (“RSS”) and Atom, both of which are generally referred to as feeds. A feed is published by placing information in a file with extensible markup language (“XML”) tags and saving the file on a server such as a web server. Users can use client-side agents such as aggregators, portals, or browsers to monitor these files, detect changes to the information (e.g., via the XML metadata), and download updates if appropriate. Feeds have many advantages over traditional communication methods, including cost effectiveness, potentially higher user opt-in rates (e.g., since an e-mail address is not necessary to subscribe to a feed), compliance with related laws and regulations, and presently, a lack of viruses and spam.
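By way of illustration only, the monitoring performed by a client-side agent may be sketched as follows. The sketch assumes an RSS 2.0 style file in which each item carries a guid element; the sample feed, the function name new_items, and the example.com addresses are hypothetical and are not taken from any particular aggregator.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed file as it might be saved on a web server.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Product News</title>
    <link>http://www.example.com/</link>
    <description>Hypothetical feed used for illustration</description>
    <item>
      <title>New model announced</title>
      <link>http://www.example.com/news/2</link>
      <guid>news-2</guid>
    </item>
    <item>
      <title>Price update</title>
      <link>http://www.example.com/news/1</link>
      <guid>news-1</guid>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return (guid, title) pairs for items the agent has not seen before."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when an item has no guid (an assumption
        # of this sketch, not a requirement stated in the text).
        guid = item.findtext("guid") or item.findtext("link")
        if guid not in seen_guids:
            fresh.append((guid, item.findtext("title")))
    return fresh


if __name__ == "__main__":
    # The agent has already shown "news-1" to the user.
    print(new_items(SAMPLE_FEED, {"news-1"}))
```

In this sketch the agent keeps a set of identifiers it has already presented and reports only items whose identifiers are absent from that set, which is one simple way of detecting changes from the XML metadata.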
Nevertheless, feeds have many limitations, including the difficulty for non-technical people to create a feed, the lack of personalization, and the lack of a way to measure the effectiveness of feed communications. For example, early adopters of feed publishing have hundreds of feeds on their web sites. Thus, users are forced to guess and select which feeds are desirable to them. In addition, it is difficult for an entity to understand which feeds, if any, are effective in meeting its communication objectives, since there is not an available method to measure and analyze the effectiveness of feed communications. Furthermore, designing an effective feed is difficult since entities are communicating with a client-side agent to gain a user's attention. Such a client-side agent may become increasingly sophisticated and vital to an entity's objectives.
To address the problems of end-user usability, an auto-discovery technique was developed that allows a client-side agent to automatically discover the availability of feeds at a particular network location. In this auto-discovery technique, a user's client-side agent searches pages on a web site for a hypertext markup language (“HTML”) tag that indicates support for feeds. The client-side agent then records a uniform resource identifier (“URI”) of the feed, such as a uniform resource locator (“URL”), to allow the user to subscribe to the feed. However, even though auto-discovery makes feeds easier to discover, it still lacks the ability to create personalized feeds.
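By way of illustration only, auto-discovery may be sketched as follows. The sketch assumes the commonly used convention in which a page advertises a feed through a link tag having rel="alternate" and a feed media type; the class name FeedLinkFinder, the function name discover_feeds, and the example.com addresses are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Media types assumed here to indicate RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects feed URLs advertised by <link rel="alternate"> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feed_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        rels = attributes.get("rel", "").lower().split()
        media_type = attributes.get("type", "").lower()
        href = attributes.get("href", "")
        if "alternate" in rels and media_type in FEED_TYPES and href:
            # Resolve relative references against the page address.
            self.feed_urls.append(urljoin(self.base_url, href))


def discover_feeds(html, base_url):
    """Return the feed URLs advertised in one HTML page."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feed_urls


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<link rel="alternate" type="application/rss+xml" href="/feeds/news.xml">'
        '<link rel="stylesheet" href="/site.css">'
        '</head><body></body></html>'
    )
    print(discover_feeds(page, "http://www.example.com/index.html"))
```

Every user whose agent visits the same page obtains the same feed address, which illustrates why auto-discovery alone does not yield a personalized feed.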
In addition, the information needs of entities and their constituencies change over time. For example, a business's products and services may be introduced, sold, supported, and ultimately removed from the marketplace. A customer's interest in a business's products and services may change based on competitors' pricing. As a result, the business may no longer have information to send to a particular customer via a feed, and the customer may find the feed less relevant to his or her interests. One solution is to insert content into the feed suggesting that the customer subscribe to a new feed. But requiring the customer to unsubscribe, visit a web site, and re-subscribe to a new feed is a hassle for the customer, which may eventually decrease feed subscriptions.
In addition, because feeds are delivered by means of a URL, such URLs can be discovered by software agents and/or shared with other users via a variety of methods such as Outline Processor Markup Language (“OPML”) files, search engines, and directories. This presents problems for entities that wish to deliver personalized information via feeds and/or wish to measure feed use on a per-subscriber basis. It also presents problems for subscribers, who may receive irrelevant content and/or who may intentionally or unintentionally customize another subscriber's feed.
Further, feeds are taxing on the systems that serve them because automatic user agents poll the server repeatedly (at a preset interval) for information updates. This may cause severe spikes in server load, or bandwidth spikes that exceed thresholds and thus result in excessive charges. This problem may get worse as more real-time data is placed in RSS feeds and user agents increase the frequency of their requests. In addition, if systems are unavailable due to maintenance or failure, user agents typically return error messages, which is an unsatisfactory experience for users.
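By way of illustration only, the load that polling places on a server may be estimated as follows. The function name polling_load and all of the figures (subscriber count, polling interval, feed size) are hypothetical, and the estimate assumes that every agent downloads the entire feed file on every poll.

```python
def polling_load(subscribers, poll_interval_s, feed_bytes):
    """Estimate hourly requests and bytes served for one feed.

    Assumes each subscriber's agent fetches the whole feed file once
    per polling interval, whether or not the feed has changed.
    """
    requests_per_hour = subscribers * 3600 / poll_interval_s
    bytes_per_hour = requests_per_hour * feed_bytes
    return requests_per_hour, bytes_per_hour


if __name__ == "__main__":
    # Hypothetical figures: 10,000 subscribers and a 50,000-byte feed.
    for interval in (1800, 900):
        requests, traffic = polling_load(10_000, interval, 50_000)
        print(interval, requests, traffic)
```

Under these assumptions, halving the polling interval doubles both the number of requests and the bandwidth consumed, even when the content of the feed has not changed.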
Still another problem is the labor required to maintain a feed. This is often compounded by the repetitive effort of maintaining a web site while separately maintaining a feed. Hence, duplicative efforts are necessary to keep multiple sources updated.
In addition, options for formatting feeds are limited to manual tools. There are limitations associated with formatting feeds, which include issues involving control of publishing processes, getting subscribers relevant content, content appearance, and measurability relative to supplied feeds. Thus, there remains a difficulty in extracting information for a feed as well as formatting that information into a feed.
Therefore, in view of these shortcomings in the art, there is a need for (1) a technique that allows feed personalization in an auto-discovery environment, (2) feeds that provide continuous monitoring of feed use to enhance feed relevancy and personalization, (3) a technique for securing, authenticating, and identifying feed publishers, feeds, and feed subscribers, (4) a technique for distributing the load and handling availability while maintaining an entity's desired quality of service, and (5) an automated method for capturing updates from a resource and extracting relevant data into a feed with appropriate formatting.