The present invention relates to content provisioning, and more particularly, but not exclusively, to a system and method for syndicated data stream content provisioning.
As the volume of data accessible via computer systems continues to increase, the need for automated tools that help the user home in on the information valuable to him within the data also increases. Queries to search engines are routinely employed to find relevant information on the many web pages or in enterprise data.
A substantial portion of the consumption of digitized content is gradually shifting from query-driven search in the web or enterprise digital libraries to passively following the information continuously made available through various syndicated data streams of content. The syndicated data streams include, but are not limited to, RSS (Really Simple Syndication) feeds, streams of user messages on social network platforms (say, Twitter™), Atom syndication format feeds, e-mail messages from a mail server, web site content delivered using a dedicated API (such as Facebook™ Connect), etc., as known in the art.
Typically, a user subscribes to one or more syndicated data streams, such as an RSS news feed, a web feed, or a stream of user messages in a social network platform, as known in the art. Optionally, the data stream is syndicated by a content provider (say by a news network, such as CNN™), by a Web Blogger, etc. The syndicated data stream is used by the content provider, to provide users who subscribe to the syndicated data stream with a frequently updated, structured list of content objects.
A content object is a data object with a usually short textual description which may be presented to the user. The textual description may include a content item itself (say a user message published on a social network platform), a button which presents a content item to the user upon clicking on the button, a hyperlink which links the user to a page of a web site, a link which initiates downloading of the content item, etc., as known in the art.
In a typical current scenario of syndicated data stream provisioning, a content provider publishes a feed link on the provider's web site. End-users can subscribe to the feed, using an aggregator program (also called a feed reader or a news reader). The aggregator program runs on the end-users' machines. The subscription process is usually as simple as dragging the link from the web browser to the aggregator. When instructed, the aggregator asks all the servers in its feed list if they have new content. If so, the aggregator either makes a note of the new content or downloads it. Aggregators can be scheduled to check for new content periodically.
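The polling behavior described above can be illustrated with a minimal Python sketch. It is not taken from any particular aggregator program; the class name `Aggregator`, the injected `fetch` callable, and the use of `<guid>` or `<link>` as an item identifier are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Set


class Aggregator:
    """Toy feed reader: polls subscribed feeds and reports unseen items."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        # fetch(url) returns the feed XML as text. It is injected (a
        # hypothetical helper) so the sketch runs without network access.
        self._fetch = fetch
        # Maps each subscribed feed URL to the item ids already noted.
        self._seen: Dict[str, Set[str]] = {}

    def subscribe(self, feed_url: str) -> None:
        """Add a feed link to the aggregator's feed list."""
        self._seen.setdefault(feed_url, set())

    def poll(self) -> Dict[str, List[str]]:
        """Ask every feed in the list for content; return titles of new items."""
        new_items: Dict[str, List[str]] = {}
        for url, seen in self._seen.items():
            root = ET.fromstring(self._fetch(url))
            for item in root.iter("item"):
                # Assumption: <guid>, or failing that <link>, identifies an item.
                item_id = item.findtext("guid") or item.findtext("link") or ""
                if item_id and item_id not in seen:
                    seen.add(item_id)
                    title = item.findtext("title", default="")
                    new_items.setdefault(url, []).append(title)
        return new_items
```

A periodic check could be approximated by calling `poll()` from a timer or a loop with a sleep interval. A real aggregator would additionally handle Atom feeds, network errors, and HTTP caching headers, none of which are shown here.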
The motivation for the shift from query-driven search in the web or enterprise digital libraries to passively following the information continuously made available through various syndicated data streams is manifold. In particular, users do not want to miss information they consider valuable, yet their interests are typically wide and dynamic, and thus are not easy to express via standard queries.
Subscription-based information consumption from syndicated data streams, such as RSS feeds or a user-specific Twitter™ feed of messages from the Twitter™ followers of a user, provides the user with the potential for timely access to information valuable to him. However, subscription-based information consumption also creates an overload of data provided to the user from one or more data streams.
It is unavoidable that most pieces of information (i.e. content objects, say video clips, news reports, articles, Twitter™ short messages (“Twits”), etc.) distributed via even a single syndicated data stream (say a single RSS feed) are not optimized for a specific single user.
Even if all the streams the user subscribed to are fully personal, the multitude of such streams may still create the effect of an information overload. Consequently, the user has to separate signal from noise, and browse through a multitude of content objects, in an attempt to locate the content objects that are more valuable for the user. As this process is time-consuming, for many users getting to the valuable content becomes virtually impossible.
Several current systems target information overload on the user and aim at reducing the burden put on the user.
For example, the Google™ Reader system allows the user to browse through content objects not only in the chronological order of creation, but also in an order reflecting the rate at which various streams the user subscribed to are updated.
FeedHub™, an mSpoke™ powered system, exploits content categorization, together with absolute, explicit feedback of a user on content objects (of the form “I like this item”), to establish a relevance ranking, which is a relative numerical estimate of the statistical likelihood that a certain content object is of interest to the user. The efficiency of this method heavily relies on the user's willingness to continuously provide explicit feedback on content objects provided to him, as well as on credibility and semantic consistency of the explicit feedback.
U.S. Pat. No. 7,089,237, to Turnbull et al., filed on Jan. 26, 2001, discloses a search and recommendation system, which employs the preferences and profiles of individual users and groups within a community of users.
U.S. Pat. No. 7,345,232, to Toivonen, filed on Nov. 6, 2003, describes a play list selection module, which creates a play list of media pieces stored in databases accessible by the selection module. The play list includes media pieces that are adapted to a specific user together with randomly selected media pieces, in a ratio determined by a surprise factor provided by the specific user. The play list may begin with a set of media pieces that are known to be enjoyed by the specific user, say media pieces previously consumed by the specific user, and end with the randomly selected media pieces.
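The idea of mixing familiar and randomly selected pieces in a ratio set by a surprise factor can be sketched as follows. This is an illustrative reading only, not the method of the cited patent; the function name `build_playlist`, the interpretation of the surprise factor as a fraction between 0 and 1, and the rounding rule are all assumptions.

```python
import random
from typing import List, Optional, Sequence


def build_playlist(liked: Sequence[str], catalog: Sequence[str], length: int,
                   surprise: float,
                   rng: Optional[random.Random] = None) -> List[str]:
    """Return up to `length` pieces: familiar ones first, random ones last.

    Assumption: `surprise` in [0, 1] is the fraction of the play list
    that is drawn at random from the catalog.
    """
    if not 0.0 <= surprise <= 1.0:
        raise ValueError("surprise must be between 0 and 1")
    rng = rng or random.Random()
    n_random = round(length * surprise)
    n_liked = length - n_random
    # Start with pieces the user is already known to enjoy.
    familiar = list(liked[:n_liked])
    # Draw the random part from pieces not already chosen.
    pool = [piece for piece in catalog if piece not in familiar]
    randoms = rng.sample(pool, min(n_random, len(pool)))
    return familiar + randoms
```

For example, `build_playlist(["a", "b", "c"], ["a", "b", "c", "d", "e", "f"], length=4, surprise=0.5)` would start with `"a"` and `"b"` and end with two pieces drawn at random from the remaining four.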
U.S. patent application Ser. No. 10/727,444, to Karnawat, filed on Dec. 3, 2003, describes a personalized internet search system based on context-based user feedback gathered regarding searches performed on a search mechanism.
U.S. patent application Ser. No. 10/861,154, to Michelitsch et al., filed on Jun. 4, 2004, describes a content recommendation device with user feedback. The device described by Michelitsch includes a selection engine that selects content objects from a content object pool according to a user profile.
U.S. Pat. No. 7,089,194, to Berstis et al., filed on Jun. 17, 1999, describes a method and apparatus for adaptively targeting advertisements to a specific client computer from a server within a distributed data processing system. As a user of the client browses the internet, the material that is downloaded to the client is scanned to generate a list of keywords. The keywords are used to select an advertisement from a database, the selected advertisement is inserted into the material, and the material and selected advertisement are presented together to the user.
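The scan, select, and insert sequence described above can be pictured with a short Python sketch. It is a simplified illustration and not the method of the cited patent; the stopword list, the frequency-based keyword extraction, the tag-overlap scoring, and all function names are assumptions.

```python
import re
from collections import Counter
from typing import Dict, List, Optional, Sequence, Set

# Assumption: a tiny stopword list is enough for the illustration.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def extract_keywords(text: str, top_n: int = 5) -> List[str]:
    """Scan downloaded material and return its most frequent words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


def select_ad(keywords: Sequence[str], ads: Dict[str, Set[str]]) -> Optional[str]:
    """Pick the ad whose tags overlap most with the keywords, if any overlap."""
    keyword_set = set(keywords)
    best, best_score = None, 0
    for ad_text, tags in ads.items():
        score = len(tags & keyword_set)
        if score > best_score:
            best, best_score = ad_text, score
    return best


def insert_ad(material: str, ad: Optional[str]) -> str:
    """Append the selected ad to the material so both are shown together."""
    return material if ad is None else f"{material}\n[Advertisement] {ad}"
```

Here the advertisement database is modeled as a plain dictionary mapping ad text to a set of tags. A page about flights to Paris would match an ad tagged `flights` rather than one tagged `soccer`.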