Information technology is often used to provide users with various types of information, such as text, audio, video, and any other suitable type of information. In some cases, information is provided to a user in response to an action that the user has taken. For example, information may be provided to a user in response to a search query input by the user or in response to the user's having subscribed to content such as an e-mail alert(s) or an electronic newsletter(s). In other cases, information is provided or “pushed” to a user without the user having specifically requested such information. For example, a user may occasionally be presented with advertisements or solicitations.
There is a vast array of content that can be provided to users via information technology. Indeed, because of the enormous volume of information available via the Internet, the World Wide Web (WWW), and any other suitable information provisioning sources, and because the available information is distributed across an enormous number of independently owned and operated networks and servers, locating information of interest to users presents challenges. Similar challenges exist when the information of interest is distributed across large private networks.
Search engines have been developed to aid users in locating desired content on the Internet. A search engine is a computer program that receives a search query from a user (e.g., in the form of a set of keywords) indicative of content desired by the user, and returns information and/or hyperlinks to information that the search engine determines to be relevant to the user's search query.
Search engines typically work by retrieving a large number of web pages and/or other content using a computer program called a “web crawler” that explores the WWW in an automated fashion (e.g., following every hyperlink that it comes across in each web page that it browses). The located web pages and/or content are analyzed, and information about the web pages and/or content is stored in an index. When a user or an application issues a search query to the search engine, the search engine uses the index to identify the web pages and/or content that it determines to best match the search query and returns a list of the best-matching results. Frequently, this list is in the form of one or more web pages that include a set of hyperlinks to the web pages and/or content determined to best match the search query.
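By way of illustration only, the crawl, index, and query steps described above may be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular search engine: the in-memory `WEB` dictionary, the example URLs, the function names `crawl`, `build_index`, and `search`, and the scoring rule (the number of distinct query terms a page contains) are all hypothetical and chosen only for brevity.

```python
from collections import defaultdict, deque
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing hyperlinks).
WEB = {
    "a.example/home": ("welcome to the home page about search engines",
                       ["a.example/crawl", "a.example/index"]),
    "a.example/crawl": ("a web crawler follows every hyperlink it finds",
                        ["a.example/home"]),
    "a.example/index": ("an index maps terms to pages for search", []),
    "a.example/orphan": ("this page is not linked from anywhere", []),
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crawl(seed, web):
    """Breadth-first crawl: follow every hyperlink reachable from the seed."""
    seen, queue = {seed}, deque([seed])
    pages = {}
    while queue:
        url = queue.popleft()
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Build an inverted index mapping each term to the URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index


def search(index, query):
    """Rank URLs by how many distinct query terms they contain (assumed rule)."""
    scores = defaultdict(int)
    for term in set(tokenize(query)):
        for url in index.get(term, ()):
            scores[url] += 1
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))


pages = crawl("a.example/home", WEB)
index = build_index(pages)
results = search(index, "search index")
# results == ["a.example/index", "a.example/home"]
```

In this sketch, the page at `a.example/orphan` is never indexed because no crawled page links to it, which mirrors how a crawler that only follows hyperlinks can miss unlinked content. A practical system would additionally fetch pages over a network, parse markup, and apply a more refined relevance measure.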
The sheer volume of content accessible via digital information systems presents a number of information retrieval problems. One challenging problem is how to determine what information, in a large set of content, may be of interest to users so that such information may be presented to the users without overwhelming them with irrelevant information. A related problem is how to determine what information may be of interest to users who may be searching for information in a large set of content by using terms that appear infrequently in the set of content being searched. Accordingly, the inventors have recognized the need for techniques for identifying information of interest to users in a large set of content and presenting such information to the users. This need is addressed herein with a new methodology and applications.