(1) Field of the Invention
The present invention is related to a method and an apparatus for computer networks such as the Internet, wide area networks (WANs), metropolitan area networks (MANs), and local area networks (LANs). More specifically, it provides a method and an apparatus that allows for correlating interests of people by tracking and analyzing their actions in a computer environment or based on their actions with respect to items cataloged in a computer environment.
(2) Background of the Invention
The Internet connects thousands of individuals as well as many disparate networks across the world in industries such as education, military, government, research, and others. The Internet utilizes transmission control protocol/Internet protocol (TCP/IP) as a standard for transmitting information. An intranet is a local area network supporting a single organization, for example a company or an educational institution. Through an intranet, users may partake in various activities such as e-mailing, web browsing, and transferring files. The growth of the Internet as a means of communication has been explosive, with the World Wide Web becoming a major marketing channel as well as a major source of information and e-mail becoming a major means of communication among the population.
Because of the computerized nature of the Internet and other networks, a rich source of tracking data is available which may be beneficially correlated. Through the use of various networks, people are able to communicate as well as to search for information from various sources such as web sites. The networking environment provides an opportunity for potentially meaningful and productive work-related interaction among users. To promote user interaction, it is desirable to correlate certain user history or access data and to make the correlation results available to other users. It is therefore desirable to have a method and apparatus for users connected to a network to access information regarding others with whom the users' histories correlate, and to provide the users with data to allow them to determine others with similar, common interests.
In the prior art, methods have been developed for identifying people with common interests for the purpose of providing recommendations regarding various information sources. Systems of this type typically require explicit input of current user preferences in order to predict future preferences through computer analysis of content. In some cases, these systems require the system developer to categorize content items into predetermined classes. Therefore, it is also desirable to provide a system that does not require computer analysis of the content of information sources accessed, and which does not require that information be pre-categorized in any way. These characteristics are particularly advantageous if the types of information being accessed are multimedia rather than text, where performing meaningful content analysis of video or audio sources can be far more difficult than it is with text-only sources.
A related method for information filtering is known as collaborative filtering. Instead of attempting to analyze documents based on keywords or content, collaborative filtering techniques transform each user into the role of a critic or magazine editor. Any given individual is capable of deciding what they like or dislike, and whether the information they are looking at is relevant to their current interests or needs. The user merely has to organize and rank the information he or she sees in terms of his or her own personal evaluation criteria. If a number of users have similar evaluation criteria, sharing the results of their evaluations can provide each user with the benefits of exposure to a much broader range of relevant information. In this way, each member of the group serves as a "recognition engine" to identify and evaluate information that might be appropriate to share with other users. Because this evaluation is performed by human minds, associations based on deep understanding are possible for information in diverse formats, whether they be speech, images, text, or video.
The concept of collaborative filtering has been likened to the notion of automating the "word of mouth" process that works so well among friends and colleagues. Usually people know which of their friends or associates have similar tastes to their own. For example, when choosing a movie, people will most often ask the opinion of others who have likes and dislikes similar to their own. A recommendation from someone who is known to have similar tastes to our own will carry far more weight than one from another source.
In collaborative filtering techniques, user groupings are dynamic and may change as rapidly as users' needs or interests change. Collaborative filtering techniques take advantage of the fact that there are thousands of users, both past and present, each having accessed a broad range of different items, and each having opinions about the information they obtain. A centralized server is used in collaborative filtering systems to act as the matchmaker needed to group people who have similar needs or interests. In order to benefit from their collective opinions, these users need not have ever heard of, met, or seen each other, and they may even be located at opposite ends of the world. All that matters is that they have given similar ratings to many of the same sources of information. These ratings alone can then be applied to suggest to a user new sources of information that he or she has not yet seen.
U.S. Pat. No. 4,996,642, entitled "System and Method for Recommending Items," and its related patent, U.S. Pat. No. 4,870,579, describe a recommendation system that uses collaborative filtering techniques. This system relies on explicit user ratings of items in order to perform clustering of users according to their common likes and dislikes. Furthermore, the output of this system is not intended to help people identify others like themselves, but to provide specific recommendations about items they may wish to use, rent, or purchase.
The present invention differs from these patents and methods in that it does not rely on explicit user input such as ratings. Because the system is intended to primarily match users with common interests rather than to provide recommendations to those users, the system does not require input about user opinions regarding information accessed. Instead, it is able to make use of data about users' patterns of information access and their modes of use of the information once it is accessed.
U.S. Pat. No. 5,870,744, entitled "Virtual People Networking," describes a system which allows multiple people working for the same organization with similar interests to automatically interface with each other when any one of the people accesses any given one of multiple electronic sites provided through an intranet of the organization. The system described tracks a user's access pattern and provides the access pattern to other users upon request. The system also allows users to explicitly rate particular sites and to provide messages regarding a particular site to subsequent users who view their access patterns.
The present invention differs from this patent in that it does not simply provide user access patterns to other users. Rather, it correlates user access data and implicitly determines content similarity of sites through an analysis of access patterns. Furthermore, it provides an implicit interest rating system based on the number of times an individual user accesses a particular site. The rating system also takes into account the passage of time through the use of a decay factor, which degrades the determined user interest in a particular site over time.
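The implicit rating described above can be illustrated with a minimal sketch. The text states only that interest grows with the number of accesses and is degraded over time by a decay factor; the exponential form, the `half_life` parameter, and the function name `interest_score` are assumptions made for illustration, not details taken from the invention.

```python
import math

def interest_score(access_times, now, half_life=30.0):
    """Implicit interest in one site for one user.

    Each access contributes one unit of interest, which is then decayed
    exponentially according to its age (an assumed decay model).
    Times and half_life share the same unit, e.g. days.
    """
    decay_rate = math.log(2) / half_life
    return sum(
        math.exp(-decay_rate * (now - t))
        for t in access_times
        if t <= now  # ignore accesses recorded in the future
    )

# Three recent visits outweigh three visits made long ago.
recent = interest_score([98.0, 99.0, 100.0], now=100.0)
stale = interest_score([0.0, 1.0, 2.0], now=100.0)
print(recent > stale)
```

Under these assumptions, a single access loses half of its weight after one half-life, so repeated and recent accesses dominate the score.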
Further references:
Goldberg, David et al., "Using Collaborative Filtering to Weave an Information Tapestry," Communications of the ACM, December 1992, Vol. 35, No. 12, pp. 61-70.
Maes, P. (1994) Social interface agents: acquiring competence by learning from users and other agents. In Working Notes of the AAAI Spring Symposium on Software Agents, Stanford, Calif. p. 71-78.
Shardanand, U., and Maes, P., (1995) Social Information Filtering: Algorithms for Automating "Word of Mouth," appearing in CHI-95 Conference, Denver, Colo. May 1995.
(3) Summary of the Invention
In accordance with the present invention, a collaborator discovery method and system are presented. The method provides for collaborator discovery among a plurality of users, and generally includes the steps of: (a) providing a user history including a plurality of entries, with each entry including a user identity associated with each particular user and a reference to a particular item accessed by that user; (b) associating particular items in the user history by providing a measure of similarity between the particular items; (c) uniquely associating at least one scent score to each particular item accessed by a particular user (scent scores will be discussed in detail further below); (d) diffusing the at least one scent score associated with a particular item accessed by a particular user to another item by generating at least one diffusion scent score from the combination of the measure of similarity between the particular item and the other item and the at least one scent score, and incorporating the at least one diffusion scent score into the at least one scent score of the other item; (e) repeating step (d) for all items which have at least one scent score; and (f) determining scent match scores by correlating the scent scores from all of the particular items to find users with common interests. The user history may be generated by monitoring and recording the real-time accesses of the plurality of users, and steps (b) through (f) may be repeated a plurality of times to provide a continual update of the scent scores. The measure of similarity may be generated in a number of ways and based on a number of factors such as the temporal proximity of accesses between particular items. The scent scores may be increased over time in proportion to the number of times a particular item is accessed in order to provide a measure of a user's interest in the item.
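Steps (a) through (f) can be sketched roughly as follows. This is an illustrative reading, not the claimed implementation: the product `rate * similarity * score` used for diffusion, the dot-product style correlation used for the match score, and all function and parameter names are assumptions. The similarity measure of step (b) is taken as a precomputed input.

```python
from collections import defaultdict

def assign_scent(history, increment=1.0):
    """Step (c): scent[item][user] grows with each access by that user.

    history is a list of (user, item) entries, as in step (a).
    """
    scent = defaultdict(lambda: defaultdict(float))
    for user, item in history:
        scent[item][user] += increment
    return scent

def diffuse(scent, similarity, rate=0.5):
    """Steps (d)-(e): spread every item's scent to the items similar to it.

    similarity maps an (item_a, item_b) pair to a symmetric measure.
    Scores are read from the original scent so the result does not
    depend on the order in which pairs are visited.
    """
    result = {item: dict(users) for item, users in scent.items()}
    for (a, b), sim in similarity.items():
        for src, dst in ((a, b), (b, a)):
            for user, score in scent.get(src, {}).items():
                bucket = result.setdefault(dst, {})
                bucket[user] = bucket.get(user, 0.0) + rate * sim * score
    return result

def match_scores(scent):
    """Step (f): correlate users by summing, over all items, the product
    of their scent scores on that item (an assumed correlation measure)."""
    matches = defaultdict(float)
    for users in scent.values():
        names = sorted(users)
        for i, u in enumerate(names):
            for v in names[i + 1:]:
                matches[(u, v)] += users[u] * users[v]
    return dict(matches)

# alice and bob never touch the same item, but items A and B are similar.
history = [("alice", "A"), ("bob", "B")]
scent = assign_scent(history)
print(match_scores(scent))                              # no shared items yet
print(match_scores(diffuse(scent, {("A", "B"): 0.8})))  # a match appears
```

In this sketch the two users are matched only after diffusion, which is the point of step (d): interest in one item leaks onto similar items, so users can correlate without having accessed identical items.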
Particular items, such as large, general interest Internet search engines or other items which are likely to be accessed frequently, but that are likely to yield little useful information regarding user interests, may be filtered out of the user history. After the user scent scores have been correlated, this information may be provided to the users in order to assist them in finding others with similar interests. To account for the difference between short-term and long-term user interests, different scent scores may be utilized with different rates of increase in order to help differentiate between users sharing only a passing, short-term, interest and those with similar long-term interests. The scent scores may also be decayed in order to account for changes in user interests over time. A messaging system such as a chat facility or an e-mail system (as well as e-mail blocking) may be provided to enable users to communicate with each other, and privacy enhancements may be added to provide for user anonymity.
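The filtering and the short-term versus long-term scent scores described above might look like the following sketch. The blocklist approach, the class name `DualScent`, and every gain and retention value are hypothetical. The text specifies different rates of increase; pairing them with different decay rates is an additional assumption made here so that the two scores diverge over time.

```python
def filter_blocked(history, blocked_items):
    """Remove entries for items that say little about user interests,
    such as general-purpose search engines (supplied here as a set)."""
    return [(user, item) for user, item in history if item not in blocked_items]

class DualScent:
    """Two scent scores for one (user, item) pair.

    The short-term score rises quickly and fades quickly; the long-term
    score rises slowly and fades slowly. All rates are illustrative.
    """

    def __init__(self, short_gain=1.0, long_gain=0.1,
                 short_keep=0.5, long_keep=0.95):
        self.short_gain = short_gain
        self.long_gain = long_gain
        self.short_keep = short_keep  # fraction retained per time step
        self.long_keep = long_keep
        self.short = 0.0
        self.long = 0.0

    def access(self):
        """Record one access to the item."""
        self.short += self.short_gain
        self.long += self.long_gain

    def tick(self):
        """Advance one time step, decaying both scores."""
        self.short *= self.short_keep
        self.long *= self.long_keep

history = [("alice", "search.example"), ("alice", "paper-1")]
print(filter_blocked(history, {"search.example"}))

s = DualScent()
s.access()
print(s.short > s.long)   # a fresh access shows up mainly as short-term scent
for _ in range(10):
    s.tick()
print(s.long > s.short)   # after idle time, only long-term scent remains notable
```

Comparing the two scores separately lets a matcher distinguish users who share a passing interest from those whose interest has persisted.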
The system of the present invention includes an activity monitor which provides a user history, with a plurality of entries, each including a user identity associated with a particular user and a reference to a particular item accessed by that user. The activity monitor may be centralized or it may be distributed among the users' systems, or it may be a hybrid of the two. An entry processor is connected to the activity monitor to receive the plurality of entries of the user history from the activity monitor, and is operative to associate pairs of particular items in the user history to provide a measure of similarity for each pair, and to uniquely associate at least one scent score for each particular item accessed by a particular user. A match database is connected to the entry processor to receive and store the measure of similarity and the scent scores. A matcher is connected to the match database to receive the measure of similarity and the scent scores, and to diffuse the scent scores to other items in the user history in proportion to the measure of similarity and to correlate the scent scores of all of the particular items in the user history to determine users with common interests. The user history may be generated by monitoring and recording the real-time accesses of the users. The system may thus provide a continual update of the scent scores. The measure of similarity may be generated in a number of ways and based on a number of factors such as the temporal proximity of accesses between particular items. The scent scores may be increased over time in proportion to the number of times a particular item is accessed in order to provide a measure of a user's interest in the item. A filter may be provided to eliminate from the user history particular items, such as large, general interest Internet search engines or other items that are likely to be accessed frequently, but that are likely to yield little useful information regarding user interests.
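As one illustration of how an entry processor might derive a measure of similarity from the temporal proximity of accesses, consider the sketch below. The linear weighting, the `window` parameter, the timestamped entry format, and the function name `temporal_similarity` are all assumptions; the text leaves the exact measure open.

```python
from collections import defaultdict

def temporal_similarity(entries, window=300.0):
    """Score item pairs by how closely in time the same user accessed them.

    entries is a list of (user, item, timestamp) tuples. Two accesses by
    one user within `window` seconds add 1 - gap/window to the pair's
    similarity, so back-to-back accesses count the most.
    """
    by_user = defaultdict(list)
    for user, item, ts in entries:
        by_user[user].append((ts, item))

    similarity = defaultdict(float)
    for accesses in by_user.values():
        accesses.sort()  # chronological order per user
        for i, (t1, a) in enumerate(accesses):
            for t2, b in accesses[i + 1:]:
                gap = t2 - t1
                if gap > window:
                    break  # later accesses are even further away
                if a != b:
                    similarity[tuple(sorted((a, b)))] += 1.0 - gap / window
    return dict(similarity)

entries = [
    ("alice", "A", 0.0),
    ("alice", "B", 150.0),   # 150 s after A: inside the window
    ("alice", "C", 1000.0),  # far from both A and B: no association
]
print(temporal_similarity(entries))
```

The resulting pair scores could then be stored in the match database and used by the matcher when diffusing scent scores between items.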
After the user scent scores have been correlated, this information may be provided to the users in order to assist them in finding others with similar interests. To account for the difference between short-term and long-term user interests, different scent scores may be utilized with different rates of increase in order to help differentiate between users sharing only a passing, short-term, interest and those with similar long-term interests. A decay engine may be provided to decrease the scent scores in order to account for changes in user interests over time. A means for messaging such as a chat facility or an e-mail system may be provided to enable users to communicate with each other, and a means to provide user anonymity may be provided to allow for user privacy.