The present invention relates generally to using communication and information networks. More particularly, the present invention provides methods for creating personalized profiles for users on a client computer based on the user's activities.
Usage of communication networks, such as the Internet, has increased exponentially in recent years. Users of the Internet perform a broad variety of activities ranging from activities for accessing information such as news, weather information, sports related information, stocks information, etc., to performing electronic commerce (e-commerce) related activities such as buying or selling goods/services, and other similar activities.
Computer systems connected to the Internet are classified as "clients" or "servers" depending on the role the computer systems play with respect to requesting information or providing information. Client computers typically request information from a server computer which provides the information. Server systems are typically responsible for receiving information requests from client systems, performing processing required to satisfy the requests, and forwarding the results corresponding to the information requests back to the requesting client systems. The processing required to satisfy the client request may be performed by a single server or may alternatively be delegated to other servers connected to the communication network, such as the Internet.
The World Wide Web (the "Web") enables users of the Internet to conveniently access resources offered by the Internet. In the Web environment, information resources are typically stored in the form of hypertext documents called "web pages" which can be accessed and read by users of the Web. A web page may incorporate any combination of text, graphics, audio and video content, software programs, and other data. Web pages may also contain hypertext links to other web pages. Web pages are typically stored on web servers coupled to the Internet. Each web page is uniquely identified by an address called a Uniform Resource Locator (URL) that enables users to access the web page.
Users typically access web pages using a program called a "web browser" which executes on a client computer coupled to the Internet. The web browser is a type of client application that enables users to select, retrieve, and perceive resources on the Web. In particular, web browsers allow users to access and view web pages on a computer monitor. Examples of browsers include the Microsoft® Internet Explorer browser program provided by Microsoft® Corporation, the Netscape® Navigator browser provided by Netscape® Corporation, and others. Users access web pages by providing URL information to the browser, either directly or indirectly, and the browser responds by retrieving the corresponding requested web page from the Internet. The retrieved web page may then be displayed on the client computer.
Due to the rapid increase in the number of web pages accessible via the Internet, it is becoming increasingly difficult for users to locate web pages which are relevant or of interest to them. In order to find relevant web pages, a user is often forced to sift through large volumes of information and web pages, most of which are irrelevant to the user. Consequently, accessing web pages can often be a time-consuming activity.
Several techniques have been developed to reduce the time that a user has to spend in accessing web pages or information of interest to the user. According to one technique, web pages are classified into subject categories which are displayed to the user as hypertext or URL links. Upon selection of a particular subject category, a list of web page links classified under the subject category is displayed to the user. Such a technique is used by Yahoo™, which organizes information available over the web into categories such as "News and Media," "Recreation and Sports," "Entertainment," etc. While this technique provides some organization of information/web pages available via the Internet, the subject categories are usually not sufficient to locate information of interest to the user. Since each subject category typically includes a large number of web pages, another search within the subject category is typically necessary to locate web pages of interest to the user. Additionally, the subject categories are static and thus cannot be customized for a particular user's specific needs.
Other techniques allow users to build personal web pages and to customize the contents of the web pages. Such a technique is used by Yahoo™ for their My Yahoo™ service. While this technique is an improvement over the "subject category" techniques described above, it has a drawback in that it presumes that the user has prior knowledge of web pages which are of interest to the user. Web pages which would have been of interest to the user, had the user known of them, cannot be brought to the user's attention by this technique. Further, information regarding the contents of a personalized user web page is usually stored on a web server remote from the user's client computer. This raises several security concerns for the user since the user has very little control over the collection and dissemination of the personalized information.
More sophisticated techniques facilitate a user's web activities by collecting information about the user, either explicitly or implicitly. These techniques are typically associated with a particular website, and monitor and record a user's interactions with web pages hosted by the website. Explicit information collection techniques typically solicit information from the user via web-based forms, questionnaires, surveys, opinion polls, and the like. Conventional implicit information collection techniques typically collect information using "cookies" or other inferential tracking programs. These implicit techniques are able to collect user related information without any effort or attention from the user.
In the context of the Internet and the WWW, a "cookie" generally refers to a block of data that a web server stores on a client computer. The cookie is a block of data which is configured by the server (typically a web server) to monitor and record information related to a user's activities associated with one or more web pages hosted by the web server. The user related information typically includes information about selections, purchases, etc. made by the user at web pages hosted by the web server. The information stored by a cookie is generally accessed and used by the server when the particular server or web page is accessed again by the client computer. Cookies may be used by web servers to identify users, to instruct the server to send a customized version of the requested web page to the client computer, to submit account information for the user, and the like. Explicit and implicit user information collection techniques are used by a large number of web-based providers of goods and services including Amazon™, DoubleClick™, and the like. In some instances, the user information gathered by the servers is used to create customized profiles for the users which are stored on the web servers. The customized profiles generally summarize the user's activities at one or more web pages associated with the servers.
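The cookie round trip described above can be illustrated with a minimal sketch using Python's standard `http.cookies` module. The cookie name `visitor_id` and its value are hypothetical and chosen only for illustration; the sketch shows the general mechanism, not any particular provider's practice.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header for the first response.
# The name "visitor_id" and the value "abc123" are hypothetical.
server_cookie = SimpleCookie()
server_cookie["visitor_id"] = "abc123"
server_cookie["visitor_id"]["path"] = "/"
set_cookie_header = server_cookie.output(header="Set-Cookie:")

# Client side: on a later request to the same server, the browser
# sends the stored name/value pair back in a Cookie header.
cookie_header = "visitor_id=abc123"

# Server side again: parse the returned header to recognize the visitor.
returned = SimpleCookie()
returned.load(cookie_header)
visitor_id = returned["visitor_id"].value
```

Note that the server, not the user, decides what is stored and when it is read back, which is the control problem the following paragraphs address.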
A major drawback of conventional user related information gathering techniques is that the user has very little control over the information gathering process. This is because the information is usually gathered without the user's permission by processes resident on web servers which are typically remote from the client computer used by the user. The user typically has no control over either the contents of the collected information or when the information is collected. This lack of control raises several security concerns for the user.
Thus, there is a need for a method which facilitates collection of user related information while minimizing the problems associated with conventional techniques. It is further desired that the user have complete control over the collection and dissemination of the information.
The present invention provides methods for creating personalized user profiles on a client computer based on user activities associated with the client computer, and other user specific information accessible to the client computer. According to an embodiment, the present invention monitors user activities at a client computer, the user activities including user interactions with a browser program executing on the client computer. User information, including content and context information, is then collected based on the monitored user activities. The client computer then processes the user information to generate a personalized profile for the user which is stored on the client computer. User profiles are thus created locally on a client computer without any remote server intervention.
According to an embodiment, the present invention provides methods which are executed by the client computer and which determine a plurality of concepts from the collected user information. For each concept, the client computer determines the user's level of interest in the concept. A value quantifying the user's level of interest is then associated with each concept. Personalized user profiles are then generated based on the concept information and values associated with the concepts.
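By way of illustration only, the association of interest values with concepts might be sketched as follows. The text above does not specify how concepts are determined or how interest is quantified, so the keyword table `CONCEPT_KEYWORDS`, the function `build_profile`, and the weighting by viewing time are all assumptions made for this sketch.

```python
from collections import defaultdict

# Hypothetical keyword-to-concept table; how concepts are actually
# determined is not specified, so this mapping is an assumption.
CONCEPT_KEYWORDS = {
    "sports": {"football", "score", "league"},
    "finance": {"stock", "market", "portfolio"},
}

def build_profile(pages):
    """Map (page_text, seconds_viewed) pairs to {concept: interest value}.

    Assumed scoring: keyword hits weighted by viewing time, then
    normalized so that all interest values sum to 1.
    """
    raw = defaultdict(float)
    for text, seconds in pages:
        words = set(text.lower().split())
        for concept, keywords in CONCEPT_KEYWORDS.items():
            hits = len(words & keywords)
            if hits:
                raw[concept] += hits * seconds
    total = sum(raw.values())
    if total == 0:
        return {}
    return {concept: score / total for concept, score in raw.items()}

# Example: two pages viewed locally on the client computer.
profile = build_profile([
    ("Football league score update", 60.0),
    ("Stock market closes higher", 20.0),
])
```

In this example the "sports" concept receives the larger value because its page matched more keywords and was viewed longer. Any other scoring rule could be substituted without changing the overall shape of the profile, which is simply a set of concepts each paired with a value.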
According to an embodiment of the present invention, the user interactions with the browser program which are monitored by the present invention include web browsing activities, search activities using the browser program, electronic commerce transaction activities, electronic mail related activities, financial activities performed by the user using the browser program, interactive activities performed by the user using the browser program, and the like.
According to an embodiment of the present invention, content information collected by the present invention may include contents of web pages accessed by the user using the browser, URL information for the web pages accessed by the user, title information of the web pages accessed by the user, information on searches performed by the user using the browser program, information on transactions performed by the user using the browser program, information input by the user to the browser program, information related to links on web pages accessed by the user, and other like information.
According to an embodiment of the present invention, context information collected by the client computer includes information related to the date and time when the user performed the user interactions with the browser program or when the user accessed web pages via the browser program, information related to the amount of time spent by the user viewing the web pages accessed via the browser program, information on servers hosting the web pages accessed by the user, information regarding the order in which the user accessed the web pages, and the like.
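The context information listed above could, for example, be held in a record such as the one sketched below. The record type `VisitContext`, the function `collect_context`, and the input format of (URL, opened, closed) tuples are hypothetical and are used here only to make the four kinds of context information concrete.

```python
from dataclasses import dataclass
from datetime import datetime
from urllib.parse import urlparse

@dataclass
class VisitContext:
    url: str
    host: str               # server hosting the page
    accessed_at: datetime   # date and time of access
    seconds_viewed: float   # time spent viewing the page
    order: int              # position in the browsing sequence

def collect_context(visits):
    """Build context records from (url, opened, closed) tuples.

    Records are returned in the order in which pages were opened.
    """
    records = []
    ordered = sorted(visits, key=lambda visit: visit[1])
    for order, (url, opened, closed) in enumerate(ordered, start=1):
        records.append(VisitContext(
            url=url,
            host=urlparse(url).hostname or "",
            accessed_at=opened,
            seconds_viewed=(closed - opened).total_seconds(),
            order=order,
        ))
    return records

# Example with two visits supplied out of order.
records = collect_context([
    ("https://example.com/news",
     datetime(2000, 1, 1, 9, 5), datetime(2000, 1, 1, 9, 7)),
    ("https://example.org/scores",
     datetime(2000, 1, 1, 9, 0), datetime(2000, 1, 1, 9, 5)),
])
```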
According to an embodiment of the present invention, only user-permitted activities are monitored, and only user-permitted information is collected. The user interactions monitored by the client computer may include the user's interactions with other external devices which are capable of exchanging information with the client computer. The present invention also monitors the user's interactions with various applications executing on the client computer.
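One possible way to restrict monitoring and collection to what the user has permitted is sketched below. The `Permissions` type, the `collect` function, and the event dictionaries are assumptions made for illustration; the text above does not prescribe how permissions are represented or enforced.

```python
from dataclasses import dataclass, field

@dataclass
class Permissions:
    # Both sets are chosen by the user; the names are hypothetical.
    allowed_activities: set = field(default_factory=set)
    allowed_fields: set = field(default_factory=set)

def collect(event, permissions):
    """Return the permitted part of an event, or None if not permitted.

    An event whose activity type the user has not permitted is dropped
    entirely; otherwise only user-permitted fields are kept.
    """
    if event.get("activity") not in permissions.allowed_activities:
        return None
    return {
        key: value
        for key, value in event.items()
        if key == "activity" or key in permissions.allowed_fields
    }

# Example: the user permits browsing to be monitored, but not e-mail,
# and permits only the URL and title of each page to be collected.
perms = Permissions(
    allowed_activities={"browsing"},
    allowed_fields={"url", "title"},
)
browsing_event = {
    "activity": "browsing",
    "url": "https://example.com/",
    "title": "Example",
    "form_input": "secret",
}
email_event = {"activity": "email", "subject": "Hello"}

kept = collect(browsing_event, perms)
dropped = collect(email_event, perms)
```

Here the e-mail event is discarded outright and the form input is stripped from the browsing event, so nothing outside the user's stated permissions reaches the profile.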