The present invention relates generally to computer-implemented methods for generating user interest profiles for use in information content push systems and other information systems. In particular, the present invention relates to generating user interest profiles by monitoring and analyzing a user's access to a variety of hierarchical levels within structured documents.
Personalized information delivery becomes increasingly important as the Internet continues to grow at an exponential rate. Users are already overwhelmed with the amount of information available on the web and need support in screening out irrelevant documents. To address this need, there presently exist various webcasting or "push" services that deliver customized information to the user based on a personal profile. One of the main drawbacks in personalized information delivery or webcasting systems today is that the user has to extensively interact with the delivery engine in order to customize his or her interest profile. Some customization schemes use Boolean search expressions while others employ relevance feedback from the user to derive the interest profile. This manual customization process is tedious and time consuming. Moreover, novice users have to learn how to interact with the delivery engine to get the desired results. The users also need to manually adjust their profiles every time they change their interests. There is a need, therefore, for a system and method that overcomes the above problems with current webcasting techniques.
The present invention provides a profiling technique that generates user interest profiles by monitoring and analyzing a user's access to a variety of hierarchical levels within a set of structured documents, e.g., documents available at a web site. Each information document has parts associated with it and the documents are classified into categories using a known taxonomy. In other words, each document is hierarchically structured into parts, and the set of documents is classified as well. The user interest profiles are automatically generated based on the type of content viewed by the user. The type of content is determined by the text within the parts of the documents viewed and the classifications of the documents viewed. In addition, the profiles are also generated based on other factors, including the frequency and currency of visits to documents having a given classification, and/or the hierarchical depth of the levels or parts of the documents viewed. Each user profile includes an interest category code and an interest score that indicates the level of interest in a particular category. Unlike static registration information, the profiles in this invention are constantly changing to more accurately reflect the current interests of an individual.
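As a rough illustration of the profile described above, the following Python sketch keeps one interest score per category code and raises it on each viewing event, giving more weight to deeper parts and decaying older scores over time. The names (`InterestProfile`, `record_view`), the exponential decay, the half-life value, and the depth weighting are all assumptions made for illustration. The text above states only that frequency, currency, and hierarchical depth are factors, not how they are combined.

```python
from dataclasses import dataclass, field


@dataclass
class InterestProfile:
    """Hypothetical profile: category code -> interest score."""
    half_life_days: float = 30.0  # assumed decay rate, not from the source
    scores: dict = field(default_factory=dict)     # category -> stored score
    last_seen: dict = field(default_factory=dict)  # category -> day of last view

    def _decayed(self, category: str, now_days: float) -> float:
        # Currency: halve the stored score for every half-life elapsed.
        score = self.scores.get(category, 0.0)
        if score == 0.0:
            return 0.0
        age = max(0.0, now_days - self.last_seen[category])
        return score * 0.5 ** (age / self.half_life_days)

    def record_view(self, category: str, depth: int, now_days: float) -> None:
        # Frequency: every view adds to the score.
        # Depth: a view of a deeper part (depth 1 = top level) adds more.
        self.scores[category] = self._decayed(category, now_days) + float(depth)
        self.last_seen[category] = now_days

    def score(self, category: str, now_days: float) -> float:
        # Current interest score for a category, after decay.
        return self._decayed(category, now_days)

    def top(self, n: int, now_days: float) -> list:
        # Category codes ranked by current (decayed) score, highest first.
        ranked = sorted(self.scores, key=lambda c: self.score(c, now_days),
                        reverse=True)
        return ranked[:n]
```

Under these assumptions, a category visited often and recently at deep levels ranks above one visited once, long ago, at the top level, which is the qualitative behavior the summary describes.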
The key benefit of this invention is that it automatically generates user profiles based on the type of content viewed, as determined from the classifications and categorizations of that content. A sampling scheme presents carefully chosen documents together with the documents that match the current profile, so that new categories can be added to the profile and old ones deleted from it.
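One way to picture such a sampling scheme is sketched below. Most delivered documents come from categories already in the profile, while a few slots are reserved for documents from other categories, so that the user's response to them can introduce new categories. The function name `select_documents`, the `explore_fraction` parameter, and the uniform random choice of exploratory documents are assumptions; the text does not specify how the sampled documents are chosen.

```python
import random


def select_documents(candidates, profile_categories, k,
                     explore_fraction=0.2, rng=None):
    """Pick k documents: mostly profile matches, plus a few exploratory samples.

    candidates: list of (doc_id, category_code) pairs.
    profile_categories: set of category codes currently in the profile.
    explore_fraction: assumed share of slots given to non-matching documents.
    """
    rng = rng or random.Random()
    matching = [d for d in candidates if d[1] in profile_categories]
    others = [d for d in candidates if d[1] not in profile_categories]

    # Reserve some slots for documents outside the current profile.
    n_explore = min(len(others), round(k * explore_fraction))
    n_match = min(len(matching), k - n_explore)

    # Matches are taken in their given order; exploratory ones are random.
    return matching[:n_match] + rng.sample(others, n_explore)
```

Views of the exploratory documents would feed back into the profile like any other view, which is how a new category could enter it. A category whose documents are no longer viewed could then be dropped by whatever aging rule the profile applies.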