The present invention relates to browse trees and other types of hierarchical browse structures used to help users locate online content. More specifically, the invention relates to methods for automatically identifying and calling to the attention of users the nodes (categories and/or items) of a browse tree that are the most popular, or are otherwise predicted to be interesting to users.
With the increasing popularity of the Internet and the World Wide Web, it has become common for merchants to set up Web sites for marketing and selling products and services. One example of such a Web site is the online site of AMAZON.COM, the assignee of the present invention. Via this site, consumers can access and place orders from an online catalog that includes millions of book titles, compact discs, gifts, items for auction, etc.
Many online merchants and other businesses group their products, services or other items into a set of categories and subcategories of a browse tree. For example, the Yahoo Web site (www.yahoo.com) includes a browse tree which acts as a general Web directory, the Ebay Web site (ebay.com) includes a browse tree for locating auction-related content (auction events, etc.), and the Amazon.com Web site includes a subject-based browse tree for locating book titles.
One problem commonly encountered by online merchants is the inability to effectively present their goods and services to consumers via their browse trees. Due to the large number of items and item categories, many “popular” categories and items (those that have experienced significant user activity) remain hidden from the user. For example, when a user begins navigation of a typical browse tree for locating books, the user initially sees a list of categories that broadly describe different book subjects. At this point, the user normally would not see more specific categories such as “Olympics,” even though “Olympics” may be the most popular category at that time. The “Olympics” category may be nested within the browse tree under Books/Sports and Outdoors/Events/Olympics, requiring the user to navigate downward through multiple levels of the tree to find the category. Similarly, the user would not see the most popular books (e.g., the current bestsellers) because they too would be nested within the browse tree (typically at the lowest level). Further, once the user locates the popular categories and book titles, the user typically has no reason to believe that they are currently the most popular. The ability for users to identify the most popular items and categories helps the users locate items that have gained acceptance within a community or within the population at large.
The present invention addresses these and other problems by providing a computer-implemented system and method for automatically identifying the most “popular” nodes (categories and/or items) within a browse tree or other hierarchical browse structure, and for calling such nodes to the attention of users during navigation of the browse structure. The system and method are particularly useful for assisting users in locating popular products (e.g., books) and/or product categories within a catalog of an online merchant, but may be used in connection with browse structures used to locate other types of items, such as online auctions, chat rooms, and Web sites.
The node popularity levels are preferably determined periodically based on user activity data that reflects users’ affinities for particular nodes. The criteria used to measure such popularity levels depend upon the nature and purpose of the browse tree. For example, in the context of a tree used to locate items sold by a merchant, the popularity of each item may be based on one or more of the following criteria, among others: the number of times the item was purchased, the number of times the item was viewed (within and/or outside the browse tree), the number of times the item was rated or reviewed, and the average rating of the item. The popularity of each category of the same tree may be based on one or more of the following criteria, among others: the average popularity of the items contained within the category, the number of purchases made within the category relative to the number of items in the category, the number of times the category was selected (“clicked through”) or searched, and the number of times the category was selected as a destination node of the tree. The specific criteria used within a given system are largely a matter of design choice, and may be varied in order to achieve a particular objective.
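By way of illustration only, the item and category criteria listed above might be combined as in the following minimal Python sketch. The weighted-sum form, the weight values, and the names ItemActivity, WEIGHTS, item_popularity, and category_popularity are assumptions made for this example and are not specified by the invention.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ItemActivity:
    """Activity counts for one item over some collection period."""
    purchases: int
    views: int
    reviews: int
    average_rating: float  # assumed scale, e.g. 0.0 to 5.0


# Hypothetical weights; the actual criteria and weights are a design choice.
WEIGHTS: Dict[str, float] = {
    "purchases": 5.0,
    "views": 1.0,
    "reviews": 2.0,
    "average_rating": 10.0,
}


def item_popularity(activity: ItemActivity,
                    weights: Dict[str, float] = WEIGHTS) -> float:
    """Combine the item criteria into one score as a weighted sum."""
    return (weights["purchases"] * activity.purchases
            + weights["views"] * activity.views
            + weights["reviews"] * activity.reviews
            + weights["average_rating"] * activity.average_rating)


def category_popularity(item_scores: List[float]) -> float:
    """Score a category as the average popularity of its items."""
    if not item_scores:
        return 0.0
    return sum(item_scores) / len(item_scores)
```

Other category criteria named above, such as click-through counts, could be added as further terms in the same manner.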
The popular nodes are preferably called to the attention of users by automatically “elevating” the nodes along child-parent paths for display within the browse structure. For example, when the user selects a particular non-leaf category (a category that contains subcategories) for viewing, the most popular items corresponding to the selected category may be displayed together with (e.g., on the same Web page as) the immediate subcategories, allowing the user to view or directly access these items without navigating to lower levels of the browse tree. Subcategories may be elevated for display in a similar manner.
In a preferred embodiment, the various popularity criteria are incorporated into a scoring algorithm which is used to generate a popularity score for each node that is a candidate for elevation. These scores are then used to elevate the nodes within the tree. The nodes are preferably selected for elevation recursively, on a node-by-node basis, by selecting the most popular nodes (e.g., the 3 nodes with highest scores) from the level below. The most popular nodes are therefore propagated to the highest levels of the tree.
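The recursive selection described above can be sketched as follows. This is an illustrative example only, assuming that each leaf node already carries a popularity score and that only leaf nodes are candidates for elevation; the Node class and the elevate function are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    score: float = 0.0                       # popularity score (used for leaves)
    children: List["Node"] = field(default_factory=list)
    elevated: List["Node"] = field(default_factory=list)


def elevate(node: Node, top_n: int = 3) -> List[Node]:
    """Return the nodes that this node offers to its parent for elevation.

    A leaf offers itself. A non-leaf node gathers the candidates offered by
    each of its children, keeps the top_n by score as its own elevated
    nodes, and offers those same nodes upward.
    """
    if not node.children:
        return [node]
    candidates: List[Node] = []
    for child in node.children:
        candidates.extend(elevate(child, top_n))
    node.elevated = sorted(candidates, key=lambda n: n.score, reverse=True)[:top_n]
    return node.elevated
```

Because each level keeps only the highest-scoring candidates from the level below, a sufficiently popular leaf appears in the elevated list of every ancestor up to the root.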
Preferably, the node popularity levels are determined periodically (e.g., once per hour) based on user activity data collected over a predetermined period or window of time (e.g., the last week or month). As a result, the nodes that are elevated for display change over time to reflect the current interests of users. In one embodiment, nodes are selected for elevation based solely on collective activity data, without regard to user identity. In another embodiment, information known about the individual user is incorporated into the selection process to select nodes that reflect the predicted or known interests of the particular user.
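As a simple illustration of the time window, the following sketch counts activity events per node and ignores any event that falls outside the window. The event representation (a node identifier paired with a timestamp) and the function name windowed_counts are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def windowed_counts(events: Iterable[Tuple[str, datetime]],
                    now: datetime,
                    window: timedelta) -> Counter:
    """Count events per node, keeping only events within [now - window, now]."""
    cutoff = now - window
    return Counter(node for node, ts in events if cutoff <= ts <= now)
```

Re-running such a count on a schedule (e.g., once per hour) with an updated value of now would cause older activity to drop out of the scores over time.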
The system may also use community affiliations as a basis for selecting nodes to be elevated. For example, the nodes may be scored and elevated based in whole or in part on activity data collected for the particular community or communities of which the user is a member. The communities may include “explicit-membership” communities (communities that users explicitly join) and “implicit-membership” communities (communities for which membership is based on information known about the user, such as the user’s email domain, Internet service provider, purchase history, or shipping address).
In one embodiment, for example, the popularity score for each node is calculated as the sum of three components: a personal score which is based on the actions of the particular user, a community score which is based on the actions of the members of the user’s community or communities, and a collective score which is based on the actions of all customers of the system. The time windows that are applied to the activity data for purposes of generating these component scores may differ; for example, it may be desirable to use a longer window for generating the personal scores (to increase the likelihood of capturing relevant personal activity data), and use a shorter window for generating the collective scores.
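The three-component score can be illustrated with the sketch below. It assumes, purely for the example, that each component is an unweighted count of activity events on the node, and it uses hypothetical window lengths of 90, 30, and 7 days; the names Event, count_in_window, and popularity_score are likewise introduced here and are not part of the described system.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Set, Tuple

Event = Tuple[str, str, datetime]  # (user_id, node_id, timestamp)


def count_in_window(events: Iterable[Event], node: str,
                    users: Optional[Set[str]],
                    now: datetime, window: timedelta) -> int:
    """Count events on `node` within the window; users=None means all users."""
    cutoff = now - window
    return sum(
        1 for user, n, ts in events
        if n == node and cutoff <= ts <= now
        and (users is None or user in users)
    )


def popularity_score(events: Iterable[Event], node: str, user: str,
                     community: Set[str], now: datetime,
                     personal_window: timedelta = timedelta(days=90),
                     community_window: timedelta = timedelta(days=30),
                     collective_window: timedelta = timedelta(days=7)) -> int:
    """Sum of personal, community, and collective components for one node."""
    events = list(events)  # allow the events to be scanned three times
    personal = count_in_window(events, node, {user}, now, personal_window)
    community_part = count_in_window(events, node, community, now, community_window)
    collective = count_in_window(events, node, None, now, collective_window)
    return personal + community_part + collective
```

A longer personal window is used here so that sparse individual activity is still captured, while the shorter collective window keeps that component responsive to current trends.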
In an embodiment for use by an online bookseller, the system and method are used to “feature” the most popular book titles and leaf categories on Web pages corresponding to higher-level categories. The most popular books and categories are preferably determined periodically based on purchase counts, category click-through rates, and/or other types of user activity data. The nodes to be featured are preferably selected recursively, on a node-by-node basis, by selecting the most popular nodes from the immediate children of the current node. Books and low-level categories that are currently very popular thus tend to be featured at many different levels of the tree, increasing the probability of exposure in proportion to level of popularity. Preferably, the nodes are selected for elevation based on a combination of user-specific and collective user activity data, so that the featured books and categories reflect both the interests of the particular user and the interests of others.
In an online auctions embodiment in which the nodes represent auction events, the node popularity levels may be based, for example, on the number or frequency of bids. In this embodiment, auctions that experience relatively heavy bidding activity tend to be elevated within the tree. Other criteria, such as the number of bidders, the average bid increment, the difference between the current bid and the asking price, and the average rating of the seller may additionally or alternatively be used.
The invention may also be used to highlight personal recommendations of items that exist within the browse tree. For example, an item may be selected from the tree for personal recommendation using a collaborative filtering, content-based filtering, or other recommendations algorithm, and automatically featured at some or all of the categories in which the item falls. Alternatively, the criteria and methods used to generate personal recommendations may simply be incorporated into the algorithm for generating item popularity scores.