1. Field of the Invention
This invention relates to managing query results. More particularly, the invention relates to a method, system, computer program product, server, and interface for managing query results.
2. Discussion of Related Work
FIG. 1 shows an example of a computer display 100. The computer display has a status area 110, various program icons 120, an active window 130, a title bar 140 for the active window 130, a toolbar 150 with tool buttons 160, and a window area 200 for the active window 130.
A user interface appearing in the window area 200 may accept inputs from a user relating to query search terms. The query search terms may be provided as an input to an Internet search engine or the like. Typically, the Internet search engine will respond to a query by returning between 0 and several million query results.
FIG. 2 shows an example of how query results are typically presented. The query results appear in the window area 200 of the active window. In the particular example shown in FIG. 2, there is an indicator of the number of hits. The indicator 955 shows that there are 382 results in the query results corresponding to the query search terms provided as a user input. Each query result is provided as a pointer 950. The pointer 950 may be an HTML link. The pointer 950 may also include supplemental information, such as information relating to the content of the Web page indicated by the pointer. There also may be a relevance ranking (not shown). Whether or not the relevance ranking is shown, the relevance ranking is typically used to order the presentation of the query results.
Thus, the conventional approach to managing query result complexity is to determine a relevance ranking for each of the query results, and to order the query results based on the relevance ranking. Where two query results have an identical relevance ranking, it is conventional to present the results in alphabetical order.
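By way of illustration only, the conventional ordering just described may be sketched as follows. The `Hit` type, its field names, the sample titles, and the relevance scores are assumptions made for this sketch and do not come from any particular search engine.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical query result: a title plus a relevance ranking."""
    title: str
    relevance: float


def order_conventionally(hits):
    # Highest relevance first; identical rankings fall back to alphabetical order.
    return sorted(hits, key=lambda h: (-h.relevance, h.title.lower()))


# Sample data invented for this sketch.
hits = [
    Hit("Quill pens", 0.8),
    Hit("Eagle feathers", 0.8),
    Hit("Ostrich plumes", 0.9),
]
ordered = order_conventionally(hits)
```

However the tie is broken, the result remains a single flat list.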
In the first related application “MANAGEMENT OF QUERY RESULT COMPLEXITY USING WEIGHTED CRITERIA FOR HIERARCHICAL DATA STRUCTURING” described above, there is described an improved and novel tool for the collection and ordering of data from heterogeneous sources, including structured and unstructured data.
FIG. 3 shows a set of user interface objects which may be used to obtain user inputs for the improved and novel tool described in this related application. In particular, a plurality of user activatable regions 210 permit a user to enter various categories for a search. In the example shown in FIG. 3, the categories “animal”, “usage”, and “material” have been entered into the category entry regions 210.
The user activatable regions 220 include slider bars 230 which are used to provide a priority or weighting of the different categories entered into the category entry regions 210. In the example shown in FIG. 3, the further to the right a slider bar 230 is placed, the higher the priority of the corresponding search parameter.
The user activatable regions 240 are value entry regions. These regions are used to provide specific values for searching, each value corresponding to one of the categories. In the example shown in FIG. 3, the values “ostrich” and “eagle” have been entered in a value entry region 240 that corresponds to the category entry region 210 containing the category “animal”.
Together, the information included in the category entry regions 210, the weighting regions 220, and the value entry regions 240 may be referred to as semantic structuring information. In other words, semantic structuring information includes categories, each with a corresponding category weight, each category having corresponding values. The categories are ordered in a category order based on the corresponding category weight. The categories could be ordered on some further basis, of course, but it is sufficient for the present explanation that the categories be ordered on at least the category weight.
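The semantic structuring information may be pictured, purely as a sketch, as a list of weighted categories. The `Category` class, the function name, and the numeric weights below are assumptions; the text states only that “animal” carries the heaviest weight, followed by “material” and then “usage”.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """Hypothetical container for one row of the FIG. 3 interface."""
    name: str
    weight: float                       # slider position; larger means higher priority
    values: list = field(default_factory=list)


def category_order(categories):
    # Heaviest weight first. sorted() is stable, so categories with equal
    # weights keep the order in which they were entered.
    return sorted(categories, key=lambda c: c.weight, reverse=True)


# Categories in entry order; the weights are invented for this sketch.
structuring = [
    Category("animal", 0.9, ["ostrich", "eagle"]),
    Category("usage", 0.3, ["decoration", "ornament", "adornment", "embellishment"]),
    Category("material", 0.6, ["feather", "plume", "quill"]),
]
ordered_names = [c.name for c in category_order(structuring)]
```

With these assumed weights the category order is animal, material, usage, which matches the levels of the tree discussed with reference to FIG. 6.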
Now an explanation will be given of how the semantic structuring information may be used to represent a set of query results as a tree shaped hierarchy, as in the above-identified co-pending application. This is not a conventional approach, but the subject of another patent application. The explanation of producing a tree-shaped hierarchy from semantic structuring information presented here is not as detailed as in the co-pending application, and is not meant in any way to limit the description presented therein.
FIG. 4 shows a set of 16 query results 900-915. Each of the query results 900-915 represents a Web page that is a member of the query results set for the query represented by the semantic structuring information as shown in FIG. 3. In FIG. 4, the information relating to each of the query results is information that may come from the meta tags, the content of the Web page, or the like.
FIG. 5 shows a larger set of query results 900-931 to be discussed shortly.
FIG. 6 shows a tree shaped hierarchy that may be built for the query results 900-915 of FIG. 4. Some terminology relating to the tree-shaped hierarchy will now be discussed, as the terminology is relevant also to the discussion of the preferred embodiments of the invention.
In FIG. 6, the tree shaped hierarchy has a root node or control node 0. This root node may be referred to as a query root node QRN. The tree also has intermediate nodes 1.0-3.9. The tree also has leaf nodes 4.0-4.15. The query results 900-915 are not part of the tree, but are merely shown for their correspondence to the leaf nodes 4.0-4.15.
The nodes in this tree-shaped hierarchy are arranged in levels. At the very highest level 710 is the root node 0. The root node 0 represents the query itself. The first level 720 of intermediate nodes has intermediate nodes that are connected directly to the root node 0. That is, intermediate nodes 1.0-1.1 have, as their immediately upward node, the root node 0. It may also be expressed by saying that the first level of intermediate nodes has an immediately upward node (IUN) set of just the query root node QRN.
This first level 720 in the tree-shaped hierarchy of FIG. 6 corresponds to the category having the heaviest weight (in this example, the category “animal”). While the level corresponds to the category, the intermediate nodes in this tree-shaped hierarchy correspond to the particular values that relate to the category. In this example, node 1.0 corresponds to the value “ostrich” of the category “animal”; node 1.1 corresponds to the value “eagle” of the category “animal”.
The second level 730 of intermediate nodes corresponds to the category that is second according to the category order (i.e. based on the weighting given the category by the user). The category that is second according to the category order is “material.” The second level 730 thus corresponds to the category “material”. The intermediate nodes in this level correspond to the values that relate to the category “material”: “feather”, “plume”, and “quill”. More particularly, node 2.0 corresponds to the value “feather” and so does node 2.3. Nodes 2.1 and 2.4 correspond to the value “quill”. Node 2.2 corresponds to the value “plume”.
It will be appreciated that nodes 2.0 and 2.3 differ semantically. In particular, node 2.0 represents any hits that relate to the value “feather” of the category “material” and also to the value “ostrich” of the category “animal”. Node 2.3 represents any hits that relate to the value “feather” of the category “material” and also to the value “eagle” of the category “animal”. Likewise, the relevance of node 2.1 is to those hits relating to “quill” and “ostrich”, while the relevance of node 2.4 is to those hits relating to “quill” and also “eagle”.
The absence of a node relating to the value “plume” for the category “material” connected to node 1.1 indicates that no hits were identified that relate to “plume” and also “eagle”.
The nodes 2.0-2.4 of the second level 730 of intermediate nodes all have, as an immediately upward node IUN, one of the nodes of the first level 720 of intermediate nodes. In other words, nodes 2.0-2.2 have, as their IUN, node 1.0; nodes 2.3-2.4 have, as their IUN, node 1.1. More generally, the IUN set for the second level 730 is all of the nodes of the first level 720.
The nodes 3.0-3.9 of the third level 740 of intermediate nodes, in like manner, all relate to one of the values (“decoration”, “ornament”, “adornment”, or “embellishment”) of the next category in the category order (i.e., the category “usage”). In FIG. 5, “ornament” is abbreviated as “orn”; “decoration” is abbreviated as “dec”; “adornment” is abbreviated as “ado”; and “embellishment” is abbreviated as “emb”.
The nodes 4.0-4.15 are all leaf nodes, and each leaf node corresponds to one of the query results 900-915. Each of the nodes 4.0-4.15 has, as its IUN, one of the nodes of the third level 740 of intermediate nodes. To put it another way, the IUN set for the leaf nodes is the set of nodes in the third level 740.
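The level-by-level construction described above may be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the co-pending application: the `Node` class, the `build_tree` function, and the dictionary layout of a query result are hypothetical, and only the attributes of result 900 are taken from the text.

```python
class Node:
    """One node of the tree-shaped hierarchy (hypothetical representation)."""

    def __init__(self, value, parent=None):
        self.value = value      # node value, e.g. "ostrich"
        self.parent = parent    # immediately upward node (IUN); None for the root
        self.children = {}      # intermediate nodes directly below, keyed by value
        self.leaves = []        # leaf nodes attached directly below


def build_tree(results, category_order):
    root = Node("query")        # query root node (QRN)
    for result in results:
        node = root
        # One level of intermediate nodes per category, heaviest category first.
        for category in category_order:
            value = result[category]
            if value not in node.children:
                node.children[value] = Node(value, parent=node)
            node = node.children[value]
        # The leaf node points at the query result itself.
        node.leaves.append(Node(result["id"], parent=node))
    return root


results = [
    # Result 900 as described in the text; the other two are invented.
    {"id": 900, "animal": "ostrich", "material": "feather", "usage": "ornament"},
    {"id": "example-a", "animal": "ostrich", "material": "quill", "usage": "decoration"},
    {"id": "example-b", "animal": "eagle", "material": "feather", "usage": "adornment"},
]
tree = build_tree(results, ["animal", "material", "usage"])
```

Because intermediate nodes are created only when a result calls for them, a value with no hits under a given branch simply has no node there, as with “plume” under “eagle” in FIG. 6.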
It will be appreciated that the query results 900-915 have information (i.e., meta-tags or the like) that relates to the values requested by the user for each of the categories. For example, as shown in FIG. 4, result 900 relates to “ostrich”, “feather”, and “ornament”. The leaf node 4.0 that points to query result 900 may thus be said to be attributed with this same information. Inasmuch as the value “ostrich” relates to the category “animal”, the leaf node 4.0 may be said to have a category-value of “ostrich”. The leaf node 4.0 thus is attributed with (in this example) three category-values, each of which corresponds to one of the categories. The query results 900-915 may have additional meta-information (not shown) as well, although this additional meta-information may be irrelevant to the semantic structuring information provided by the user.
Clearly, node 4.0 is attached in its particular location because it relates to usage=ornament and material=feather and animal=ostrich. This is evident by tracing an upward path of nodes (UPN) through nodes 3.0, 2.0, and 1.0 to the root node 0. The UPN of any node defines, semantically, the relation of the node to the overall query. Another way to put this is to say that, in the tree-shaped hierarchy, it is a requirement that the category-values of a leaf node satisfy an uptree relation. Here, “uptree relation” is defined as a relation satisfied when the category-values of a leaf node include at least the values represented by each of the nodes in the path upward to the root node. Thus, the leaf nodes are attached based on an uptree relation.
Yet another way to express this concept of the uptree relation is to describe it in the sense of an upward value set UVS. The UPN of a leaf node passes through nodes that each have a respective node value. The collection of the respective node values for the nodes in the UPN thus may be said to define an upward value set UVS. A leaf node must be attached in the tree-shaped hierarchy so that the values represented in its content include values considered equivalent to at least all of the values in the UVS.
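The upward value set and the uptree relation may be illustrated with the short sketch below. The `Node` class and both function names are assumptions. The sketch also treats two values as equivalent only when they are identical strings, which is a simplification of the “considered equivalent” language above.

```python
class Node:
    """Hypothetical tree node holding a value and a link to its IUN."""

    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent    # None marks the query root node


def upward_value_set(attach_point):
    # Collect the node values along the upward path of nodes (UPN),
    # stopping before the root, which represents the query itself.
    values = set()
    node = attach_point
    while node.parent is not None:
        values.add(node.value)
        node = node.parent
    return values


def satisfies_uptree(category_values, attach_point):
    # The leaf's category-values must include at least every value in the UVS.
    return upward_value_set(attach_point) <= set(category_values)


# Path of FIG. 6 leading to leaf node 4.0: root 0, node 1.0, node 2.0, node 3.0.
root = Node("query")
n1_0 = Node("ostrich", root)
n2_0 = Node("feather", n1_0)
n3_0 = Node("ornament", n2_0)

result_900 = {"animal": "ostrich", "material": "feather", "usage": "ornament"}
ok = satisfies_uptree(result_900.values(), n3_0)
```

A result lacking any one of the values in the UVS, such as an eagle-related hit, would fail the check at node 3.0 and could not be attached there.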
A tree-shaped hierarchy such as that shown in FIG. 6 may be thought of as defining a virtual space. For every given node, the set of nodes consisting of the given node itself, and also the nodes immediately downward of the given node may be thought of as defining a space cube. For example, node 2.1 and nodes 3.1, 3.2, 3.3, and 3.4 define a space cube. Likewise, node 1.0 and nodes 2.0, 2.1, and 2.2 define another space cube. In a space cube, the highest node is the space cube root node and the other nodes in the space cube are space cube leaf nodes. In the space cube having nodes 2.1, 3.1, 3.2, 3.3, and 3.4, e.g., node 2.1 is the space cube root node and nodes 3.1, 3.2, 3.3, and 3.4 are the space cube leaf nodes.
A space cube, although it does not contain nodes above the space cube root node, includes a link to at least the IUN of the space cube root node.
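A space cube may be sketched as a small record holding its root node, its leaf nodes, and the link to the IUN of its root. The node identifiers below follow the examples given for FIG. 6; the `SpaceCube` class, the two lookup tables, and the function name are hypothetical, and the tables cover only the part of the tree that the text spells out.

```python
from dataclasses import dataclass
from typing import Optional

# Immediately downward nodes, limited to the relationships stated in the text.
CHILDREN = {
    "0": ["1.0", "1.1"],
    "1.0": ["2.0", "2.1", "2.2"],
    "1.1": ["2.3", "2.4"],
    "2.1": ["3.1", "3.2", "3.3", "3.4"],
}
# Immediately upward node (IUN) of each node, derived from CHILDREN.
PARENT = {child: parent for parent, kids in CHILDREN.items() for child in kids}


@dataclass(frozen=True)
class SpaceCube:
    root: str               # space cube root node
    leaves: tuple           # space cube leaf nodes
    uplink: Optional[str]   # link to the IUN of the space cube root node


def space_cube(node_id):
    return SpaceCube(
        root=node_id,
        leaves=tuple(CHILDREN.get(node_id, [])),
        uplink=PARENT.get(node_id),   # None for the query root node
    )


cube = space_cube("2.1")
```

Here `cube` has node 2.1 as its root, nodes 3.1 through 3.4 as its leaves, and an uplink to node 1.0, so a user viewing this cube can step back up the tree.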
The tree-shaped hierarchy such as that shown in FIG. 6 can advantageously be used, as explained in the above-identified co-pending application, to simplify the navigation of the user through the set of query results. That is, the user can start out being presented with a choice of looking at hits relating to “ostrich” or to “eagle”. If the user elects to investigate hits relating to “ostrich” (i.e., node 1.0), he can be presented with the choice of exploring hits relating not just to “ostrich”, but also to “feather” (node 2.0), “quill” (node 2.1), or “plume” (node 2.2). As can easily be seen, the user can be presented with query results in a semantically structured hierarchy, based on the user's own category and value information.
The intermediate nodes of a tree-shaped hierarchy are based on semantic structuring information and form a semantic structure that gives some depth and greater dimension to the otherwise flat conventional presentation of query results exemplified in FIG. 2.
Although a tree-shaped hierarchy such as that shown in FIG. 6 is highly advantageous, there are situations in which more help in managing query results is required. For example, when a query returns thousands or millions of query results, the tree-shaped hierarchy simplifies the results to a great extent, but numerous leaf nodes may still be attached to each immediately upward node at the lowest level of intermediate nodes.
The query results shown in FIG. 5 illustrate this problem. If this set of query results were structured into the tree-shaped hierarchy, the results might be as shown in FIG. 7. In FIG. 7, node 3.1 relates to ostrich, quill, and decoration. The results shown in FIG. 5 include 17 results that relate to ostrich, quill, and decoration. A user navigating through the virtual space to node 3.1 would be presented with 17 results all at once. Even though the situation is far better than that shown in FIG. 2, it still represents quite a bit of information for a user to handle.
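The crowding at node 3.1 may be made concrete with a simple count of leaf nodes per upward path. The sketch below is an assumption-laden illustration: the function name, the `limit` parameter, and the sample data are invented (only the figure of 17 results for ostrich, quill, and decoration comes from the text), and `limit` is a plain count chosen for the example rather than the semantic threshold referred to elsewhere in this description.

```python
from collections import Counter


def crowded_paths(results, category_order, limit):
    # Count how many results share each full path of category-values,
    # then keep only the paths holding more leaf nodes than the limit.
    counts = Counter(tuple(r[c] for c in category_order) for r in results)
    return {path: n for path, n in counts.items() if n > limit}


order = ["animal", "material", "usage"]

# 17 results for ostrich / quill / decoration, as stated for FIG. 5,
# plus two invented results elsewhere in the tree.
results = [
    {"animal": "ostrich", "material": "quill", "usage": "decoration"}
    for _ in range(17)
]
results.append({"animal": "ostrich", "material": "feather", "usage": "ornament"})
results.append({"animal": "eagle", "material": "quill", "usage": "adornment"})

crowded = crowded_paths(results, order, limit=10)
```

With the assumed limit of 10, only the path corresponding to node 3.1 is flagged, which is the kind of node for which additional structuring would be helpful.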
In queries with hundreds, thousands, or millions of hits, some additional structuring is necessary. This additional structuring, however, has heretofore been unavailable.
To provide additional structuring and thus overcome the foregoing problems, there is herein described a method for managing query result complexity. The management of query results includes receiving query results and semantic structuring information, and defining a root node corresponding to the query to which the results relate. The management further includes creating a semantically structured tree descending from the root node so as to have intermediate nodes generally in layers. The intermediate nodes are arranged in the layers according to the semantic structuring information. The tree has, at a lowest level of each branch, leaf nodes, each of which corresponds to a query result. The leaf nodes are attached to the tree based on an uptree relation. An important aspect of the solution is that the tree has a space cube structure balanced on the basis of a semantic threshold.