1. Field of the Invention
This invention relates generally to graphical user interfaces (GUIs). More specifically, this invention relates to an apparatus and method for graphically displaying results of a search conducted on an information network such as the Internet, local and remote databases of content providers, etc.
2. Description of the Related Art
A significant development in computer networking is the Internet, which is a sophisticated worldwide network of computer systems. A user who wishes to access the Internet typically does so using a software program known as a web browser that is hosted on a personal computer or other data processing device that is capable of executing the web browser program and being connected to the Internet. A web browser uses a standardized interface protocol, such as HyperText Transfer Protocol (HTTP), to make a connection via the Internet to other computers known as web servers, to receive user commands to operate certain browser functions and/or to request information from the Internet, and to receive information from the web servers that is presented to the user, typically on a display device such as a monitor.
An ever-increasing amount of information is available on the Internet and other information databases (collectively referred to as information networks). A query to an information network requires a textual specification based on keywords and logical operators between keywords. In most instances, the query returns only the results, which may not be very useful when the number of results returned is much larger than that which can be viewed and manipulated on a screen.
When performing a search, a user typically follows a search strategy in order to find the desired information. Most search strategies are premised on attaining a reasonable number of items that satisfy the search criteria. Typically, a query is composed of keywords (i.e., search terms) connected together via logical and/or proximity operators. Logical operators are used to include or exclude items in a set, whereas proximity operators are used to identify items having keywords that are a predetermined distance apart, such as within 10 words, in the same sentence, or adjacent. Once a query is made and executed, a list of items satisfying the criteria of the query is presented to the user. The user can then either view one or more items in the list or, if the list is large, modify the search to reduce the number of items in the list.
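By way of illustration only, the following Python sketch shows how a query combining a proximity operator with a logical operator might be evaluated against a small set of items. The helper names (tokenize, contains, within, run_query), the sample documents, and the definition of distance as the difference between word positions are assumptions made for this example and are not taken from any particular search system.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def contains(tokens, keyword):
    """Logical test: does the keyword occur anywhere in the item?"""
    return keyword.lower() in tokens

def within(tokens, first, second, max_distance):
    """Proximity test: do the two keywords occur at most max_distance
    word positions apart (adjacent words have distance 1)?"""
    first_pos = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_pos = [i for i, t in enumerate(tokens) if t == second.lower()]
    return any(abs(i - j) <= max_distance
               for i in first_pos for j in second_pos)

# Hypothetical items, invented for this example.
documents = {
    "doc1": "Graphical display of search results on the Internet",
    "doc2": "Search engines return long lists; results are hard to display",
    "doc3": "Database search without any graphical interface",
}

def run_query(docs):
    """Evaluate: (search WITHIN 2 WORDS OF results) AND NOT database."""
    hits = []
    for name, text in docs.items():
        tokens = tokenize(text)
        if within(tokens, "search", "results", 2) and not contains(tokens, "database"):
            hits.append(name)
    return hits

print(run_query(documents))
```

Only the first item satisfies both operators here: the second item fails the proximity test, and the third is excluded by the logical NOT.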
Data navigation is the process of viewing different dimensions, slices, and levels of detail of a multidimensional database. In a typical list of search results from an information network, documents or other items are listed in descending order based on a relevancy value. The relevancy value for each document is based on the number of times the keywords are found in the document. A user must still sort through the list sequentially to view other characteristics of the documents, such as size and date, which may also help determine a document's relevancy. Thus it is desirable to provide a data navigation tool which allows the user to view, sort, and navigate search results according to several different data and relevant characteristics.
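As a minimal sketch of the ranking described above, the following Python fragment computes a relevancy value as a count of keyword occurrences and then orders the same results by other characteristics. The Document fields, the sample data, and the simple counting rule are assumptions made for illustration; actual search services may weight keywords differently.

```python
import re
from dataclasses import dataclass

@dataclass
class Document:
    title: str
    text: str
    size_kb: int
    year: int

def relevancy(doc, keywords):
    """Relevancy value: total number of times the keywords occur in the text."""
    tokens = re.findall(r"[a-z0-9]+", doc.text.lower())
    return sum(tokens.count(k.lower()) for k in keywords)

# Hypothetical search results, invented for this example.
docs = [
    Document("A", "search results search display", 12, 1998),
    Document("B", "results of a query", 40, 2000),
    Document("C", "search search search results results", 5, 1996),
]
keywords = ["search", "results"]

# Conventional presentation: descending relevancy.
by_relevancy = sorted(docs, key=lambda d: relevancy(d, keywords), reverse=True)
# Alternative orderings on other characteristics of the same results.
by_year = sorted(docs, key=lambda d: d.year, reverse=True)
by_size = sorted(docs, key=lambda d: d.size_kb)

print([d.title for d in by_relevancy])
print([d.title for d in by_year])
print([d.title for d in by_size])
```

The example shows that the most relevant document by keyword count need not be the most recent one, which is why a single relevancy-ordered list can hide characteristics the user cares about.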
One technique for sorting lists is known as data clustering, which is the process of dividing a data set into mutually exclusive groups such that the members of each group are as "close" as possible to one another, and different groups are as "far" as possible from one another, where distance is measured with respect to all available variables. There are several models for data clustering, e.g., K-means clustering, self-organizing feature maps, the neural gas algorithm, and complexity optimized vector quantization.
In the K-means procedure, for example, suppose a set of feature vectors x1, x2, . . . , xn are from the same class or subset, and that they fall into k compact clusters, k < n. Let mi be the mean of the vectors in cluster i. If the clusters are well separated, a minimum-distance classifier can be used to separate them. That is, x is in cluster i if ∥x − mi∥ is the minimum of all the k distances. Thus, the K-means procedure partitions the n examples into k clusters so as to minimize the sum of the squared distances to the cluster centers. The results depend on the value of k, which can be any value from 2 to n. When k=n, the procedure is known as the nearest neighbor classifier.
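The K-means procedure can be sketched in Python as follows. This is an illustrative implementation of the commonly used iterative scheme (assign each vector to the nearest mean, then recompute the means), not a definitive one. Seeding the means with the first k vectors, the iteration limit, and the sample points are assumptions made so that the example is deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(column) / n for column in zip(*vectors))

def assign(vectors, centers):
    """Minimum-distance classifier: index of the nearest center for each vector."""
    return [min(range(len(centers)),
                key=lambda i: squared_distance(x, centers[i]))
            for x in vectors]

def k_means(vectors, k, max_iterations=100):
    # Assumption: the first k vectors seed the cluster means.
    centers = [tuple(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iterations):
        new_labels = assign(vectors, centers)
        if new_labels == labels:
            break  # assignments are stable, so the means are too
        labels = new_labels
        for i in range(k):
            members = [x for x, lab in zip(vectors, labels) if lab == i]
            if members:  # leave the mean of an empty cluster unchanged
                centers[i] = mean(members)
    return labels, centers

# Hypothetical two-dimensional feature vectors forming two compact clusters.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
labels, centers = k_means(points, k=2)
print(labels)
print(centers)
```

With these points, the three vectors near the origin end up in one cluster and the three vectors near (8.5, 8.8) in the other. A different seeding can lead to a different partition, which is one reason the results depend on the choices made for k and for the initial means.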
The present invention provides a method and apparatus for representing and navigating search results from a database on a computer system. A graphical user interface is generated to receive user input including a data source to search for information, and a query indicating information which is desired from the data source. The user input is transmitted to the data source, the search is performed, and information responsive to the query resulting from the search is received from the data source. The search results include characteristics of the responsive information. The responsive information is clustered into a plurality of groups based on selected characteristic information, and means are provided to allow the user to select at least one group of the responsive information to be displayed.
The responsive information includes a list of documents containing information related to the query. The graphical user interface includes a first display portion showing the plurality of groups of characteristic information available for the user to select, and a second display portion showing the list of documents in the responsive information.
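The relationship between the two display portions can be illustrated with the following Python sketch. It groups results by the exact value of one characteristic, which is a deliberate simplification of clustering, and the field names, sample results, and helper names (group_by, select) are hypothetical.

```python
from collections import defaultdict

def group_by(results, characteristic):
    """Separate results into groups keyed by the value of one characteristic."""
    groups = defaultdict(list)
    for result in results:
        groups[result[characteristic]].append(result)
    return dict(groups)

def select(groups, chosen):
    """Return the documents belonging to the groups chosen by the user."""
    return [doc for name in chosen for doc in groups.get(name, [])]

# Hypothetical responsive information, invented for this example.
results = [
    {"title": "Intro to GUIs", "type": "html", "year": 1998},
    {"title": "Clustering survey", "type": "pdf", "year": 1997},
    {"title": "Search tips", "type": "html", "year": 1999},
    {"title": "Browser notes", "type": "text", "year": 1998},
]

groups = group_by(results, "type")

# First display portion: the groups available for selection, with their sizes.
first_portion = {name: len(members) for name, members in groups.items()}
# Second display portion: the list of documents in the selected group(s).
second_portion = [doc["title"] for doc in select(groups, ["html"])]

print(first_portion)
print(second_portion)
```

In this sketch, selecting a different group only changes the argument passed to select; the grouping itself is computed once per characteristic.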
In one embodiment, when the user selects one or more groups, the documents displayed in the second display portion belong to the group(s) selected by the user. When a group is selected, it is separated into a plurality of subgroups based on the range of the characteristic information for the selected group. The first display portion is updated to show the plurality of subgroups.
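One possible way to separate a selected group into subgroups based on the range of a characteristic is sketched below in Python. Dividing the range into equal-width intervals is an assumption made for this example, since other partitioning rules could equally be used, and the function name split_by_range and the sample sizes are hypothetical.

```python
def split_by_range(items, key, n_subgroups):
    """Divide items into n_subgroups equal-width intervals spanning the
    range of key(item) within the selected group."""
    values = [key(item) for item in items]
    low, high = min(values), max(values)
    # Fall back to a width of 1 when all values are identical.
    width = (high - low) / n_subgroups or 1
    subgroups = [[] for _ in range(n_subgroups)]
    for item in items:
        # Clamp so that the maximum value falls in the last subgroup.
        index = min(int((key(item) - low) / width), n_subgroups - 1)
        subgroups[index].append(item)
    return subgroups

# Hypothetical document sizes (in kilobytes) for a selected group.
sizes_kb = [2, 10, 35, 48, 50, 90, 100]
subgroups = split_by_range(sizes_kb, key=lambda size: size, n_subgroups=3)
print(subgroups)
```

Each resulting subgroup could then be shown in the first display portion in place of the group it came from.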
In another embodiment, each group is separated into a plurality of subgroups based on the range of the characteristic information for each group. The first display portion shows the plurality of subgroups, which may be color coded to differentiate the subgroups. Similarly, the list of documents in the second display portion may be correspondingly color coded to the color code in the first display portion.
In another embodiment, a server may be used to transmit data between the client computer system and the data source. In this configuration the server includes program instructions for separating the documents into the plurality of groups based on selected characteristic information.
In another embodiment of the present invention, additional information may be displayed based on the group of responsive information selected by the user.
In another embodiment of the present invention, the first display portion includes a stratum showing the subgroups of the documents. When the user selects one or more subgroups, another stratum showing the subgroup of the responsive information is displayed. The responsive information in the second display portion is based on the subgroup selected by the user.
Another feature of the present invention allows the user to select a document to be displayed for the user to examine its contents.
Another feature of the present invention allows the user to re-arrange the order in which the list of documents in the second display portion is displayed.
The foregoing has outlined rather broadly the objects, features, and technical advantages of the present invention so that the detailed description of the invention that follows may be better understood.