The present invention relates to search, retrieval, and organization of data from large data spaces such as the contents of CD-ROMs, electronic program guides, the Internet, etc.
The vast array and amount of information available in CD-ROMs, the Internet, television programming guides, the proposed national information infrastructure, etc. spur the dream of easy access to many large information media sources. Such increased access to information is likely to be useful, but the prospect of such large amounts of information presents new challenges for the design of user interfaces for information access. For example, Internet users often struggle to find information sources or give up in the face of the difficulty of constructing search queries and visualizing the results of queries. Straight text lists, such as those provided by electronic program guides, Internet search engines, and text search tools such as Folio(copyright), are tedious and often hard to work with and, because of their monotonous look, tiring to look at for long periods of time.
There are two major components to searching databases: filtering, so that irrelevant information is excluded, and sorting the filtered results by some priority schema. For example, an Internet search engine such as Google(copyright) uses a text query to filter and sort records in its database representing entry points to the World-Wide-Web. It uses certain implicit criteria, such as an implied vote "cast" by pages that link to the candidates retrieved by the query (that is, pages that are linked to by more other pages have more "votes"). Google also analyzes the pages that cast the votes and gives greater weight to pages that receive more votes by other pages.
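The voting idea can be illustrated with a minimal sketch. The link graph, the page names, and the simple two-pass weighting used here are assumptions made purely for illustration; the sketch is not a description of how any actual search engine computes its ranking.

```python
from collections import defaultdict

def count_votes(links):
    """Return {page: number of distinct other pages linking to it}."""
    votes = defaultdict(int)
    for source, targets in links.items():
        for target in set(targets):
            if target != source:          # ignore self-links
                votes[target] += 1
    return dict(votes)

def weighted_votes(links):
    """Weight each vote by 1 + the raw vote count of the page casting it."""
    raw = count_votes(links)
    weighted = defaultdict(int)
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                weighted[target] += 1 + raw.get(source, 0)
    return dict(weighted)

# Hypothetical link graph: page -> pages it links to.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c", "b"],
}
raw = count_votes(links)            # plain vote counts
weighted = weighted_votes(links)    # votes weighted by the voter's own votes
```

In this hypothetical graph, page "a" receives only one vote, but that vote comes from the most-linked page, so its weighted score overtakes that of page "b", which receives two votes from less-linked pages.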
Tools such as Google and most other database retrieval tools accept search queries in the form of text with connectors, and results are presented in the form of lists sorted by some specific lump criterion, which might be an operator involving multiple criteria (such as sort by A, then by B, etc.).
Briefly, a graphical user interface (UI) provides a convenient and intuitive mechanism for interacting with large databases. The UI provides a three-dimensional metaphor for the processes of searching a data space and for viewing results. The UI also seamlessly incorporates various search elements, such as implicit and explicit user profiles, into the metaphor. In one embodiment, the search criteria are shown as strings of beads in a three-dimensional scene, each bead representing a criterion and each string representing a different category. For example, the criteria drama, action, suspense, and horror may be included in a category of genre. Criteria are selected to form a query by moving corresponding beads to a query string. User preference profiles can be constructed in the same way. Profiles can be saved and represented as bead strings that can be used in further interactions in the same manner as criteria beads. Results are displayed in a three-dimensional scene also. The accuracy of the match between retrieved records and the query corresponds to the placement of results, also represented as beads, along the Z-axis of the scene.
The UI design addresses various problems with user interaction with database search devices in the "lean-back" environment. (In the "lean-back" situation the user is being entertained and relaxes, as when the user watches television, and in the "lean-forward" situation the user is active and focused, as when the user uses a desktop computer.) For example, the invention may be used to interact with electronic program guides (EPGs) used with broadcast television. In such an application, the UI may be displayed as a layer directly on top of the recorded or broadcast program or selectively on its own screen. The UI may be accessed using a simple handheld controller. In a preferred embodiment, the controller has vertical and horizontal scroll buttons and only a few specialized buttons to access the various operating modes directly.
The UI generates three environments or worlds: a search world, a profiling world, and an overview world. Assuming an EPG environment, in the search world, the user enters, saves, and edits filtering and sorting criteria (time of day, day of week, genre, etc.). In the profiling world, the user generates and modifies explicit (and some types of implicit) user profiles. Explicit profiles are the set of likes and dislikes a user has entered to represent his preferences. Each can be selected from lists of criteria such as genre (movies, game shows, educational, etc.), channel (ABC, MTV, CSPAN, etc.), actors (Jodie Foster, Tom Cruise, Ricardo Bernini, etc.), and so on. In the overview world, the user views and selects among the results of the search, which is a result of the sorting, filtering, and profiling information.
The invention may be used in connection with various different searching functions. For example, in a preferred embodiment designed around EPGs, there are three basic searching functions provided: (1) Filtering, (2) Filtering and/or sorting by explicit profile, and (3) Sorting by implicit profile. These are defined as follows.
(1) Filtering: A set of criteria that defines the set of results to be displayed. These criteria choose exactly what records in the database will be chosen and which will be excluded from the overview world display.
(2) Filtering and/or sorting by explicit profile: A user is permitted to specify likes or dislikes by making selections from various categories. For example, the user can indicate that dramas and action movies are favored and that certain actors are disfavored. These criteria are then applied to sort the records returned by the filtering process. The degree of importance of the criteria may also be specified, although the complexity of adding this layer may make its addition to a system less worthwhile for the vast majority of users.
As an example of the second type of system, one EP application (EP 0854645A2) describes a system that enables a user to enter generic preferences such as a preferred program category, for example, sitcom, dramatic series, old movies, etc. The application also describes preference templates in which preference profiles can be selected, for example, one for children aged 10-12, another for teenage girls, another for airplane hobbyists, etc. This method of inputting requires that a user have the capacity to make generalizations about him/herself and that these be a true picture of his/her preferences. It can also be a difficult task for common people to answer questions about abstractions such as: "Do you like dramas or action movies?" and "How important is the 'drama' criterion to you?"
(3) Sorting by implicit profile: This is a profile that is generated passively by having the system "observe" user behavior. The user merely makes viewing (recording, downloading, or otherwise "using") choices in the normal fashion and the system gradually builds a personal preference database by extracting a model of the user's behavior from the choices. This process can be enhanced by permitting the user to rate material (for example on a scale of one to five stars). The system uses this model to make predictions about what the user would prefer to watch in the future. The process of extracting predictions from a viewing history, or specification of degree of desirability, can follow simple algorithms, such as marking apparent favorites after repeated requests for the same item. It can also be a sophisticated machine-learning process such as a decision-tree technique with a large number of inputs (degrees of freedom). Such models, generally speaking, look for patterns in the user's interaction behavior (i.e., interaction with the UI for making selections).
An example of this type of profile information is MbTV, a system that learns viewers' television watching preferences by monitoring their viewing patterns. MbTV operates transparently and builds a profile of a viewer's tastes. This profile is used to provide services, for example, recommending television programs the viewer might be interested in watching. MbTV learns about each of its viewers' tastes and uses what it learns to recommend upcoming programs. MbTV can help viewers schedule their television watching time by alerting them to desirable upcoming programs, and with the addition of a storage device, automatically record these programs when the viewer is absent.
MbTV has a Preference Determination Engine and a Storage Management Engine. These are used to facilitate time-shifted television. MbTV can automatically record, rather than simply suggest, desirable programming. MbTV's Storage Management Engine tries to ensure that the storage device has the optimal contents. This process involves tracking which recorded programs have been viewed (completely or partially), and which are ignored. Viewers can "lock" recorded programs for future viewing in order to prevent deletion. The ways in which viewers handle program suggestions or recorded content provide additional feedback to MbTV's preference engine, which uses this information to refine future decisions.
MbTV will reserve a portion of the recording space to represent each "constituent interest." These "interests" may translate into different family members or could represent different taste categories. Though MbTV does not require user intervention, it is customizable by those that want to fine-tune its capabilities. Viewers can influence the "storage budget" for different types of programs. For example, a viewer might indicate that, though the children watch the majority of television in a household, no more than 25% of the recording space should be consumed by children's programs.
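The storage-budget idea can be sketched as follows. The internal workings of MbTV are not described here, so the greedy selection, the function name plan_recordings, and the sample programs, sizes, and scores are all hypothetical and serve only to show how a per-category cap such as the 25% limit might constrain what is recorded.

```python
def plan_recordings(candidates, capacity, caps):
    """Greedily pick programs by score, honoring per-category storage caps.

    candidates: list of (title, category, size, score) tuples
    capacity:   total recording space, in the same unit as size
    caps:       {category: maximum fraction of capacity}; uncapped if absent
    """
    used_total = 0.0
    used_by_category = {}
    chosen = []
    # Consider the highest-scoring candidates first.
    for title, category, size, score in sorted(candidates, key=lambda c: -c[3]):
        limit = caps.get(category, 1.0) * capacity
        used = used_by_category.get(category, 0.0)
        if used_total + size > capacity or used + size > limit:
            continue  # would exceed total space or this category's budget
        chosen.append(title)
        used_total += size
        used_by_category[category] = used + size
    return chosen

# Hypothetical candidates: (title, category, size, predicted preference score).
candidates = [
    ("Cartoon Hour", "children", 20, 0.9),
    ("Puppet Show", "children", 10, 0.8),
    ("Nature Special", "documentary", 30, 0.7),
    ("Late Movie", "movie", 40, 0.6),
]
plan = plan_recordings(candidates, capacity=100, caps={"children": 0.25})
```

With these assumed values, the second children's program is skipped even though it scores well, because recording it would push children's programs past 25% of the recording space.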
Note that search criteria, and implicit and explicit profiles, may produce reliability or ranking estimates for each proposed record in the searched database rather than just "yes" and "no" results for each candidate record in the database. A search query can be treated as providing criteria, each of which must be satisfied by the search results. In this case, if a query contains a specified channel and a specified time range, then only records satisfying both criteria will be returned. The same search query could be treated as expressing preferences, in which case records that do not satisfy both criteria could be returned, and, instead of filtering, the records are sorted according to how good a match they are to the criteria. So, records satisfying both criteria would be ranked highest, records satisfying only one criterion would be ranked second-highest, and records satisfying neither criterion would be ranked last. Intermediate ranking could be performed by the closeness of the record criterion to the query or profile criterion. For example, in the example above, if a record is closer to the specified time range, it would be ranked higher than a record that is further in time from the specified time range.
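The two treatments of the same query can be contrasted in a short sketch. The record fields (channel, hour), the sample programs, and the tie-breaking rule (number of criteria met first, then distance in hours from the time range) are illustrative assumptions, one possible reading of the ranking just described rather than a prescribed implementation.

```python
def time_distance(hour, time_range):
    """Hours from a start hour to the nearest edge of the range (0 if inside)."""
    lo, hi = time_range
    if lo <= hour <= hi:
        return 0
    return min(abs(hour - lo), abs(hour - hi))

def strict_filter(records, channel, time_range):
    """Treat the query as criteria: every criterion must be satisfied."""
    return [r for r in records
            if r["channel"] == channel
            and time_distance(r["hour"], time_range) == 0]

def preference_rank(records, channel, time_range):
    """Treat the query as preferences: return everything, best match first."""
    def key(r):
        dist = time_distance(r["hour"], time_range)
        matched = int(r["channel"] == channel) + int(dist == 0)
        return (-matched, dist)   # more criteria met first, then nearer in time
    return sorted(records, key=key)

# Hypothetical EPG records.
records = [
    {"title": "News", "channel": "ABC", "hour": 18},
    {"title": "Quiz", "channel": "ABC", "hour": 22},
    {"title": "Film", "channel": "MTV", "hour": 19},
    {"title": "Talk", "channel": "MTV", "hour": 23},
    {"title": "Late", "channel": "MTV", "hour": 21},
]
filtered = [r["title"] for r in strict_filter(records, "ABC", (18, 20))]
ranked = [r["title"] for r in preference_rank(records, "ABC", (18, 20))]
```

Here the strict treatment returns a single record, while the preference treatment returns all five, with the two records that satisfy neither criterion ordered by how close they fall to the requested time range.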
In the case of implicit profiles, there may not be any criteria at all in the sense that one could show how high each genre, for example, is ranked. If, for example, a neural network-based predicting engine were used to sort the records of the database, there is no clear way to expose the criteria weighting that is used to make the decisions, at least for an easy-to-use system. However, some simpler machine learning techniques may also be used for producing and implementing implicit profiles. For example, the criteria appearing in selected records (or records ranked as highly desirable) can be scored based on the frequency of criteria hits. For example, in an EPG, if all the programs that are selected for viewing are daytime soaps, the soap genre and daytime range would have a high frequency count and the science documentary genre would have zero hits. These could be exposed so that the viewer can see them. In the user interface embodiments described below, in which profiles are edited, the user may edit such an implicit profile because it is based on specific weights applied to each criterion. A user can remove the criterion from the profile, change the weighting, etc. The latter is only an example of an implicit profiling mechanism that provides a clear way for the user to modify it. Other mechanisms may also provide such a scheme; for example, the system need not be based only on frequency of hits of the user's selections.
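A frequency-based implicit profile of this kind might be sketched as below. The representation of a record as a dictionary of category/value pairs, the normalization of hit counts into weights between 0 and 1, and the function names are assumptions chosen for illustration.

```python
from collections import Counter

def build_profile(selected_records):
    """Weight each (category, value) criterion by its share of the selections."""
    hits = Counter()
    for record in selected_records:
        for category, value in record.items():
            hits[(category, value)] += 1
    total = len(selected_records)
    if total == 0:
        return {}
    return {criterion: count / total for criterion, count in hits.items()}

def score(record, profile):
    """Sum the profile weights of the criteria that a record satisfies."""
    return sum(profile.get((c, v), 0.0) for c, v in record.items())

# Hypothetical viewing history: mostly daytime soaps.
history = [
    {"genre": "soap", "time": "daytime"},
    {"genre": "soap", "time": "daytime"},
    {"genre": "soap", "time": "daytime"},
    {"genre": "news", "time": "daytime"},
]
profile = build_profile(history)
exposed = dict(profile)                 # weights as they could be shown to the viewer

# Because the weights are explicit, the viewer can edit them directly.
del profile[("genre", "news")]          # remove a criterion
profile[("time", "daytime")] = 0.5      # change a weighting

soap_score = score({"genre": "soap", "time": "daytime"}, profile)
doc_score = score({"genre": "documentary", "time": "evening"}, profile)
```

Because each criterion carries its own weight, removing an entry or changing a number is all that editing requires, which is the property that makes this kind of profile easy to expose.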
Construction of the queries for filtering and preference application is preferably done with three-dimensional visual graphics to facilitate the organization of information and to allow users to manipulate elements of a scene ("tokens") that represent data records, search and sort criteria, etc. In a preferred UI, the tokens take the form of beads. Categories are represented as strings or loops of beads. When a preference filter is constructed, specific choices (beads) are taken from a category string and added to a search string or bin. The beads, strings, and bins are represented as three-dimensional objects. This is more than a matter of appearance: the third dimension serves as a cue for additional meaning, in that generally an object's proximity to the user represents its relative ranking in the particular context.
Where the strings represent criteria, the ranking of criteria in each category may correspond to the frequency with which the criteria are used by the user in constructing queries. So, for example, if the user's searches always include the daytime time range, the bead or beads corresponding to this time range would be ranked higher. Alternatively, the criteria may be ranked according to selected records, rather than by all the records (or at least the most highly ranked ones) returned by searching.
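The relationship between beads, category strings, and a query can be modeled with a small data structure. The class names Bead and BeadString, the uses counter, and the add_to_query helper are hypothetical; the sketch covers only ranking by frequency of use in queries and leaves out any rendering.

```python
from dataclasses import dataclass, field

@dataclass
class Bead:
    label: str
    uses: int = 0          # times this criterion was placed in a query

@dataclass
class BeadString:
    category: str
    beads: list = field(default_factory=list)

    def ranked(self):
        """Beads ordered most-used first; ties keep their original order."""
        return sorted(self.beads, key=lambda b: -b.uses)

def add_to_query(query, string, label):
    """Add a criterion from a category string to the query and record the use."""
    for bead in string.beads:
        if bead.label == label:
            bead.uses += 1
            query.append((string.category, label))
            return
    raise KeyError(label)

# Hypothetical time-of-day category and three successive queries.
time_string = BeadString("time", [Bead("morning"), Bead("daytime"), Bead("evening")])
for choice in ["daytime", "evening", "daytime"]:
    query = []
    add_to_query(query, time_string, choice)

order = [bead.label for bead in time_string.ranked()]
```

After these three queries the daytime bead ranks first in its string, so a UI built on such a structure could draw it nearest the user.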
One or more categories may actually be constructed of words, for example keywords, that appear in a large proportion of the chosen programs or a large proportion of the hits returned by the user's queries. This makes sense because requiring the keyword category to contain every conceivable keyword would be awkward. Extracting the significant keywords from the descriptions of chosen records and/or from records returned by the queries based on frequency of occurrence or a variation thereof, makes the number of possible keywords easier to handle and easier to select. Preferably, the keyword list should be editable by the user in the same fashion as described in detail with respect to the editing of profiles elsewhere in the specification. To construct a keyword list based on frequency of use data, the system could start with no keywords at all. Then, each time the user enters a query, the returned results could be scanned for common terms. The titles, descriptions, or any other data could be scanned and those terms that occur with some degree of frequency could be sorted in a keyword list. The keywords in the list could each be ranked based on frequency. Alternatively, the keywords in the list could each be ranked based on frequency weighted by the context in which the keyword appeared. For example, a keyword in a title might receive a lower rank than a keyword in a description. A keyword that is a direct object or subject in a grammatical parsing of a sentence in a description might receive a higher ranking than indirect objects, etc. Instead of extracting keywords from the returned records of a search, the keywords could be extracted from only the records selected for use. For example, only programs that are chosen for viewing or recording are actually used to form the keyword list in the manner described.
Alternatively, both selections and returns of queries could be used, but the keywords in the selected records could be weighted more strongly than keywords in other returned records.
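Keyword extraction along these lines can be sketched as follows. The stop-word list, the field weights (a title counted at half the weight of a description), the boost factor for selected records, and the sample records are all assumptions for illustration; grammatical parsing is omitted.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a practical list would be much longer.
STOP_WORDS = {"a", "an", "and", "at", "the", "of", "in", "on", "to", "with", "is"}

def keyword_list(records, field_weights, selected_boost=2.0, top_n=5):
    """Rank terms by frequency, weighted by field and by whether the record was selected."""
    scores = Counter()
    for record in records:
        boost = selected_boost if record.get("selected") else 1.0
        for field_name, weight in field_weights.items():
            text = record.get(field_name, "").lower()
            for word in re.findall(r"[a-z]+", text):
                if word not in STOP_WORDS:
                    scores[word] += weight * boost
    # Highest score first; alphabetical order breaks ties deterministically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]

# Hypothetical query returns; one of them was selected for viewing.
records = [
    {"title": "Ocean Life", "description": "Sharks and whales in the ocean", "selected": True},
    {"title": "City Sharks", "description": "A drama of lawyers in the city"},
    {"title": "Whale Song", "description": "The ocean at night"},
]
keywords = keyword_list(records, {"title": 0.5, "description": 1.0}, top_n=4)
```

Setting selected_boost to 1.0 treats all returned records alike, while passing only the selected records corresponds to building the list from selections alone.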
The overview world presents the results of filtering and sorting criteria in a visually clear and simple way. Preferably, a three-dimensional animation is shown with three-dimensional tokens representing each record. Again, the (apparent) closeness of the token to the user represents the prediction of how much the user, according to the selections that are active, would prefer the item identified by the record. That is, proximity, initially, represents goodness of fit. In one example of this, the bead strings, each bead representing a record, are shown axially aligned with the string with the best fits being arranged closest to the user and the others receding into the background according to their degree of fit. The user can advance in an axial direction to search through the results as if walking through a tunnel. A pointer can be moved among the beads to select them. This causes additional information about each to be exposed.
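The mapping from goodness of fit to apparent distance can be expressed as a simple linear interpolation. The near and far depth values, the linear scaling, and the function name depth_layout are assumptions; any monotonic mapping from fit to depth would serve the same purpose.

```python
def depth_layout(scored_records, near=1.0, far=10.0):
    """Map each record's fit score to a Z distance from the viewer.

    The best-scoring record is placed at `near`, the worst at `far`,
    and the rest are spaced linearly in between, nearest first.
    """
    if not scored_records:
        return []
    scores = [score for _, score in scored_records]
    best, worst = max(scores), min(scores)
    span = best - worst
    layout = []
    for title, score in sorted(scored_records, key=lambda r: -r[1]):
        t = 0.0 if span == 0 else (best - score) / span   # 0 = best fit, 1 = worst
        layout.append((title, near + t * (far - near)))
    return layout

# Hypothetical records with fit scores between 0 and 1.
layout = depth_layout([("A", 0.0), ("B", 1.0), ("C", 0.5)])
```

The returned list is ordered nearest first, which matches the order in which a user advancing along the axis would encounter the beads.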
The implicit and explicit user profiles are invoked by adding them to the search queries (the bin or string) just as done with other choices. The effect of adding the profile is to have results sorted according to the preferences. Explicit user profiles are generated in the same way.
The invention will be described in connection with certain preferred embodiments, with reference to the following illustrative figures so that it may be more fully understood. With reference to the figures, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of the preferred embodiments of the present invention only, and are presented to provide what is believed to be the most useful and readily understood description of the principles and conceptual aspects of the invention. In this regard, no attempt is made to show structural details of the invention in more detail than is necessary for a fundamental understanding of the invention, the description taken with the drawings making apparent to those skilled in the art how the several forms of the invention may be embodied in practice.