1. Technical Field
This invention relates to the field of data processing, and more particularly, to a method of describing information for retrieval and utilization.
2. Description of the Related Art
Conventional search engines allow users to locate data items matching particular search criteria from within a collection of data items. Typically, a search engine matches a user-specified query against an index of descriptors. Descriptors, sometimes referred to as metadata, provide a definition or description of data. Descriptors can be associated with data items, and thus, can be used to provide a searchable description for the data items with which the descriptors are associated. For example, the descriptors can be associated with various types of data items such as hardware profiles, data entries, multimedia files, documents, drawings, charts, spreadsheets, software objects, records, Web sites, Web pages, or any other electronic document and/or software component which may be part of a searchable collection.
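By way of illustration only, the matching of a query against an index of descriptors can be sketched as follows. This is a minimal sketch rather than the implementation of any particular search engine. The function names `build_index` and `search`, the single-word descriptors, and the rule that a data item must carry every query term are assumptions made for the example.

```python
from collections import defaultdict


def build_index(items):
    """Map each descriptor to the set of data item ids associated with it.

    `items` is a hypothetical mapping of item id -> list of descriptors.
    """
    index = defaultdict(set)
    for item_id, descriptors in items.items():
        for descriptor in descriptors:
            # Lowercase so that matching is case-insensitive (an assumption).
            index[descriptor.lower()].add(item_id)
    return index


def search(index, query):
    """Return the ids of data items associated with every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = None
    for term in terms:
        matches = index.get(term, set())
        # Intersect so that only items carrying all query terms remain.
        result = set(matches) if result is None else result & matches
    return result
```

Because the index maps each descriptor to many data items, and each data item can appear under many descriptors, the structure reflects the many-to-many association described above. Note that it records only whether an association exists, not how strongly a data item relates to a descriptor.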
Data items can be associated with numerous descriptors. Similarly, each descriptor can be associated with numerous data items. Descriptors commonly are specified as a single word or phrase. A single word or phrase, however, often does not capture the essence or provide a complete description of a data item. Associating the data item with more than one descriptor may not convey the degree to which the data item is related to each individual descriptor. In consequence, searching a data collection for data items matching a particular set of characteristics can be challenging.
User-specified queries typically employ selected keywords as search terms. The search terms can be weighted, thereby placing more emphasis on particular search terms. Some search techniques implicitly weight search terms by assigning significance based upon the position of the term within a user query. For example, the first term specified in the user query can be assigned the highest significance while the last term of the query may be assigned the lowest significance. Other search techniques allow the user to directly specify the weight of search terms.
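The implicit, position-based weighting described above can be sketched as follows. The linear decay from 1.0 for the first term down to 1/n for the last term is an assumption chosen for illustration; the text states only that earlier terms receive higher significance. The names `positional_weights`, `score`, and `rank` are likewise hypothetical.

```python
def positional_weights(terms):
    """Assign each query term a weight that decreases with its position.

    Assumed scheme: with n terms, the term at position i (0-based)
    receives (n - i) / n, so the first term gets 1.0 and the last 1/n.
    """
    n = len(terms)
    weights = {}
    for i, term in enumerate(terms):
        # Keep the weight of the earliest occurrence of a repeated term.
        weights.setdefault(term, (n - i) / n)
    return weights


def score(descriptors, weights):
    """Sum the weights of the query terms found among an item's descriptors."""
    return sum(w for term, w in weights.items() if term in descriptors)


def rank(items, query):
    """Return item ids ordered by descending score, omitting non-matches.

    `items` is a hypothetical mapping of item id -> set of lowercase descriptors.
    """
    weights = positional_weights(query.lower().split())
    scored = [(score(descs, weights), item_id) for item_id, descs in items.items()]
    scored = [(s, item_id) for s, item_id in scored if s > 0]
    # Sort by score (highest first), breaking ties by item id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored]
```

Under this scheme, reordering the same query terms changes the ranking, which illustrates why the outcome depends on how well the user judges the relative importance of the terms. A technique with directly specified weights would simply replace `positional_weights` with a user-supplied mapping.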
Research has shown, however, that users are not particularly skilled at determining the relative importance of query terms. One reason is that users frequently are not aware of the variety of descriptors which exist, or at least the extent of descriptors available, within a given document collection. Further, as most users seek information about unfamiliar subjects, users are likely to be unfamiliar with the terminology most suited and/or most often used in reference to the subject matter being searched. Hence, users may place great weight on terms which are irrelevant, and place little weight on terms which are highly relevant.