This invention relates to methods of and apparatus for refining descriptors, for example such as are used for retrieving data items from databases.
A major obstacle to the efficient retrieval of data is the way they are indexed, i.e. the way descriptors or keywords are selected for them. Currently there are two common ways of indexing:
1. An automatic indexing tool extracts words from text documents or recognizes forms and elements in images, videos, etc. This approach is based on artificial intelligence (AI) techniques and is subject to the limitations of that technology.
2. One or more people do the indexing manually after a close analysis of the data. This is usually accurate but relies on the vocabulary of the indexers and their perception of the data (which may be very subjective for images, for instance). It is also time-consuming.
Both of these techniques provide a set of indexing keywords or descriptors which are static, and which very often belong to a vocabulary that is inconsistent and limited. However, people querying the system in effect provide possible keywords in their queries. These keywords may not be existing descriptors, but they are relevant to the data. Currently this information is left unused and is forgotten once the user quits the system. As a result, if the indexing keywords are inappropriate, nothing can be done to improve them, even though some people may provide good indexing terms as they search.
If the commonly used terminology changes over time (for example, one technical term is superseded by another), it becomes necessary to redo all the indexing, which is undesirable, especially as databases grow ever larger.
Thus there is a need to describe data so that they will be more easily searchable by the majority of the community.
According to one aspect of the invention, a method of indexing data items to enable retrieval thereof comprises: (1) storing the data items in a complete form; (2) storing association relationships between the stored data items and descriptors associated with the stored data items; (3) receiving a search request from a user for selection of stored data items, wherein the request incorporates at least one descriptor; (4) sending the user a search result including a summary form of the stored data items selected in accordance with the search request; and (5) modifying the stored association relationship between the stored data items and the stored descriptors in response to the user responding to the sent search result by requesting the complete form of a selected data item in a search result. The modification includes adding and/or removing at least one descriptor to the stored association relationship for the selected data item for which the complete form was requested.
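Steps (1) to (5) of this aspect can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `DataItem`, `DescriptorIndex`, `search` and `request_complete` are hypothetical, matching is assumed to mean "shares at least one descriptor with the request", and the modification shown is only the adding case, in which the terms of the user's request become descriptors of the item whose complete form was requested.

```python
from dataclasses import dataclass, field


@dataclass
class DataItem:
    item_id: str
    summary: str    # summary form, sent in search results
    complete: str   # complete form, sent only when requested


@dataclass
class DescriptorIndex:
    # Step (1): data items stored in complete form, keyed by id.
    items: dict[str, DataItem] = field(default_factory=dict)
    # Step (2): association relationships, item id -> set of descriptors.
    descriptors: dict[str, set[str]] = field(default_factory=dict)

    def store(self, item: DataItem, initial_descriptors: set[str]) -> None:
        self.items[item.item_id] = item
        self.descriptors[item.item_id] = set(initial_descriptors)

    def search(self, query_terms: list[str]) -> list[tuple[str, str]]:
        """Steps (3) and (4): return (id, summary) for every item that
        shares at least one descriptor with the request (assumed rule)."""
        terms = set(query_terms)
        return [
            (item_id, self.items[item_id].summary)
            for item_id in sorted(self.items)
            if self.descriptors[item_id] & terms
        ]

    def request_complete(self, item_id: str, query_terms: list[str]) -> str:
        """Step (5): the user asks for the complete form of a result, so
        the terms of the request are added as descriptors (assumed policy)."""
        self.descriptors[item_id] |= set(query_terms)
        return self.items[item_id].complete


index = DescriptorIndex()
index.store(
    DataItem("doc1", "Notes on wireless LANs", "Full text of the notes."),
    {"wireless", "network"},
)

before = index.search(["wifi"])                  # "wifi" is not yet a descriptor
results = index.search(["wifi", "network"])      # matches via "network"
full = index.request_complete("doc1", ["wifi", "network"])
after = index.search(["wifi"])                   # "wifi" has now been learned
```

In this sketch a query term that was not an indexing descriptor ("wifi") becomes one as soon as a user who searched with it asks for the complete item, so a later search on that term alone finds the item.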
Another aspect of the invention relates to a method of indexing data items in a central facility accessible by plural users, wherein the data items being indexed enable retrieval thereof. The method comprises: (1) storing the data items in a complete form; (2) storing association relationships between the stored data items and descriptors associated with the stored data items; (3) receiving search requests from plural users for selection of stored data items, wherein the requests incorporate at least one descriptor associated with each data item; (4) sending users search results, wherein each search result includes a summary form of the stored data items selected in accordance with the search request; and (5) modifying the stored association relationship between the stored data items and the stored descriptors in response to the user responding to the sent search result by requesting the complete form of a selected data item in a search result. The modification includes adding and/or removing the at least one descriptor to the stored association relationship for the selected data item for which the complete form was requested.
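For a central facility shared by plural users, one possible refinement policy is sketched below in Python. The text above does not prescribe when a descriptor is added; the threshold `ADD_THRESHOLD`, the class `SharedIndex` and the method `record_complete_request` are assumptions made purely for illustration. Here a query term becomes a descriptor of an item only after a given number of distinct users have requested the complete form of that item following a search containing the term.

```python
from collections import defaultdict

# Hypothetical policy parameter: number of distinct users who must select
# an item via a term before the term is stored as a descriptor.
ADD_THRESHOLD = 2


class SharedIndex:
    def __init__(self) -> None:
        # Association relationships: item id -> set of descriptors.
        self.descriptors: dict[str, set[str]] = defaultdict(set)
        # Evidence gathered from users: (item id, term) -> set of user ids.
        self.supporters: dict[tuple[str, str], set[str]] = defaultdict(set)

    def record_complete_request(
        self, user_id: str, item_id: str, query_terms: list[str]
    ) -> None:
        """Called when a user requests the complete form of an item that
        appeared in the result of a search using query_terms."""
        for term in query_terms:
            users = self.supporters[(item_id, term)]
            users.add(user_id)  # repeat requests by one user count once
            if len(users) >= ADD_THRESHOLD:
                self.descriptors[item_id].add(term)


shared = SharedIndex()
shared.record_complete_request("alice", "doc1", ["wifi"])
shared.record_complete_request("alice", "doc1", ["wifi"])
after_one_user = set(shared.descriptors["doc1"])
shared.record_complete_request("bob", "doc1", ["wifi"])
after_two_users = set(shared.descriptors["doc1"])
```

Counting distinct users instead of raw requests is a design choice of this sketch; it keeps one user's repeated searches from reshaping an index that the whole community relies on.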
A further aspect of the invention relates to a search facility for retrieving data items in response to user requests, wherein the search facility comprises: (1) a store for (a) the data items and (b) association relationships between the stored data items and descriptors associated with the stored items; (2) a receiver for search requests from users of the search facility, wherein each search request includes a user selection of at least one of the stored data items and at least one descriptor associated with the user selected at least one stored data item; and (3) a processor coupled to the receiver for retrieving from the store the users' search results in response to the search request by the users. The retrieved search results include a summary form of the stored data items selected in accordance with the search request. The processor is arranged for modifying the stored association relationship between the stored data items and the stored descriptors in response to the user responding to the sent search result by requesting the complete form of a selected data item in a search result. The modification includes adding and/or removing the at least one descriptor to the stored association relationship for the selected data item for which the complete form was requested.