This invention relates generally to metadata searching in storage and retrieval systems and, more particularly, to an enhanced system and method for customizing metadata for different users operating in the same computing environment.
A common task across many domains is to retrieve information from a repository, e.g., a memory storage device such as a database. Information retrieval systems and methods have become more important given the sheer volume of data held in such repositories. Usually, the retrieval process starts when the user submits a query, and the repository management system then searches the repository based on keywords in the query and returns the matching records. This full-text search approach performs poorly where the amount of data is very large. Such repository management systems are also inclined to misunderstand the semantics of keywords that span multiple subject areas.
One improvement to information retrieval is to associate metadata that is relevant to the user's needs with data elements in the repository. Metadata is commonly defined as data about data. For instance, the metadata for a document may include information such as who wrote the document, when it was published, and what it specifically discusses. Metadata may therefore have clearer semantics than the underlying data and may include category information used to organize the data in the repository. Furthermore, relationships among different metadata items may be used to describe more complex semantics. A query on metadata is generally more effective at retrieving appropriate results than a full-text search, especially in areas where full-text search is difficult to apply, such as multimedia. However, with the rapid increase in the amount and complexity of metadata, effective search on metadata also becomes difficult.
In addition, different users may have different metadata usages in a distributed environment. Some users need performance but do not care about inter-concept relationships, e.g., glossaries. Others need rich relationships to guarantee high recall or to handle complex queries, even if performance may be slower, e.g., asset retrieval, searching for work requests, and assigning work requests to individuals or teams based on capabilities. In other words, there is a trade-off among the different objectives of users' queries, and the infrastructure must be configured to optimize based on user needs. For instance, the following objectives need to be optimized:
Performance: how quickly the data can be retrieved.
Precision: of the data that is retrieved, what fraction is relevant to the user's needs.
Recall: what fraction of the relevant data was retrieved.
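By way of illustration only, the precision and recall objectives listed above can be computed for a single query as in the following minimal Python sketch. The function name `precision_recall` and the record identifiers are hypothetical and are not taken from any particular repository management system; the convention of returning 0.0 when a set is empty is likewise an assumption made for this example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers of the records returned by the search.
    relevant:  identifiers of the records that actually satisfy the user's need.
    Both arguments and this function name are illustrative assumptions.
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # records that are both returned and relevant

    # Precision: fraction of the retrieved records that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: fraction of the relevant records that were retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical query result: four records returned, three truly relevant.
    retrieved_ids = ["doc1", "doc2", "doc3", "doc4"]
    relevant_ids = ["doc2", "doc4", "doc5"]
    p, r = precision_recall(retrieved_ids, relevant_ids)
    print(f"precision={p:.2f} recall={r:.2f}")
```

In this hypothetical example, two of the four retrieved records are relevant, so precision is 2/4, and two of the three relevant records were retrieved, so recall is 2/3. Returning more records may raise recall while lowering precision, which is one form of the trade-off described above.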
Existing metadata storage formats that are configured for performance do not support relationships. Additionally, ontology systems that do support relationships currently do not allow those relationships to be configured differently for different users of the same metadata system.
It would be desirable to provide an optimized system and method that addresses the aforesaid search requirements of different users who may have different metadata usages.
Particularly, it would be desirable to provide a system and method that customizes metadata for different users running on the same infrastructure to attain an effective search on metadata.
Moreover, it would be highly desirable to provide a system and method that supports the customization of different types of relationships in metadata to balance the various factors in a search according to different users' needs.