1. Technical Field
The embodiment herein generally relates to data management systems and methods and particularly relates to data optimization. The embodiment herein more particularly relates to a system and method for modeling relevant data by identifying relationships between various entities.
2. Description of the Related Art
A key challenge in managing data today is helping users locate relevant, actionable information quickly and easily. Current methods of information storage and retrieval require users to sift through many data sources before arriving at a solution.
Another aspect of the data management challenge is the proliferation of data sources. Even in regulated environments such as a corporation, the number of digital data sources has increased over the past two decades. Various attributes of the same information are generally present in different forms in different data sources. For example, information about a customer will be present as billing details in the finance database, as proposals and project documents in document management systems, and as conversations and updates in emails, chats, enterprise wikis, blogs and the like. The current methods for managing data do not provide the user with a comprehensive view of the activity on a piece of information and do not derive any actionable insights from the information.
In the existing techniques, integrating data from any two systems requires custom-made middleware, as it is impossible for the system to understand the content of the participating databases well enough to perform the required integration automatically. The use of a shared ontology to enable semantic interoperability of existing databases and other software is gaining acceptance. It is possible to enable communication between two systems by mapping the semantics of independently developed components to concepts in an ontology. The term ontology refers to a conceptual model describing the things in an application domain, encoded in a formal, mathematical language.
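By way of illustration only, the mapping of independently developed components to a shared ontology can be sketched as follows. The field names, the concept names (such as "Customer.name") and the helper to_ontology are hypothetical and are not taken from any particular system; the sketch merely shows two differently named schemas being re-keyed to common concepts so that their records become comparable.

```python
# Hypothetical mappings from each system's field names to shared ontology concepts.
FINANCE_TO_ONTOLOGY = {"cust_nm": "Customer.name", "inv_amt": "Invoice.amount"}
BILLING_TO_ONTOLOGY = {"client_name": "Customer.name", "billed_total": "Invoice.amount"}


def to_ontology(record, mapping):
    """Re-key a source record by the ontology concepts its fields map to.

    Fields with no entry in the mapping are dropped, since the ontology
    has no concept for them.
    """
    return {mapping[field]: value for field, value in record.items() if field in mapping}


# Illustrative records from two independently developed systems.
finance_row = {"cust_nm": "Acme Corp", "inv_amt": 1200}
billing_row = {"client_name": "Acme Corp", "billed_total": 1500, "internal_flag": "x"}

finance_view = to_ontology(finance_row, FINANCE_TO_ONTOLOGY)
billing_view = to_ontology(billing_row, BILLING_TO_ONTOLOGY)

# Concepts that both systems can now exchange without pairwise middleware.
shared_concepts = sorted(set(finance_view) & set(billing_view))
```

In such a sketch each system needs only one mapping to the ontology, rather than a separate custom mapping to every other system.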
There exists a technique, referred to as the semantic web, in which a web of data is created that can be processed by machines. This technique requires that, in a given set of data, all uniquely identifiable entities be understood, all relationships between entities be identified and described as an ontology, and the data then be captured in the RDF (Resource Description Framework) format before semantically rich information retrieval is possible.
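As a minimal sketch of what capturing data in this manner involves, RDF represents each statement as a subject-predicate-object triple in which entities and relationships are identified by URIs. The namespace http://example.org/, the predicate names and the helper match below are assumptions made purely for illustration; plain tuples are used in place of an actual RDF library.

```python
# Hypothetical namespace; every entity and relationship needs its own URI.
EX = "http://example.org/"

# Each statement is a (subject, predicate, object) triple.
triples = [
    (EX + "customer/acme", EX + "hasInvoice", EX + "invoice/42"),
    (EX + "customer/acme", EX + "hasProposal", EX + "doc/proposal-7"),
    (EX + "invoice/42", EX + "amount", "1200"),
]


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Everything asserted about the customer, across what were separate sources.
about_acme = match(triples, s=EX + "customer/acme")

# A relationship can only be queried if it was defined and captured beforehand.
conversations = match(triples, p=EX + "mentionedIn")
```

Here about_acme contains the two statements made about the customer, while conversations is empty because no "mentionedIn" relationship was defined and captured in advance.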
However, the above-described approach requires significant pre-processing of data before it is ready for semantic information retrieval. Identifying URIs and describing an ontology for even a well-defined environment is extremely tedious and expensive. Also, creating a generic semantic web incorporating all the digital data in the world is impossible with this approach. Further, creating semantic webs for well-defined and specialized domains such as pharmaceuticals or law is a time-consuming, expensive and effort-intensive activity.
Besides the obvious lack of scalability, cost effectiveness and versatility, the current approach to semantic information management suffers from the need for perpetual high maintenance. Since the RDF method requires a top-down, predetermined ontology, any change in data, or any addition of new data with hitherto undefined relationships, needs to be captured manually, thus adding to the cost of maintaining the semantic web.
Hence, there is a need to provide a data management method and system for modeling data by enhancing information relevancy. There is also a need for a data management system to provide highly contextual information sources to a user. Further, there exists a need for a data management system and method which involve minimal cost and less maintenance.
The abovementioned shortcomings, disadvantages and problems are addressed herein and will be understood by reading and studying the following specification.