An embodiment of the present invention relates to automatic ontology generation, and more specifically, to automatic ontology generation using a semantic network.
An ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that exist for a particular domain.
Important business information (such as customer information) is often stored redundantly in multiple repositories within an enterprise. For many types of projects, data held in these redundant repositories needs to be found, harmonized, and moved into a single target system. Examples of such projects include data warehousing, application consolidation toward Systems, Applications and Products (SAP) applications, and Master Data Management (MDM).
Master Data Management addresses customer data duplication by identifying the most important subset of data, called master data, and consolidating it in a single repository or a group of connected repositories. Large enterprises have hundreds of source systems with data that needs to be fed into a common target system such as a Master Data Management system. Analyzing the shape and meaning of data in legacy systems and mapping that data to a common ontology requires substantial manual effort and is error prone. There is thus a need to automatically build a common ontology for the domain of interest in these projects (e.g., master data, transactional data, etc.).
Building an ontology for an MDM repository conventionally starts with interviewing business stakeholders to determine the general concepts and types of entities present in the source systems. Once the high-level concepts have been determined, the typical approaches for constructing a detailed ontology are as follows.
Manually inspecting source system metadata, finding commonalities between business concepts and their properties, and manually constructing the desired ontology for the particular set of source systems.
Starting with a default ontology that includes the most frequently used entity types and attributes, and customizing it for a particular implementation by adding new entity types and properties and renaming default ones to represent business terms.
Starting with a detailed ontology of all entity types and attributes for a specific industry (for example, banking models or insurance models) and customizing it.
All of these approaches require a subsequent step of mapping the final ontology back to source systems, so that the new ontology can be used to extract data from the source to move it to a common target; for example, from many sources with master data into a centralized MDM repository.
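For illustration only, the customization and mapping steps described above can be sketched as a small data structure. This is a minimal sketch under stated assumptions, not a description of any particular MDM product: the class names (`EntityType`, `Ontology`), the method names, and the source system, table, and column names are all hypothetical and chosen solely for this example.

```python
from dataclasses import dataclass, field


@dataclass
class EntityType:
    """A business concept (e.g. Customer) and the names of its properties."""
    name: str
    properties: set = field(default_factory=set)


@dataclass
class Ontology:
    """Hypothetical ontology: entity types plus mappings back to source columns."""
    # entity type name -> EntityType
    entity_types: dict = field(default_factory=dict)
    # (entity, property) -> list of (source system, table, column)
    source_mappings: dict = field(default_factory=dict)

    def add_entity_type(self, name, properties):
        """Customize the ontology by adding a new entity type."""
        self.entity_types[name] = EntityType(name, set(properties))

    def rename_property(self, entity, old, new):
        """Rename a default property to a business term, keeping its mappings."""
        props = self.entity_types[entity].properties
        props.remove(old)  # raises KeyError if the property does not exist
        props.add(new)
        if (entity, old) in self.source_mappings:
            self.source_mappings[(entity, new)] = self.source_mappings.pop((entity, old))

    def map_to_source(self, entity, prop, system, table, column):
        """Record which source column holds the data for an ontology property."""
        if prop not in self.entity_types[entity].properties:
            raise KeyError(f"{entity} has no property {prop!r}")
        self.source_mappings.setdefault((entity, prop), []).append((system, table, column))

    def columns_for(self, system):
        """Return {(table, column): (entity, property)} for one source system."""
        return {
            (table, column): (entity, prop)
            for (entity, prop), locations in self.source_mappings.items()
            for (sys_name, table, column) in locations
            if sys_name == system
        }


# Start from a (hypothetical) default ontology and customize it.
ontology = Ontology()
ontology.add_entity_type("Customer", ["name", "address"])
ontology.add_entity_type("LoyaltyAccount", ["points"])
ontology.rename_property("Customer", "name", "legal_name")

# Map the final ontology back to two (hypothetical) source systems.
ontology.map_to_source("Customer", "legal_name", "crm", "CUST", "CUST_NM")
ontology.map_to_source("Customer", "legal_name", "billing", "ACCOUNTS", "HOLDER")
ontology.map_to_source("Customer", "address", "crm", "CUST", "ADDR_1")

print(ontology.columns_for("crm"))
```

In this sketch every mapping is entered by hand, once per property and per source system, which is the effort that grows with the number of source systems.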
Manually inspecting hundreds of source systems, understanding the meaning of the data in those systems, and manually mapping that data to a target ontology is very time consuming. As a result, most enterprises cannot afford to introduce MDM systems on an enterprise scale, or implement them for only a small subset of systems.