1. Technical Field
Embodiments of the present invention relate generally to a data integration support infrastructure, and in particular to a computer implemented method, data processing system, and computer program product for utilizing a unique identifier catalog system and a Resource Description Framework (RDF) system to uniquely identify data within an enterprise and query metadata to discover data relationships within the enterprise.
2. Description of the Related Art
Data within large organizations is typically organized vertically in what are commonly called silos. A silo is an information system within an organization that comprises one or more repositories and does not directly share data with other related systems in the organization. Each silo assigns and maintains data identifiers that are guaranteed to be unique only within the silo, and perhaps only within a specific data store. Data may migrate from one data store to another as new systems within a silo are brought online and old ones are retired, or as data ages and moves from active to archived data stores. Each information silo has its own process and business rules for data access and retrieval. Because an information silo is not capable of reciprocal operation with other related management systems in the organization, the ability to access and retrieve data across the enterprise is currently limited.
There are very few tools available which provide access to data across silos. One such tool, Information Integrator for Content (II4C), is a federated system which provides data integration in a content management environment. Information Integrator for Content provides simultaneous and federated search access to structured database records and unstructured content. However, implementation of existing tools such as Information Integrator for Content to access data across silos can also be problematic. These existing tools comprise complex, cumbersome components that are extremely sensitive to modification of backend systems and require extensive configuration, and ongoing maintenance of that configuration, to function properly. II4C also requires nicknames to map between databases. Federated databases provide some functionality but have problems when the same object or attribute is called by different names (e.g., “fnam” or “firstname” or “givenName” all being the same thing).
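The attribute-naming problem described above can be illustrated with a minimal sketch. The two in-memory "silos", their records, and the function naive_federated_search are hypothetical and are not drawn from II4C or any other product; they only show how a search that applies one attribute name to every store, with no name mapping, misses matching data.

```python
# Hypothetical records from two silos. The stores and values are illustrative
# assumptions; only the attribute names echo the examples in the text.
hr_silo = [{"id": 101, "fnam": "Ada"}]
crm_silo = [{"id": 101, "givenName": "Ada"}]


def naive_federated_search(stores, field, value):
    """Return every record whose `field` equals `value`.

    No mapping between attribute names is applied, so a store that uses a
    different name for the same attribute contributes no results.
    """
    return [rec for store in stores for rec in store if rec.get(field) == value]


# Searching on "fnam" finds the HR record but silently misses the CRM record,
# even though both stores hold the same first name.
hits = naive_federated_search([hr_silo, crm_silo], "fnam", "Ada")
print(len(hits))
```

Both hypothetical records also carry the identifier 101, which says nothing about whether they describe the same entity, since each silo guarantees uniqueness only within itself.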