1. Field of Invention
The present invention relates generally to database management and, more particularly, to a framework for representing metadata in a common access repository.
2. Description of the Background
A data warehouse seeks to gather information from disparate sources, organize it, and make it available to appropriate people within an organization. Just as any other kind of warehouse needs to keep an inventory of its holdings, a data warehouse needs to keep track of what data it is currently holding along with the pedigree of that data. A metadata repository can supplement a data warehouse in that it gives users additional information, in the form of metadata, about information assets stored in the data warehouse such as where the information came from, which rules were used in creating the information, and what the information elements mean.
A repository is an application that manages a wide variety of metadata from many sources, such as database management system (DBMS) catalogs (e.g., Oracle®), development tools like ERwin® and Vitria®, and programming language specific environments such as mainframe COBOL. More specifically, a metadata repository facilitates and supports the storage, use, and retrieval of metadata collected from various data warehouse applications, development projects, and legacy applications and makes that information available in an appropriate format to other tools. Repositories manage this metadata independently of other environments, without constraints to specific tools or databases. A repository differs from a “data dictionary” in that respect because the tools and databases associated with a data dictionary, such as the Oracle® system catalog and Oracle® Designer 2000, manage only the data dictionary information and nothing else.
A repository may contain three basic types of metadata: technical, business and environmental. Technical metadata (or “back room metadata”) describes how business data are mapped to an implementation structure. For example, technical metadata describes how a high level entity relationship (E/R) model is mapped onto a relational database management system (RDBMS) schema. Metadata of this type may include, for example, physical data models, copybooks, data definition language (DDL), or system catalogs. Business metadata (or “front-room metadata”) may describe business concepts. Metadata of this type may include logical data models, business rules, transformation rules, and glossaries. Environmental metadata includes statistics about a metadata object. Environmental metadata may include, for example, the date the scan was performed into the repository or the date the metadata object last changed, or what scan brought the object into the repository. This type of metadata may also be used to track statistics about levels of confidence that a particular scorecard would provide as it rates the quality of a data element for accuracy based on some predefined rules. For example, users may resort to environmental metadata to learn how frequently a particular table is updated, or when the last update or load occurred for a table.
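By way of illustration only, the three types of metadata described above might be modeled as tagged entries attached to a described object. The following is a minimal sketch in Python and is not taken from any particular repository product; the names MetadataKind, MetadataEntry, last_changed, and the attribute key "last_changed" are hypothetical and are introduced solely for this example.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class MetadataKind(Enum):
    """The three basic metadata types (hypothetical enum for illustration)."""
    TECHNICAL = "technical"          # e.g. physical data models, DDL, system catalogs
    BUSINESS = "business"            # e.g. logical data models, business rules, glossaries
    ENVIRONMENTAL = "environmental"  # e.g. scan dates, last-changed dates


@dataclass
class MetadataEntry:
    """One fact about a described object, such as a table (hypothetical structure)."""
    subject: str                     # name of the object being described
    kind: MetadataKind
    attributes: dict = field(default_factory=dict)


def last_changed(entries, subject):
    """Return the latest 'last_changed' date recorded for a subject, or None.

    Only environmental entries are consulted, mirroring the use case in which a
    user asks when the last update or load occurred for a table.
    """
    dates = [
        e.attributes["last_changed"]
        for e in entries
        if e.kind is MetadataKind.ENVIRONMENTAL
        and e.subject == subject
        and "last_changed" in e.attributes
    ]
    return max(dates) if dates else None


if __name__ == "__main__":
    entries = [
        MetadataEntry("CUSTOMER", MetadataKind.TECHNICAL, {"ddl": "CREATE TABLE CUSTOMER (...)"}),
        MetadataEntry("CUSTOMER", MetadataKind.BUSINESS, {"glossary": "A party that buys products"}),
        MetadataEntry("CUSTOMER", MetadataKind.ENVIRONMENTAL, {"last_changed": date(2001, 3, 1)}),
        MetadataEntry("CUSTOMER", MetadataKind.ENVIRONMENTAL, {"last_changed": date(2001, 5, 9)}),
    ]
    print(last_changed(entries, "CUSTOMER"))
```

In this sketch, environmental metadata is kept as separate entries rather than as fields on the technical or business entries, so that several scans of the same object can each leave their own record.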
The first major design issue in developing a metadata repository is to develop a process that transforms the metadata into information about the organization. A metadata framework is the final stage of the transformation process that makes the information useful from a user perspective. Metadata by itself, without such a framework, is simply a collection of facts about a process or an application that does not carry much meaning. Thus, a database model or metadata framework is needed that enables organizations to create a common access repository. The framework needs to provide sufficient flexibility to model data stored throughout the enterprise, including data stored on legacy systems. The framework also needs sufficient flexibility to model data in a fixed state and in transition. Such a framework should also be coupled with a simple and user-friendly interface. The framework must also provide some indication of the relative importance of a particular state. Finally, the framework should contain common definitions of terms such as “customer,” “payment,” and “product.”
The present invention is directed to a framework for representing metadata in a common access repository. According to one embodiment, wherein the metadata is loaded into the repository from a source system, the framework includes a first scanning module for scanning the source system for a first set of metadata that describes a first state; a loading module for loading the first set of metadata into the repository; a first state description for the first set of metadata in a fixed state; and a user interface for accessing the metadata.
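By way of illustration only, the elements recited above (a scanning module, a loading module, a state description, and an interface for accessing the metadata) might be arranged as in the following minimal Python sketch. This is not the implementation of any embodiment; the names StateDescription, Repository, scan, load, and lookup are hypothetical, and the source system is assumed, purely for the example, to be a mapping from element names to descriptions.

```python
from dataclasses import dataclass, field


@dataclass
class StateDescription:
    """Metadata describing one state of the source system (hypothetical structure)."""
    name: str
    fixed: bool = True                       # True for a fixed state, False for a transition
    metadata: dict = field(default_factory=dict)


def scan(source_system, state_name):
    """Scanning module: collect the metadata that describes one state.

    Assumption: source_system is a dict mapping element names to descriptions.
    The dict is copied so later changes to the source do not alter the scan.
    """
    return StateDescription(name=state_name, fixed=True, metadata=dict(source_system))


class Repository:
    """Common access repository holding state descriptions keyed by name."""

    def __init__(self):
        self._states = {}

    def load(self, state):
        """Loading module: store a scanned state description."""
        self._states[state.name] = state

    def lookup(self, state_name, element):
        """Access interface: return the description of an element, or None."""
        state = self._states.get(state_name)
        return None if state is None else state.metadata.get(element)


if __name__ == "__main__":
    source = {"CUSTOMER.ID": "primary key of the customer table"}
    repo = Repository()
    repo.load(scan(source, "first state"))
    print(repo.lookup("first state", "CUSTOMER.ID"))
```

The lookup method stands in for the user interface only in the narrowest sense; an actual interface for end users would sit on top of such an access layer.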
The framework of the present invention may be used to represent metadata in a common access repository. For example, the present invention may be used in conjunction with data warehouse or enterprise level database services, which store information assets typically without any information about those assets. The present invention provides a framework for representing metadata that describes, for example, where the data came from, which rules were used in creating the data, and what the data elements mean. Thus, the present invention helps derive more value from existing information assets by exploiting metadata.
In addition to database services, benefits of the present invention may also be realized in business applications. For example, the present invention allows business users to proactively assess the impact of a change throughout an organization by incorporating business rules, data structures, programs, and other organizational information into the metadata architecture.
These and other benefits of the present invention will be apparent from the detailed description below.