Metadata is data about data. That is, metadata defines or describes primary data. Examples of metadata include the field names of a database table, mappings between or within database tables, legends of database tables, hierarchies of database tables, joins associated with database tables, and the like. Of course, metadata is not restricted to database applications, because virtually all electronic applications consume primary data that is associated with metadata. For example, an electronic mail (email) message may have a time of day or calendar day indicating when the message was sent or created, and this time of day or calendar day can be viewed as metadata.
In some cases, the metadata actually drives the processing of an application. For example, in database applications the metadata defines how and where primary data is acquired. Essentially, the metadata is populated as values (e.g., table names, keys, field names, etc.) within a search operation, such as an SQL search. These values then drive the search and the corresponding results (primary data) produced by the search.
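By way of illustration only, the following minimal sketch shows how metadata values may be populated into an SQL search that then returns primary data. The table name, field names, key, and the helper build_search are hypothetical and are assumed here purely for the example; an in-memory SQLite database stands in for a data store.

```python
import sqlite3

# Hypothetical metadata (assumed for illustration): a table name,
# the fields to return, and the key used to locate primary data.
metadata = {"table": "customers", "fields": ["name", "city"], "key": "customer_id"}

def build_search(meta, key_value):
    """Populate an SQL SELECT from metadata values.

    Identifiers come from the metadata; the key value is bound as a parameter.
    """
    columns = ", ".join(meta["fields"])
    sql = f"SELECT {columns} FROM {meta['table']} WHERE {meta['key']} = ?"
    return sql, (key_value,)

# A small in-memory data store holding the primary data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'Dayton')")

# The metadata drives the search; the rows returned are the primary data.
sql, params = build_search(metadata, 1)
rows = conn.execute(sql, params).fetchall()
```

In this sketch, changing only the metadata dictionary changes which table and fields the search touches, which is the sense in which the metadata drives the search and its results.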
Many organizations have large databases or data warehouses with a variety of database tables and field names (metadata). This is particularly true for retail organizations that engage in Customer Relationship Management (CRM), where relationships between customers, products, services, stores, and the like are maintained for the entire organization. These organizations create, manage, and maintain large amounts of primary data and metadata within their databases or data warehouses. Maintaining and supporting such large amounts of primary data and metadata can be daunting.
In fact, when these organizations create new mappings or views into their data store for purposes of mining or reporting new desired features or trends, the associated development effort becomes time and resource intensive. Typically, business analysts and database administrators team up to create the new mappings or views into the data warehouse. In some instances, the hierarchies and mappings that need to be produced are themselves voluminous and complex. Once development is done, there is no efficient way for the analysts and database administrators to validate the new mappings, short of manually generating each possible combination of searches that the new mappings may generate with the metadata and then processing each possible search.
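To make the cost of such exhaustive checking concrete, the following sketch enumerates every combination of mapped fields per table and processes each resulting search against a data store. It is only an illustration under stated assumptions: the mapping dictionary, the table layouts, and the helpers candidate_searches and find_invalid are hypothetical, and SQLite stands in for a data warehouse.

```python
import itertools
import sqlite3

# Hypothetical mapping metadata (assumed for illustration): the fields
# that a new mapping claims exist in each table.
mappings = {
    "customers": ["customer_id", "name"],
    "stores": ["store_id", "region"],  # "region" is deliberately wrong
}

def candidate_searches(meta):
    """Yield one SELECT per table and non-empty combination of its mapped fields."""
    for table, fields in meta.items():
        for size in range(1, len(fields) + 1):
            for combo in itertools.combinations(fields, size):
                yield f"SELECT {', '.join(combo)} FROM {table} LIMIT 0"

def find_invalid(conn, meta):
    """Process every candidate search and collect those the data store rejects."""
    invalid = []
    for sql in candidate_searches(meta):
        try:
            conn.execute(sql)
        except sqlite3.OperationalError as exc:
            invalid.append((sql, str(exc)))
    return invalid

# The actual schema: "stores" has a "city" column, not "region".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE stores (store_id INTEGER, city TEXT)")

invalid = find_invalid(conn, mappings)
```

Even in this toy case a table with n mapped fields yields 2^n - 1 field combinations, before joins and hierarchies are considered, which suggests why constructing and processing every possible search by hand quickly becomes impractical.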
This manual construction of possible searches is often neither feasible nor practical given limited time, limited human resources, and limited processing resources. Consequently, errors in defining the metadata often go undetected until a user attempts an operation that generates an invalid search. When this occurs, an entire problem resolution process is followed until the error is properly located and fixed. This iterative process of problem resolution is not an ideal or desired situation, but is one that is conventionally deployed for large metadata applications, particularly applications associated with large data warehouses.
Therefore, there is a need for improved techniques for automatically validating metadata.