1. Technical Field
The present invention relates generally to data processing, and more particularly to automatic consistent sampling to enable improved matching, discovery of primary key-foreign key relationships and value overlaps in databases.
2. Discussion of Related Art
In today's global economy, the ability of an enterprise to efficiently store, update, and use information can be critical to the enterprise's ability to serve its customers and compete in the marketplace. This information is often stored in databases, in the form of database objects such as data sets, tables, indices, or stored queries. The database objects may be generated by and/or received from multiple business units, and may be stored in a variety of storage devices located in multiple locations. These storage devices may include relational databases that store the database objects as tables of data. The relationships between data stored in various tables may be constrained by using primary and foreign keys, which establish and enforce links between data stored in multiple tables, thereby linking information together and supporting database normalization. Primary and foreign keys may be identified manually (e.g., by a user) or automatically; however, in a very large data set, identifying and resolving foreign key constraints automatically may be highly resource-intensive.
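By way of illustration only, the following minimal sketch shows how a foreign key may link two tables and how a naive check of that link can be expressed. It assumes an in-memory SQLite database accessed through Python's standard sqlite3 module; the table names (customers, orders), the column names, and the helper functions are hypothetical and are not part of any particular system described herein.

```python
import sqlite3


def build_demo():
    """Create two hypothetical tables linked by a primary key-foreign key pair."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign key constraints only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, "
        "customer_id INTEGER REFERENCES customers(customer_id))"
    )
    conn.execute("INSERT INTO customers VALUES (1, 'Acme')")
    conn.execute("INSERT INTO orders VALUES (100, 1)")
    return conn


def try_insert_order(conn, order_id, customer_id):
    """Return True if the row is accepted, False if the constraint rejects it."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, customer_id))
        return True
    except sqlite3.IntegrityError:
        return False


def orphan_count(conn):
    """Naive containment check: count orders whose customer_id matches no customer."""
    row = conn.execute(
        "SELECT COUNT(*) FROM orders o "
        "LEFT JOIN customers c ON o.customer_id = c.customer_id "
        "WHERE c.customer_id IS NULL"
    ).fetchone()
    return row[0]
```

In this sketch, an order that references an existing customer is accepted, while an order that references a nonexistent customer is rejected by the declared constraint. When no constraint has been declared, a containment check such as orphan_count must compare the values of one column against those of another, and repeating such a comparison for every candidate pair of columns in a very large data set is one reason automatic identification may be costly.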