This invention relates to data profiling.
Stored data sets often include data for which various characteristics are not known beforehand. For example, ranges of values or typical values for a data set, relationships between different fields within the data set, or functional dependencies among values in different fields, may be unknown. Data profiling can involve examining a source of a data set in order to determine such characteristics. One use of data profiling systems is to collect information about a data set that is then used to design a staging area into which the data set is loaded before further processing. Transformations necessary to map the data set to a desired target format and location can then be performed in the staging area based on the information collected during the data profiling. Such transformations may be necessary, for example, to make third-party data compatible with an existing data store, or to transfer data from a legacy computer system into a new computer system.
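By way of illustration only, the kinds of characteristics mentioned above (value ranges, typical values, and functional dependencies between fields) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of any particular profiling system: the function names `profile_field` and `functionally_determines`, the record layout, and the sample values are all hypothetical and chosen only for the example.

```python
from collections import Counter


def profile_field(records, field):
    """Summarize one field: non-null count, value range, and most common value."""
    values = [r[field] for r in records if r.get(field) is not None]
    if not values:
        return {"count": 0, "min": None, "max": None, "most_common": None}
    most_common_value, _ = Counter(values).most_common(1)[0]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "most_common": most_common_value,
    }


def functionally_determines(records, lhs, rhs):
    """Return True if every value of `lhs` maps to exactly one value of `rhs`."""
    seen = {}
    for r in records:
        key, val = r[lhs], r[rhs]
        if key in seen and seen[key] != val:
            return False  # same lhs value paired with two different rhs values
        seen[key] = val
    return True


# Hypothetical sample records, invented for this illustration.
records = [
    {"zip": "02139", "state": "MA", "amount": 12},
    {"zip": "02139", "state": "MA", "amount": 40},
    {"zip": "10001", "state": "NY", "amount": 12},
    {"zip": "94105", "state": "CA", "amount": 7},
]

amount_profile = profile_field(records, "amount")
zip_determines_state = functionally_determines(records, "zip", "state")
amount_determines_state = functionally_determines(records, "amount", "state")
```

In this sample, the `amount` field ranges from 7 to 40 with 12 as its most common value, `zip` determines `state`, and `amount` does not determine `state`. Information of this kind is what a staging area design could draw on, for example when choosing field types or deciding which fields can be normalized into a separate table.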