This description relates to managing data profiling operations related to data type.
Databases or other information management systems often include datasets for which various characteristics may not be known. For example, ranges of values or typical values for a dataset, relationships between different fields within the dataset, or functional dependencies among values in different fields may be unknown. Data profiling can involve examining a dataset in order to determine such characteristics. Some techniques for data profiling include receiving information about a data profiling job, running the data profiling job, and then returning a result after a delay that depends on how long it takes to perform the various processing steps involved in the data profiling. One of the steps that may involve significant processing time is “canonicalization,” which involves changing the data types of the values appearing within the records of a dataset to a predetermined or “canonical” data type to facilitate additional processing. For example, canonicalization may include converting values to a human-readable string representation.
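By way of illustration only, canonicalization followed by a simple profiling step may be sketched as follows. The sketch assumes records held as Python dictionaries and a string canonical type; the function names `canonicalize` and `profile_field`, the sample field names, and the specific conversion rules are hypothetical and are not taken from this description.

```python
from collections import Counter
from datetime import date
from decimal import Decimal


def canonicalize(value):
    """Convert a value of a supported type to a human-readable string.

    The conversion rules here are illustrative assumptions only.
    """
    if value is None:
        return ""  # assumption: a missing value maps to the empty string
    if isinstance(value, bytes):
        return value.decode("utf-8")  # assumption: bytes hold UTF-8 text
    if isinstance(value, date):
        return value.isoformat()  # ISO 8601 text for dates and datetimes
    return str(value)  # default string form for numbers, strings, etc.


def profile_field(records, field):
    """Count occurrences of each canonicalized value in one field."""
    return Counter(canonicalize(record[field]) for record in records)


# Hypothetical sample dataset with fields of mixed data types.
records = [
    {"id": 1, "amount": Decimal("10.50"), "opened": date(2020, 1, 31)},
    {"id": 2, "amount": Decimal("10.50"), "opened": date(2020, 2, 1)},
    {"id": 3, "amount": Decimal("7.25"), "opened": date(2020, 1, 31)},
]

amount_counts = profile_field(records, "amount")
opened_counts = profile_field(records, "opened")
```

In this sketch every value is converted on each profiling pass, which indicates why canonicalization can account for a significant share of the processing time when a dataset contains many records or many fields.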