Data quality improvement is achieved through data cleansing, which typically has four stages: investigation, standardization, de-duplication, and survivorship. In the standardization stage, data is transformed into a standard, uniform format. This involves segmenting the data, canonicalization, correcting spelling errors, enrichment, and other cleansing tasks, all carried out using rule sets. Different rule sets need to be created for data from different domains. However, creating data standardization rules is an expensive task that can easily take weeks, if not months, of effort. Thus, there is a need to control the cost of creating data standardization rules.
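
To make the role of rule sets concrete, the following is a minimal sketch of rule-driven standardization, not the method of any particular cleansing tool. The rule-set layout, the `standardize` function, and the two example rule sets (`ADDRESS_RULES`, `NAME_RULES`) are all hypothetical and chosen only for illustration. It covers segmentation, spelling correction, and canonicalization; enrichment is omitted.

```python
import re

# Hypothetical rule sets (assumptions for illustration). Each domain gets its
# own tables: "spelling" fixes known misspellings, "canonical" expands
# abbreviations to a standard form.
ADDRESS_RULES = {
    "spelling": {"stret": "Street", "avenu": "Avenue"},
    "canonical": {"st": "Street", "rd": "Road", "ave": "Avenue", "dr": "Drive"},
}
NAME_RULES = {
    "spelling": {},
    "canonical": {"dr": "Doctor", "prof": "Professor"},
}


def standardize(value, rules):
    """Return `value` rewritten in a uniform format using one rule set."""
    # Segmentation: split the raw string into alphanumeric tokens,
    # discarding punctuation and repeated whitespace.
    tokens = re.findall(r"[A-Za-z0-9]+", value)
    result = []
    for token in tokens:
        key = token.lower()
        if key in rules["spelling"]:
            # Spelling correction: replace a known misspelling.
            result.append(rules["spelling"][key])
        elif key in rules["canonical"]:
            # Canonicalization: expand an abbreviation to its standard form.
            result.append(rules["canonical"][key])
        else:
            # Default: normalize the letter case of unrecognized tokens.
            result.append(token.capitalize())
    return " ".join(result)


address = standardize("12  main ST.", ADDRESS_RULES)
name = standardize("dr jane doe", NAME_RULES)
```

In this sketch the same token, `dr`, expands to "Drive" under the address rules and to "Doctor" under the name rules, which illustrates why a rule set written for one domain cannot simply be reused for another.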