This description relates to specifying and applying rules to data.
Many modern applications, including business applications, process large sets of data (i.e., “datasets”), which can be compiled from various sources. The various sources that provide data to the dataset can have different levels of data quality. To ensure that the applications function properly, an adequate level of data quality in the dataset should be monitored and/or maintained. To monitor or maintain an adequate level of data quality, the dataset can be processed by a data validation system. Such a system applies validation rules to the dataset before it is provided to the application. In some examples, the data validation system uses the results of validation rules to calculate a measure of data quality and alert an administrator of the application if the measure of data quality falls below a predetermined threshold. In other examples, the data validation system includes modules for handling data that fails one or more of the validation rules. For example, the data validation system can discard or repair data that fails one or more of the validation rules.
In general, the validation rules applied by the data validation system are defined by a user or administrator of the data validation system.
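As a non-limiting illustration, the behavior described above can be sketched in Python. The sketch rests on several assumptions that do not come from this description: the names `ValidationRule` and `validate` are hypothetical, a record is modeled as a dictionary, the measure of data quality is taken to be the fraction of records that pass every rule, and failing records are set aside rather than repaired. Other measures and other handling strategies are equally possible.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Assumption: a record is a plain mapping from field name to value.
Record = Dict[str, Any]


@dataclass
class ValidationRule:
    """Hypothetical user-defined rule: a name plus a pass/fail check."""
    name: str
    check: Callable[[Record], bool]  # returns True when the record passes


def validate(
    dataset: List[Record],
    rules: List[ValidationRule],
    threshold: float,
) -> Tuple[List[Record], List[Tuple[Record, List[str]]], float, bool]:
    """Apply every rule to every record.

    Returns the passing records, the failing records paired with the
    names of the rules they failed, the quality measure, and an alert flag.
    """
    passed: List[Record] = []
    failed: List[Tuple[Record, List[str]]] = []
    for record in dataset:
        broken = [rule.name for rule in rules if not rule.check(record)]
        if broken:
            failed.append((record, broken))  # set aside instead of forwarding
        else:
            passed.append(record)

    # Assumed measure: share of records that pass all rules.
    quality = len(passed) / len(dataset) if dataset else 1.0
    alert = quality < threshold  # True means the administrator should be notified
    return passed, failed, quality, alert


# Illustrative rules and data (invented for this sketch).
rules = [
    ValidationRule("name_present", lambda r: bool(r.get("name"))),
    ValidationRule(
        "age_non_negative",
        lambda r: isinstance(r.get("age"), int) and r["age"] >= 0,
    ),
]
dataset = [
    {"name": "Ada", "age": 36},
    {"name": "", "age": 41},
    {"name": "Bob", "age": -5},
    {"name": "Eve", "age": 29},
]

passed, failed, quality, alert = validate(dataset, rules, threshold=0.9)
```

In this sketch only the records in `passed` would be provided to the application. Two of the four illustrative records fail a rule, so `quality` is 0.5, which is below the assumed threshold of 0.9, and `alert` is set.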