A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. The following notice applies to the software and data as described below and in the drawing hereto: Copyright(copyright) 1998, Microsoft Corporation, All Rights Reserved.
This invention relates generally to database applications, and more particularly to internally validating databases.
Relational databases are used by many of the software applications currently available. An important aspect in the use of these databases is ensuring that the database is valid and contains high quality data.
Database engines typically have mechanisms designed to maintain the validity of the database. One way a typical database engine maintains validity is to ensure that the data entered for a column meets the criteria defined by the data type of the column. For example, a database engine will reject a value containing alphabetic characters for a column defined as containing integer.
Another way a database commonly maintains validity is by maintaining what is known in the art as referential integrity. Referential integrity is maintained by ensuring that rows in a target table identified by foreign key values in a source table actually exist. In other words, a source table is not allowed to have foreign key values that refer to a nonexistent row in a target table.
While the mechanisms discussed above are a step toward maintaining quality data in the database, several problems remain. First, it is commonly the case that the data in the database must meet application-specific criteria. For example, certain columns typically must contain values that are within a minimum and maximum range relevant to the software application. While the data type specified for a column will generally have a minimum and maximum value, this value is typically determined by the data structure used to represent the value, and is the same for all applications using the database engine. There is typically no way to specify an application-specific range for a column.
Second, it is common for the valid values for a column to belong to a relatively limited set, or for the values to be excluded from a set. There is generally no way for a database engine to validate a column against a set of values.
Third, many times a column contains data having a specific syntax, such as a file name, version number or system identifier. Because the column must be specified using one of the native data types supported by the database engine, there is no way for the engine to provide syntax-specific validation.
Fourth, referential integrity checks are typically performed when a row is being inserted, updated or deleted. The foreign keys in the source row must refer to a currently existing target row in the database. However, the check is made when the database operation is performed. Typically, there is no way to validate the row prior to the operation to give the user a chance to correct the data.
Finally, the validation performed by a database engine occurs when the user attempts to insert a row into the database. The row as a whole must be valid; there is no way for the user to find and correct individual data values for columns having an invalid value before inserting or updating the row.
Thus there is a need for a database validation system that provides for a more flexible range checking mechanism and that provides for validating a value against a set of values. The system should also provide for validating columns containing data having a common syntax, such as a file name, version number, or system identifier. In addition, the system should be able to validate individual columns of a candidate row before the row is inserted, updated, or deleted in the database. Finally, the system should provide for easy modification of the validation criteria to accommodate changes to table definitions and modifications to the validation criteria itself.
The above-identified problems, shortcomings and disadvantages with the prior art, as well as other problems, shortcoming and disadvantages, are solved by the present invention, which will be understood by reading and studying the specification and the drawings. One aspect of the invention is a system that comprises a database having a data table and a validation table. The data table has at least one data column and at least one data row and the validation table has a plurality of validation columns and a validation row. Also included is a database engine module for maintaining the database and a database validation module operative to read a validation value from the validation column of the validation row and using the validation value to validate a data value in the data column of the data row.
A further aspect of the invention is a validation table that contains columns used to provide validation constraints applied to column data contained in rows of data tables in the database. The constraints include range bounds, set membership, and foreign key relationship checks. The tables and columns to be validated are specified within the validation table.
As part of the invention, validation methods are executed by a validation module contained within the database engine maintaining the database. In an alternative embodiment, the validation module is a separate component.
The invention provides advantages not found in prior systems. Validation parameters are easy to maintain because they reside in a table that can be edited. Changes and additions to the validation parameters do not require altering the database table definition or other database system parameters, thereby avoiding the possibility of having to rebuild and repopulate the entire database. Validation parameters can be added easily as new tables and columns are added to the database.
The methods of the invention allow for checking individual columns within a row, the whole row, and the whole row against the database. Also, the methods associated with the invention can check for referential integrity prior to the insertion, modification or deletion of a row.
The invention includes systems, methods, computers, and computer-readable media of varying scope. Besides the embodiments, advantages and aspects of the invention described here, the invention also includes other embodiments, advantages and aspects, as will become apparent by reading and studying the drawings and the following description.