1. Technical Field
The present invention relates generally to an improved data processing system. In particular, the present invention provides a method, system, and computer instructions for defining and enforcing strong-typing among domains in a database management system.
2. Description of Related Art
Data types in various programming languages or database systems divide data into high-level categories, such as, for example, numeric, character string, date-time, and binary string. A database management system, such as a relational database management system (RDBMS) or an object-relational database management system (ORDBMS) (e.g., DB2 or Oracle), is used to organize, store, and retrieve this data by accepting requests for data from an application program and then instructing the operating system to transfer the appropriate data. Database management systems also are used to ensure the security and integrity of data in the database. Current RDBMS or ORDBMS provide built-in data types, such as integer, character, or timestamp, to allow for dividing data into data types for further granularity. However, these built-in data types are limited to a handful of categories and are fixed for a given RDBMS release.
Due to the inflexibility and limited number of built-in data types in RDBMS, these built-in data types suffer from a number of drawbacks. For instance, these data types do not adequately solve the business problem of differentiating data into data domains (e.g., separating dollars from pounds, miles from kilometers, etc.). A domain is a valid, complete set of values for an entity's attribute (i.e., column). Domains provide the context of business meaning and the usage of data. For example, a value of 1 U.S. dollar belongs to a different domain than a value of 1 U.K. pound, even though both are stored as the integer value 1. Thus, dollars and pounds belong to different data domains, which have their own business definitions. The built-in data types also cause data ambiguity, as both dollars and pounds are represented as numbers even though they are different monetary units.
Another drawback to the RDBMS and ORDBMS built-in data types is that use of the built-in data types may compromise data integrity and quality if there are no other reliable means to enforce strong-typing among domains. Strong-typing is a process that guarantees that functions and operations can only work on compatible data types or domains. For example, the process that prohibits direct comparison between U.S. dollars and U.K. pounds is strong-typing. However, strong-typing cannot be enforced among domains that share the same built-in data type in database systems. Thus, even though the direct operations between dollars and pounds in the above example are not meaningful, database systems cannot prohibit the direct arithmetic operations between these two domain types if they both are defined as numeric data types. Consequently, developers have no way to identify whether they have made errors, and database administrators have no way to know whether the data is clean.
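By way of illustration only, the following minimal sketch shows how two columns of the same built-in numeric type can be added and compared without any objection from the database. It uses Python's standard sqlite3 module with an in-memory SQLite database as a stand-in for an RDBMS; the table name prices and the column names usd_amount and gbp_amount are hypothetical and do not come from any particular system.

```python
import sqlite3


def mixed_currency_operations():
    """Add and compare a dollar column and a pound column of the same built-in type."""
    conn = sqlite3.connect(":memory:")
    # Both columns use the same built-in numeric type, so the database
    # has no notion that they belong to different business domains.
    conn.execute("CREATE TABLE prices (usd_amount INTEGER, gbp_amount INTEGER)")
    conn.execute("INSERT INTO prices VALUES (1, 1)")
    # Neither the addition nor the comparison is meaningful in business
    # terms, yet both statements execute without error.
    row = conn.execute(
        "SELECT usd_amount + gbp_amount, usd_amount = gbp_amount FROM prices"
    ).fetchone()
    conn.close()
    return row


print(mixed_currency_operations())  # (2, 1): the sum is 2 and the comparison is true
```

The query succeeds and reports that 1 dollar equals 1 pound, which is the kind of silent error that strong-typing among domains is meant to prevent.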
In contrast to those built-in data types shipped with RDBMS, user-defined types (UDTs), or abstract data types (ADTs), are data types that are defined by users. User-defined functions (UDFs) are functions that are defined by users, in contrast to those built-in functions shipped with RDBMS. UDTs in conjunction with UDFs in certain object-relational databases provide users with more options to granulate the data types into data domains and enforce strong-typing among domains.
Although UDTs and UDFs provide further granularity of data types and enforce strong-typing among domains, these UDTs and UDFs also suffer from numerous shortcomings. Creation and alteration of UDTs and UDFs are cumbersome and inflexible since creation of UDFs is required for domain operations. This inflexibility creates problems when business rules behind the domains change constantly and require on-demand domain creation. In addition, the built-in functions in RDBMS cannot be used on UDTs directly. For example, plus or minus operations must be explicitly defined for UDTs. Although the restrictions on usage of functions are sometimes needed, the effort of re-creating functions similar to built-in functions is duplicative and does not take advantage of mature RDBMS capabilities.
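The burden described above can be sketched by analogy in Python. This is not SQL and not the UDT mechanism of any particular RDBMS; it is a minimal illustration, under the assumption that a user-defined type behaves like a class that inherits no arithmetic from its underlying integer. The class names USDollars and UKPounds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class USDollars:
    amount: int

    # The "plus" operation has to be written by hand for this type,
    # even though it merely repeats built-in integer addition.
    def __add__(self, other):
        if not isinstance(other, USDollars):
            return NotImplemented  # refuse to mix domains
        return USDollars(self.amount + other.amount)


@dataclass(frozen=True)
class UKPounds:
    amount: int
    # No __add__ is defined here, so pounds cannot be added at all
    # until someone writes the equivalent function again.


print(USDollars(1) + USDollars(2))  # USDollars(amount=3)

try:
    USDollars(1) + UKPounds(1)
except TypeError:
    print("mixing dollars and pounds is rejected")

try:
    UKPounds(1) + UKPounds(1)
except TypeError:
    print("pounds have no addition until it is defined")
```

The sketch obtains strong-typing between the two domains, but only at the cost of re-implementing, for every new domain, an operation that the built-in numeric type already provides.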
Furthermore, the programming interface to use UDTs is cumbersome, as developers are required to memorize allowable UDFs and cast UDTs during SQL coding. UDTs and UDFs add a heavy burden to deployment and administration of applications and RDBMS.
Although applications may also be used to enforce domain strong-typing and valid business rules in many situations, these domain enforcement methods also have severe limitations. Enforcing domain strong-typing in applications requires a great deal of programming and coordination effort, especially when numerous applications and a large number of developers are involved. The strong-typing and business rules enforcement can become inconsistent depending on how well developers know the rules. With regard to change management, this enforcement is also very difficult since business logic is spread across various places in different applications. In addition, a distributed approach to enforcing domain strong-typing does not allow asset reuse. For example, a reporting system cannot take advantage of an existing application's checking and enforcement logic.
Therefore, it would be advantageous to have a centralized method for providing proper domain support in RDBMS or ORDBMS to allow flexible, easy, and quick domain creation, as well as to facilitate easy implementation and change control on domains.