1. Field of the Invention
The present invention relates generally to computerized data processing systems and methods, and more specifically, to computerized database systems and methods wherein data from object-oriented data models are mapped into relational data models optimized for very large and complex databases. The present invention finds general utility in areas of information storage where the data is complex or rapidly changing. The present invention finds particular utility in the field of processing, storing, and retrieving health care-related data (and will be described in connection with such utility) in and from, respectively, very large database systems (e.g., systems having storage on the order of 1000 gigabytes or more), although other utilities are also contemplated for the present invention, including processing, storage and retrieval of other types of data.
2. Brief Description of Related Prior Art
Conventional Relational and Object Oriented Database Methodologies
Various means and methodologies exist presently for persistent storage of data for use in computer system applications. Known computer database systems and methods support, inter alia, "schema management" for describing properties of, and relationships between, data in the database.
With the evolution of database technology, the fundamental functionality described hereinbefore has been maintained and expanded while data complexity and processing performance requirements have increased. A type of data management technology applied in commercial data processing, known as "relational" database technology, is modeled such that all data is organized as though it is formatted into tables, with the table columns representing the table's fields or domains and the table rows representing the values of the table's fields or domains. Data is logically organized as tables but is not necessarily physically stored as such. The relational database user does not need to know how the database is physically constructed and can access and update data via a language interface such as the "structured query language" (SQL). The relational model assumes a certain stability in the number of columns of data associated with a table and that, usually, data is present in most, if not all, fields within the table.
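By way of illustration only, the table-oriented view described above can be sketched using Python's standard `sqlite3` module. The table name `patient`, its columns, and the sample rows are hypothetical and do not come from any system discussed herein.

```python
import sqlite3

# In-memory database; the user never sees how the rows are physically stored.
conn = sqlite3.connect(":memory:")

# Columns are the table's fields; the column set is fixed when the table is created.
conn.execute(
    "CREATE TABLE patient ("
    "patient_id INTEGER PRIMARY KEY, name TEXT, birth_year INTEGER)"
)

# Each row supplies a value for every field (hypothetical sample data).
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?)",
    [(1, "A. Example", 1950), (2, "B. Example", 1972)],
)

# Data is accessed through SQL rather than through knowledge of the storage layout.
rows = conn.execute(
    "SELECT name FROM patient WHERE birth_year < 1960"
).fetchall()
print(rows)
```

In this sketch the column set is declared once, and every row is expected to carry a value for each column, which mirrors the stability assumption noted above.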
The increase in relative complexity of data (associated with, e.g., science and engineering problem solving, healthcare, marketing and sales, and the evolution of complex data structures and data entities modeled on real-world objects) led to the development of "object oriented" database techniques. Object entities, or "objects," are complex data structures which can model real-world entities or relationships among or between entities, and are associated in classes and identified with their informational features (attributes). Objects are effected using object oriented programming languages such as Prolog, C++ and Smalltalk. Objects are more readily classifiable into types, which are easily related to one another in subtype/supertype hierarchies. Object oriented languages and databases permit the programmer and database designer to flexibly define data types so as not to be constrained by limited predefined types. Object oriented language types can be associated in classes which can "inherit" attributes and/or behaviors from other classes.
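The class and inheritance concepts described above may be sketched, purely as an illustration, in Python. The class names `Observation` and `BloodPressure`, their attributes, and the numeric thresholds are hypothetical and are used only to show a subtype inheriting attributes and behavior from a supertype.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Supertype: attributes and behavior shared by all observations."""
    patient_id: int

    def describe(self) -> str:
        return f"{type(self).__name__} for patient {self.patient_id}"


@dataclass
class BloodPressure(Observation):
    """Subtype: inherits patient_id and describe(), adds its own attributes."""
    systolic: int
    diastolic: int

    def is_elevated(self) -> bool:
        # Thresholds are illustrative assumptions, not clinical guidance.
        return self.systolic >= 140 or self.diastolic >= 90


reading = BloodPressure(patient_id=1, systolic=150, diastolic=95)
description = reading.describe()   # behavior inherited from Observation
elevated = reading.is_elevated()   # behavior defined by the subtype
```

A new subtype can be defined at any time without altering the supertype, which is the flexibility in type definition referred to above.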
Health Care-Related Data Processing and Storage
Processing and storage of health care-related data has involved a substantial and ever-increasing amount of complexity. For example, given advances in medical science, at present, more than 15,000 measurable human physical conditions (hereinafter "symptomatic conditions") are known to exist which are symptomatic of health care-related facts, conditional hypotheses (diagnoses), problems and/or conditions in human beings, and/or from which patient care regimes may be established. The number and complexity of these symptomatic conditions continue to increase rapidly.
Conventional commercial/industrial and health care database management systems typically are based upon the relational database model. Given the number and complexity of symptomatic conditions, this has meant that if a single relational database table of health care-related data is to be constructed, that database table generally must be made to comprise a separate database table column for each of the symptomatic conditions whose measurements may be recorded in the database; typically, this results in creation of a very large relational database table comprising many thousands of columns--or many tables with fewer columns--leading to an overwhelming complexity in terms of both database design and information retrieval. Given that it is highly unlikely that measurements of more than a few of the many thousands of symptomatic conditions will be relevant to diagnosis or treatment of a given patient's medical condition at any given time, it is also highly unlikely that more than a few of these symptomatic conditions will be measured at any given clinical observation of that patient. This results in the creation of a so-called "sparse matrix" database table wherein many of the row and column entries in the relational table will be empty (i.e., filled only with "null" data values or, in the worst case, filled with spaces). Unfortunately, all relational database systems require at least some finite amount of computer memory or mass-storage space (hereinafter "storage space") to store such null values, and some cannot store null values at all, thus requiring massive storage of spaces. Disadvantageously, since the total amount of storage space present in a given database system is fixed, storage of the relatively large number of null values typically present in a very large sparse matrix database table undesirably decreases the amount of storage space available in the database system for storage of useful data.
Further disadvantageously, such very large sparse matrix database tables typically exhibit relatively poor (e.g., slow and inefficient) data storage and search performance, and in extreme cases, can actually be inoperative.
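The sparse matrix effect described above can be illustrated with a small sketch, again using Python's standard `sqlite3` module. The table name `observation`, the column names `cond_0` through `cond_199`, and the sample visits are hypothetical; 200 columns stand in for the many thousands of symptomatic conditions so that the example stays small.

```python
import sqlite3

NUM_CONDITIONS = 200  # stand-in for many thousands of symptomatic conditions
columns = [f"cond_{i}" for i in range(NUM_CONDITIONS)]

conn = sqlite3.connect(":memory:")
col_defs = ", ".join(f"{c} REAL" for c in columns)
conn.execute(f"CREATE TABLE observation (obs_id INTEGER PRIMARY KEY, {col_defs})")

# Each clinical observation records only a few of the possible measurements.
visits = [
    {"cond_3": 120.0, "cond_4": 80.0},
    {"cond_17": 37.2},
    {"cond_3": 135.0, "cond_150": 5.4, "cond_199": 1.1},
]
for visit in visits:
    names = ", ".join(visit)
    marks = ", ".join("?" for _ in visit)
    conn.execute(
        f"INSERT INTO observation ({names}) VALUES ({marks})",
        list(visit.values()),
    )

# Count how many cells of the wide table hold no measurement at all.
rows = conn.execute(f"SELECT {', '.join(columns)} FROM observation").fetchall()
total_cells = len(rows) * NUM_CONDITIONS
null_cells = sum(value is None for row in rows for value in row)
null_fraction = null_cells / total_cells
print(f"{null_cells} of {total_cells} cells are null")
```

Under these assumed inputs only six of the six hundred cells carry a measurement; the remainder are null entries of the kind described above.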
As an attempt to solve these problems, it has been proposed to implement health care-related relational databases in the form of separate relational database tables, each of which is for storing and retrieving data solely related to a particular class of data, for example, diagnosis or treatment of particular classes of medical conditions (e.g., brain or neurological conditions) and/or to be used by particular classes of medical specialists (e.g., brain surgeons or neurologists). Unfortunately, although this solution is effective, to some degree, in reducing the number of null data values stored in this database system compared to the aforesaid sparse matrix database system, it also results in creation of a large number of separate relational database tables which must be joined together when it is desired to perform certain operations on the database (e.g., global search operations for occurrences of measurements of a given variable across all of the tables comprising the database system). Given the large number of separate relational database tables in the database, this means that such operations will require execution of a large number of table joins. Disadvantageously, this can cause this type of conventional database system to exhibit relatively poor data storage and search performance or, in the extreme, the inability to complete the joins at all or in a timely manner. Since time of retrieval can be critical in healthcare, this solution may not be practical.
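The growth in the number of joins described above can be sketched as follows, using Python's standard `sqlite3` module. The table names (`patient`, `neurology`, `cardiology`, `oncology`), the shared column `systolic_bp`, and the sample values are hypothetical; three specialty tables stand in for the large number contemplated above.

```python
import sqlite3

specialties = ["neurology", "cardiology", "oncology"]  # hypothetical table names

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (patient_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO patient VALUES (?)", [(1,), (2,)])

# One table per specialty, each able to record the same measured variable.
for name in specialties:
    conn.execute(f"CREATE TABLE {name} (patient_id INTEGER, systolic_bp REAL)")
conn.execute("INSERT INTO neurology VALUES (1, 150.0)")
conn.execute("INSERT INTO cardiology VALUES (2, 118.0)")
conn.execute("INSERT INTO oncology VALUES (1, 142.0)")

# A global search for one variable needs one join per specialty table.
joins = " ".join(
    f"LEFT JOIN {name} ON {name}.patient_id = p.patient_id" for name in specialties
)
select_cols = ", ".join(f"{name}.systolic_bp" for name in specialties)
query = (
    f"SELECT p.patient_id, {select_cols} "
    f"FROM patient p {joins} ORDER BY p.patient_id"
)
rows = conn.execute(query).fetchall()
join_count = query.count("LEFT JOIN")
print(join_count, rows)
```

In this sketch the number of joins equals the number of specialty tables, so a database partitioned into hundreds of such tables would require a query with hundreds of joins to answer the same question.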
Furthermore, heretofore, adequate means have not been provided for ensuring standardization in the form and content of clinical observation data (e.g., actual clinical measurements of symptomatic conditions) input into the database, or in the definitions of symptomatic conditions whose observation data is stored in the database. Unfortunately, this can result in substantial ambiguity and uncertainty in the meaning of observation data contained in the database, for example, when many symbols are used to define either observational data, e.g., blood pressure, or diagnoses, e.g., high blood pressure. Disadvantageously, this can substantially reduce the usefulness of such observation data contained in the database, since users of the database may need to query many symbols instead of one.
Additionally, since present health care providers (e.g., physicians, health care organizations, and related support personnel) are almost universally acquainted solely with health care-related database systems predicated upon the relational data model, significant institutional bias exists in the health care industry toward use of relational database methodologies. Thus, any solution to the aforesaid prior art problems that is predicated upon use of object-oriented methodologies must also make use of relational data methodologies at least to the extent necessary to permit same to be accepted and used in the health care industry.
Examples of prior art data processing systems and methods are disclosed in, e.g., Gerull et al., U.S. Pat. No. 5,426,780; Ryu et al., U.S. Pat. No. 5,513,348; Martel et al., U.S. Pat. No. 5,542,078; Olson et al., U.S. Pat. No. 5,556,333; Doktor, U.S. Pat. No. 5,604,899; Jensen et al., U.S. Pat. No. 5,615,362; and, Doktor, U.S. Pat. No. 5,617,567. Unfortunately, all of these prior art systems and methods suffer from the aforesaid and/or other disadvantages and drawbacks.