Large databases running in publicly accessible environments are notorious for their inability to accommodate change. In today's world of massive access to large databases via the Internet, it is increasingly common to encounter messages to the effect of “database is down for maintenance” instead of the actual data requested.
In a conventional database environment, when a change needs to be made to the schematic structure of a database, the data in the database must be extracted from the database in the old structure and re-written to the database in the new structure. If new data were to be inserted into the database while said changes were being effected, it could cause unpredictable effects to the database. Such effects could include corruption of pre-existing data, misapplication of database changes, misalignment of data relative to internal data boundaries, or any number of problems that could render the database effectively incoherent. Such results are untenable in most live database deployments.
Conventionally, the most common solution to the problem of updating during changes to the schematic structure is simply to disallow updates until the changes are complete.
Conventional databases also require a tight bind between the data type and the data storage. Users require that the data they request be presented in a manner consistent with the expected usage of the data. For example, a date may be stored in the database as a string of decimal digits (e.g., 20010303), but to present the data to the user in its raw form would be unacceptable. A conventional computer user requires that it be presented in a manner consistent with its usage (e.g., Saturday, 3 Mar. 2001). In order for the date to be presented in a manner consistent with its usage, the database must carry type-related information along with each unit of data.
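By way of illustration only, the following Python sketch shows how type-related information carried with a unit of data may govern its presentation. The function name `present` and the type tag `"date"` are hypothetical and are not drawn from any particular database product.

```python
from datetime import datetime

# Name tables are spelled out so the output does not depend on the system locale.
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")
MONTH_ABBR = ("Jan.", "Feb.", "Mar.", "Apr.", "May", "Jun.",
              "Jul.", "Aug.", "Sep.", "Oct.", "Nov.", "Dec.")


def present(raw: str, type_tag: str) -> str:
    """Render a raw stored value according to its (hypothetical) type tag."""
    if type_tag == "date":
        # The raw value is a string of decimal digits in YYYYMMDD order.
        d = datetime.strptime(raw, "%Y%m%d").date()
        return f"{DAY_NAMES[d.weekday()]}, {d.day} {MONTH_ABBR[d.month - 1]} {d.year}"
    # Without type information the value can only be shown in its raw form.
    return raw


print(present("20010303", "date"))  # Saturday, 3 Mar. 2001
print(present("20010303", "text"))  # 20010303
```

The same stored digits yield two different presentations, which is why the type information must accompany the data.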
Binding between data and type is conventionally accomplished by organizing the data into metaphorical rows and columns. Rows of data are divided into pre-defined columns, where each column represents a particular data type and/or use of the data. Such data/type binding allows a computer program to make assumptions and inferences about the data appropriate to its type. Additional rows of data may be readily added to a database. However, if a new column is desired in a database, then the database must typically be made unavailable for a period of time so that data can be converted into the new format. Modifications to pre-existing programs would have to be made, along with the requisite testing and debugging necessary to validate any new code.
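The cost of adding a column in a conventional row-and-column arrangement may be sketched as follows. This is a simplified in-memory model under stated assumptions, not the behavior of any specific database engine; the function `add_column` and the sample column names are hypothetical.

```python
def add_column(columns, rows, new_column, default=None):
    """Return a new schema and a rewritten copy of every row.

    Each row is a tuple whose positions correspond to the entries of `columns`,
    so a new column forces every existing row to be converted.
    """
    new_columns = columns + [new_column]
    # Every pre-existing row must be extracted and re-written in the new structure.
    new_rows = [row + (default,) for row in rows]
    return new_columns, new_rows


columns = ["name", "city"]
rows = [("Ann", "Austin"), ("Bob", "Boston")]

# Adding a row touches nothing that already exists.
rows.append(("Cy", "Chicago"))

# Adding a column rewrites all rows; inserts arriving during the rewrite
# would have to be blocked or reconciled afterward.
columns, rows = add_column(columns, rows, "zip")
print(columns)  # ['name', 'city', 'zip']
print(rows[0])  # ('Ann', 'Austin', None)
```

In this model, appending a row is a local operation, whereas adding a column requires work proportional to the total number of rows, which is the conversion period during which a conventional database is typically made unavailable.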
It is also worth noting that in conventional databases there tends to be redundancy in the storage of data. For example, cities, states, zip codes, and telephone area codes may be repeated among a number of rows of data. Such redundancy results in inefficient use of memory.
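The redundancy described above may be illustrated with a short sketch that counts repeated values and, as one conventional remedy assumed here purely for illustration, stores each distinct value once and replaces it in the rows with a small integer reference. The function `intern_column` and the sample data are hypothetical.

```python
def intern_column(rows, column_index):
    """Store each distinct value of one column once and reference it by index."""
    lookup = []      # distinct values, stored a single time
    index_of = {}    # value -> position in `lookup`
    compact_rows = []
    for row in rows:
        value = row[column_index]
        if value not in index_of:
            index_of[value] = len(lookup)
            lookup.append(value)
        # Replace the repeated value with a reference into the lookup list.
        compact_rows.append(
            row[:column_index] + (index_of[value],) + row[column_index + 1:]
        )
    return lookup, compact_rows


rows = [
    ("Ann", "Springfield"),
    ("Bob", "Springfield"),
    ("Cy", "Springfield"),
    ("Dee", "Shelbyville"),
]
lookup, compact_rows = intern_column(rows, 1)
print(lookup)        # ['Springfield', 'Shelbyville']
print(compact_rows)  # [('Ann', 0), ('Bob', 0), ('Cy', 0), ('Dee', 1)]
```

Here the city name that appeared three times in the rows is held only once, which indicates the kind of saving available when repeated values are not stored with every row.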
In light of the foregoing, it is apparent that there is a need for a system and method for modifying the schematic structure of a database without making the database unavailable for the entry of new data. Preferably, such a system and method would, among other things, also minimize redundancy of data in a database.