1. Field of the Invention
Data storage systems, database management systems and associated user interfaces.
2. The Prior Art
One-Dimensional Model of Data Storage
The implementation of any system for storing, accessing and manipulating data, such as a database system, must store data in memory relative to some model of data storage. Data storage is typically modeled as one-dimensional, since memory itself is accessed via one-dimensional addresses (that is, the address of a memory location is a single integer). This includes relational data storage systems, which store data organized as tables 38 containing a series of records 40, where each record 40 contains data values 42 corresponding to a series of fields 44, as illustrated in FIG. 1, which shows a simple table 38. FIG. 2a illustrates how data values 42 may be contiguously stored, relative to a one-dimensional data storage model, where each data value 42 is stored at a distinct memory location identified by a one-dimensional address 46. Even though a relational database table 38 is logically viewed as two-dimensional, containing a matrix of data values 42 organized into fields 44 and records 40, it is typically implemented using contiguous one-dimensional data storage. Such relational data storage systems are based on the relational data model introduced by Codd, E. F. (June 1970) “A Relational Model of Data for Large Shared Data Banks”. Communications of the ACM 13 (6): 377-387 and elaborated by Date, C. J. (2003) Introduction to Database Systems, 8th edition, Addison-Wesley, ISBN 0-321-19784-4.
For example, FIG. 1 shows a simple table 38 containing three records 40 each containing data values 42 corresponding two fields 44. FIG. 2a illustrates how data values 42 may be contiguously stored, relative to a one-dimensional data storage model, where each data value 42 is stored at a distinct memory location identified by a one-dimensional address 46. In contrast, FIG. 3a illustrates how these data values 42 may be stored non-contiguously (where the storage of other data values 42 may be interleaved with the six data values 42 shown in the table 38 in FIG. 1).
Data Accessed Via Indexes Containing Physical Addresses
Data typically is accessed via indexes. An index is essentially a lookup table that maps the data values 42 corresponding to a field 44 (or a combination of fields 44) to the addresses 46 of those data values 42, as illustrated in FIGS. 2b and 2c, which show indexes for fields 44 Name and Age (see the table 38 shown in FIG. 1), respectively. FIGS. 3b and 3c show similar indexes for data values 42 stored noncontiguously, as illustrated in FIG. 3a. Using indexes to lookup data values 42 is far more efficient than performing a full table scan by sequentially searching a table's records 40. Indexes are typically implemented as a variant of b-trees (or some other data structure that organizes a set of values to facilitate rapid search).
Accessing data via indexes suffers from a fundamental limitation: data values 42 are accessed via their physical addresses 46, resulting in the following drawbacks:                Index Addresses May Require Adjustment: the address 46 contained in the index must be updated if data values 42 are relocated—since accessing data values 42 via indexes requires that accurate physical addresses 46 be contained in the index.        Replicated Data May Require Index Adjustment: since indexes replicate data values 42, indexes must be adjusted when these data values 42 are modified in order to keep the index consistent with the latest modifications to these data values 42.        Data Contiguity Is Required to Support Bulk Data Access Capability: for performance reasons, the succession of data values 42 within a record 40 is typically stored contiguously in physical storage to facilitate retrieval of a record 40 in a single operation. Similarly, records 40 within a table 38 are typically stored contiguously in physical storage to facilitate “bulk” sequential access to the records 40 comprising a table 38. Such bulk sequential access methods enable large amounts of data values 42 to be obtained in a single (or few) data access operations. This de facto data contiguity requirement means that data cannot be relocated (unless the entire table is relocated, such that contiguity is preserved) without undermining the hulk access capability.        Fixed-Size Memory Regions and Fragmentation: a new table 38 is allocated a contiguous, fixed-size region, which may result in fragmentation if this region is under-utilized (if the table is sparsely populated).        “Flat” Data With No Intrinsic Meaning: Data values 42 have no intrinsic meaning (other than their values), and a table schema is required to interpret/relate data values 42 in a context. For example, a series of data values 42 may comprise a record 40 in a table but this is only evident in the context of the table's schema—nothing intrinsic to the data values 42 themselves indicate membership in the record 40. Similarly, nothing intrinsic to data values 42 themselves indicates that a set of data values 42 all correspond to the same field 44. The schema functions as the lens through which data values 42 can be “related”.        
Queries and Joins
Two key capabilities of relational databases are executing a query on a table 38 and joining tables 38 on a common field 44 (or fields 44) to generate a new table 38 containing data values 42 copied from the original tables 38. Various relational querying mechanisms have been developed, such as the popular SQL (Chamberlin, Donald D.; Boyce, Raymond F. (1974), “SEQUEL: A Structured English Query Language”, Proceedings of the 1974 ACM SIGFIDET Workshop on Data Description, Access and Control: pp. 249-264). Joining tables 38 is typically a computationally expensive operation, and many strategies for optimizing join operations have been developed.
Table Reconfiguration
It is not uncommon for the structure, or schema of a table 38 to change after its initial definition. Time-consuming reconfiguration is often required when fields 44 are inserted into or deleted from a table 38, primarily due to the need to maintain contiguity. This typically involves exporting the data values 42 of the modified tables 38 (often as ASCII files with separators), converting from the old table schema to the new schema, and then loading the exported data values 42 relative to the new schema. With conventional data storage systems, contiguity of successive data values 42 within a record 40 is often required, which may necessitate re-assigning the locations of all data values 42 contained in the modified tables. This re-assignment of locations would of course not be required if data were permitted to be stored noncontiguously (as illustrated in FIG. 3a).
Spreadsheets and Databases
In the prior art, when a spreadsheet application requires database functionality, it must interact with a separate database application. This has the following drawbacks:                data must be communicated between the spreadsheet application and database application, possibly resulting in processing delays        since the representation of data inter-relationships is contained within the database application, the spreadsheet application itself (and associated user interface) cannot exploit knowledge of these data inter-relationships        