The advent of a global communications network such as the Internet has facilitated the exchange of enormous amounts of information. Additionally, the costs associated with storage and maintenance of such information have declined, resulting in massive data storage structures. Hence, substantial amounts of data can be stored as a data warehouse, which is a database that typically represents the business history of an organization. For example, such stored data is employed for analysis in support of business decisions at many levels, from strategic planning to performance evaluation of a discrete organizational unit. Such analysis can further involve taking the data stored in a relational database and processing the data to make it a more effective tool for query and analysis.
Accordingly, it is important to store such data in a manageable manner that facilitates user-friendly and quick data searches and retrieval. In general, a common approach is to store electronic data in a database. A database functions as an organized collection of information, wherein data is structured such that a computer program can quickly search and select desired pieces of data. Commonly, data within a database is organized via one or more tables, and the tables are arranged as an array of rows and columns.
Moreover, such tables can comprise a set of records, wherein a record includes a set of fields. Records are commonly indexed as rows within a table and the record fields are typically indexed as columns, such that a row/column pair of indices can reference a particular datum within a table. For example, a row can store a complete data record relating to a sales transaction, a person, or a project. Likewise, columns of the table can define discrete portions of the rows that have the same general data format, wherein the columns can define fields of the records.
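The row/column addressing described above can be illustrated with a minimal Python sketch. The table contents, the field names (`id`, `customer`, `amount`), and the helper functions `datum` and `column_values` are hypothetical and chosen only for illustration; they do not correspond to any particular database product.

```python
# Hypothetical sales-transaction table; names and values are illustrative only.
columns = ["id", "customer", "amount"]   # record fields, indexed as columns
rows = [                                 # records, indexed as rows
    (1, "Alice", 120.0),
    (2, "Bob", 75.5),
    (3, "Carol", 310.25),
]


def datum(row_index, column_name):
    """Return the single value addressed by a row index and a column name."""
    return rows[row_index][columns.index(column_name)]


def column_values(column_name):
    """Return every value of one field; all share the same general format."""
    col = columns.index(column_name)
    return [row[col] for row in rows]


print(datum(1, "amount"))
print(column_values("customer"))
```

Here each tuple plays the role of a complete record, while a column name selects the same field from every record.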
In general, each individual piece of data, standing alone, is not very informative. Database applications allow the user to compare, sort, order, merge, separate and interconnect the data, so that useful information can be generated from the data. Moreover, the capacity and versatility of databases have grown considerably, allowing virtually endless storage capacity.
In such databases, selecting a large number of columns requires consuming significant resources on both the client side and the server side. Representing objects that have a large number of properties remains a challenging task. Moreover, there exist a number of customer segments that store heterogeneous, semi-structured data in Structured Query Language (SQL) Server tables, wherein such semi-structured data includes groups of scalar, complex and collection properties that can be ordered, open, and heterogeneous.
For example, a document/content management system similar to Windows® SharePoint Services may store different types of user data in a single table. Such tables by definition contain data having different properties that apply to different subsets of rows in the table. In such cases, SQL Server tables contain columns that are populated with values for only a subset of rows in the table (such as sparse columns with NULL values for most of the rows in the containing table), though such subsets can vary from column to column. Also, as new types of content are added to the table, there can be a need to add new kinds of properties (columns) that apply to the new content type. Such additions can further introduce a requirement for a frequently changing schema for the table, as well as the ability to define a large number of columns in a table, which further adds to the complexities involved.
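The sparsity and schema-change problem described above can be sketched as follows. This is a minimal illustration only: it uses Python's built-in `sqlite3` module rather than SQL Server, and the table name `items`, its columns, the sample rows, and the helper `null_fraction` are all hypothetical.

```python
import sqlite3

# In-memory database standing in for a single table that holds several
# content types (hypothetical schema, not taken from any real product).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE items (
           id INTEGER PRIMARY KEY,
           content_type TEXT NOT NULL,
           title TEXT NOT NULL,
           page_count INTEGER,   -- applies to documents only
           due_date TEXT,        -- applies to tasks only
           image_width INTEGER   -- applies to pictures only
       )"""
)
conn.executemany(
    "INSERT INTO items (content_type, title, page_count, due_date, image_width)"
    " VALUES (?, ?, ?, ?, ?)",
    [
        ("document", "Spec", 12, None, None),
        ("task", "Review spec", None, "2024-01-31", None),
        ("picture", "Logo", None, None, 640),
        ("task", "Ship build", None, "2024-02-15", None),
    ],
)


def null_fraction(column):
    """Fraction of rows in which `column` is NULL.

    The column name is interpolated into the SQL text, so only trusted,
    hard-coded names should be passed in.
    """
    total, non_null = conn.execute(
        f"SELECT COUNT(*), COUNT({column}) FROM items"
    ).fetchone()
    return (total - non_null) / total


for name in ("page_count", "due_date", "image_width"):
    print(name, null_fraction(name))

# Supporting a new content type (for example, video) requires a schema
# change, and the new column is NULL for every existing row.
conn.execute("ALTER TABLE items ADD COLUMN duration_seconds INTEGER")
print("duration_seconds", null_fraction("duration_seconds"))
```

Each property column is populated only for the rows of its own content type, and each newly supported content type widens the table further, which illustrates why both a frequently changing schema and a large column count can arise.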