Storage of information in a storage medium may be facilitated using a database in conjunction with a database management system (DBMS). A database is a collection of related data that may be stored on a nonvolatile memory medium. Data in the database is commonly organized in a two-dimensional row and column form called a table. A database typically includes multiple tables.
A table is an object in the database having at least one record and at least one field within each record. Thus, a table may be thought of as an object having two-dimensional record and field organization. A record is a row of data in the table that is identified by a unique number called a record number. A field is a subdivision of a record to the extent that a column of data in the table represents the same field for each record in the table. Each field in a record is identified by a unique field name, and a field name remains the same for the same field in each record of the table. Therefore, a specific datum in a table is referenced by identifying a record number and a field name.
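By way of a non-limiting illustration, the record number and field name addressing described above can be sketched in Python. The table contents and the helper name get_datum are hypothetical and are chosen only for this example.

```python
# Hypothetical table: each record number maps to a record, and each
# record maps field names to data. The field names are the same in
# every record, as described above.
customers = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Grace", "city": "New York"},
}


def get_datum(table, record_number, field_name):
    """Return the datum identified by a record number and a field name."""
    return table[record_number][field_name]


# A specific datum is referenced by a record number and a field name.
city = get_datum(customers, 2, "city")
```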
A database management system (DBMS) is a control system that supports database features including, but not limited to, storing data on a memory medium, and retrieving data from the memory medium. Data in the database is typically organized among a plurality of objects that include, but are not limited to, tables and queries. An individual table or query may be referred to as a record source because it is a source of data or records from the database. A query object is an executable database interrogation statement, command, and/or instruction that communicates to the database management system the identity and location of data being extracted from the database. The product of an executed query is called a result set. A result set may be stored and/or manipulated as a two-dimensional object similar to the table discussed previously.
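As one possible sketch of a query and its result set, the following uses the sqlite3 module from the Python standard library as the database management system. The customer table, its fields, and its data are assumptions made for illustration only.

```python
import sqlite3

# An in-memory database stands in for data stored on a memory medium.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Grace", "New York"), (3, "Alan", "London")],
)

# The query tells the DBMS which data to extract from the record source.
cursor = conn.execute(
    "SELECT name, city FROM customer WHERE city = ? ORDER BY id",
    ("London",),
)

# The result set is itself two-dimensional: named fields and rows of records.
field_names = [column[0] for column in cursor.description]
result_set = cursor.fetchall()
conn.close()
```

Here the result set can be stored and manipulated like a table, with field_names supplying the columns and result_set supplying the rows.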
Conventionally, one of the prevalent forms of data organization is a relational database. A relational database can be managed by a database management system and/or a managed provider. Data in a relational database is distributed among multiple record sources that are typically related, or normalized, in a manner designed to minimize redundant data in the database, minimize the space required to store data in the database, and maximize data accessibility. Record sources in a database may be related to one another via key fields. A normalized database is one where each record source in the database is directly related to at least one other record source in the same database by key fields.
A key field can be a primary key or a foreign key. A primary key is a field or combination of fields in a record source that includes unique data for each record in the table. A foreign key is any non-primary key in a record source that is the basis for a direct relation with any other record source. A database remains a relational database regardless of the degree of normalization that exists. Record sources in a normalized relational database are typically related. However, a relational database may be normalized even if the database is disconnected in that at least one record source in the database is not related to any other record source by a key field.
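A minimal sketch of primary and foreign keys, again assuming sqlite3 as the database management system, is given below. The customer and purchase tables and the helper name rejected are hypothetical. Note that SQLite enforces foreign keys only when the foreign_keys pragma is enabled on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """CREATE TABLE purchase (
           purchase_id INTEGER PRIMARY KEY,
           customer_id INTEGER REFERENCES customer(customer_id),
           item TEXT)"""
)
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO purchase VALUES (10, 1, 'book')")


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A primary key must hold unique data for each record.
duplicate_key_rejected = rejected("INSERT INTO customer VALUES (1, 'Grace')")
# A foreign key must refer to an existing record in the related record source.
dangling_reference_rejected = rejected("INSERT INTO purchase VALUES (11, 99, 'pen')")
conn.close()
```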
Relationships between any two record sources in a relational database may be either direct or indirect. Such a relationship may also be referred to as a relation or join. A direct relationship exists between two record sources if there is no intervening record source in the relationship path between them. An indirect relationship exists if there is at least one intervening record source in the relationship path between the two record sources.
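The distinction between direct and indirect relationships can be illustrated with a short sketch that searches for a relationship path between two record sources. The record source names and the functions build_adjacency, relationship_path, and classify are assumptions introduced for this example, and a breadth-first search is only one of several ways such a path might be found.

```python
from collections import deque

# Hypothetical direct relationships between pairs of record sources.
relationships = [
    ("customer", "order"),
    ("order", "order_line"),
    ("order_line", "product"),
]


def build_adjacency(pairs):
    """Map each record source to the record sources directly related to it."""
    adjacency = {}
    for a, b in pairs:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def relationship_path(adjacency, start, goal):
    """Breadth-first search for a shortest relationship path, or None."""
    if start not in adjacency or goal not in adjacency:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(adjacency[path[-1]]):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


def classify(adjacency, a, b):
    """Classify the relationship between two distinct record sources."""
    if a == b:
        raise ValueError("expected two distinct record sources")
    path = relationship_path(adjacency, a, b)
    if path is None:
        return "unrelated"
    # A path of exactly two record sources has no intervening record source.
    return "direct" if len(path) == 2 else "indirect"


adjacency = build_adjacency(relationships)
```

With this data, classify(adjacency, "customer", "order") reports a direct relationship, while classify(adjacency, "customer", "product") reports an indirect one because the order and order_line record sources intervene.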
The record sources in a relational database and the relationships between them define the geography of a database, which may be called a database schema. A sub-schema of the database is any subset of the full database schema that is defined by a query, a result set of a query, or any other subset of record sources from the database. A database schema and database sub-schema may be displayed visually in graphic form as a graph having edges or arrows representing relationships between record sources, and vertices, also known as nodes or tables, representing the record sources at either end of a relationship.
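As a hedged illustration, a schema may be modeled as a set of vertices and a set of edges, and a sub-schema may be derived by keeping only a chosen subset of record sources together with the relationships whose two ends both remain. The dictionary layout and the function name sub_schema are hypothetical.

```python
# Hypothetical schema: vertices are record sources, edges are relationships.
schema = {
    "vertices": {"customer", "order", "order_line", "product"},
    "edges": {
        ("customer", "order"),
        ("order", "order_line"),
        ("order_line", "product"),
    },
}


def sub_schema(schema, record_sources):
    """Return the part of the schema restricted to the given record sources."""
    vertices = schema["vertices"] & set(record_sources)
    # Keep a relationship only if the record sources at both ends are kept.
    edges = {(a, b) for a, b in schema["edges"] if a in vertices and b in vertices}
    return {"vertices": vertices, "edges": edges}


sub = sub_schema(schema, ["customer", "order", "product"])
```

In this sketch the product record source remains a vertex of the sub-schema but has no edge, because the order_line record source that related it to the others was left out.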
Queries are used to access data in a database. To access the data, a user may construct a query in a Structured Query Language (SQL) that may or may not be based on the American National Standards Institute (ANSI) standard SQL definition. Executing a query is called a join, or joining, in that each relation identified in the query is joined during execution to retrieve the desired data from the database.
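For example, a query that joins two related record sources might be sketched as follows, using the sqlite3 module and its SQL dialect. The tables, key fields, and data are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (purchase_id INTEGER PRIMARY KEY,
                           customer_id INTEGER, item TEXT);
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO purchase VALUES (10, 1, 'book'), (11, 2, 'pen'), (12, 1, 'lamp');
    """
)

# The relation between the two record sources is joined on the key field
# customer_id when the query executes.
rows = conn.execute(
    """
    SELECT customer.name, purchase.item
    FROM customer
    JOIN purchase ON purchase.customer_id = customer.customer_id
    ORDER BY purchase.purchase_id
    """
).fetchall()
conn.close()
```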
However, for many applications, the limitations of the relational database (e.g., homogeneity of records in a table, homogeneity of relationship between parent(s) and children) have conventionally proven difficult to overcome. For example, there can be fields for one customer in the relational database that don't exist for another customer.
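The homogeneity limitation can be illustrated with a small sketch in which two customers carry different fields. Forcing both into one table with a single set of columns leaves empty cells for the fields a customer does not have. The field names, data, and the function to_uniform_table are hypothetical.

```python
# Hypothetical customers whose records do not share the same fields.
customers = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Grace", "fax": "555-0100"},
]


def to_uniform_table(records):
    """Force heterogeneous records into one set of columns, padding with None."""
    field_names = sorted({field for record in records for field in record})
    rows = [tuple(record.get(field) for field in field_names) for record in records]
    return field_names, rows


field_names, rows = to_uniform_table(customers)
```

Here every record must carry every column, so the first customer receives an empty fax field and the second an empty email field.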
Markup languages (e.g., Hypertext Markup Language (“HTML”), Standard Generalized Markup Language (“SGML”), and the Extensible Markup Language (“XML”)) contain text and a number of tags that provide instructions as to how the text should be displayed, which text should be hyperlinked to other documents, and which other types of content, including graphics and other images, video and audio segments, application programs or applets, image maps, and icons, should be retrieved and displayed in the document. Some document languages such as SGML and XML can represent documents as trees, with each node of the tree labeled with a tag and each node's immediate descendants, taken in order, having tags that satisfy a production corresponding to the parent's tag. Therefore, a document can be represented as a complete parse tree satisfying the production rules of a grammar. XML was created by the World Wide Web Consortium to overcome the shortcomings of HTML. XML allows a document developer to create tags that describe the data and to create a rule set, referred to as a Document Type Definition (DTD), that applies rules to the data. Several XML parsers have evolved that can read, decode, and validate the text-based document, extracting the data elements in a platform-independent way so that applications can access the data objects according to another standard referred to as the Document Object Model (DOM). DOM is an application program interface (API) that defines a standard for developer interaction with the tree-structured elements of XML data. Therefore, an XML document together with the DOM, or XML DOM, provides developers with programmatic control of XML document content, structure, and formats by employing script, Visual Basic, C++, and other programming languages. In the most abstract form, data stored in an XML document(s) can be completely unstructured.
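As a non-limiting sketch of programmatic access through the DOM, the following uses the xml.dom.minidom module from the Python standard library, which is one DOM implementation among many. The document content, tag names, and attribute names are assumptions made for this example.

```python
from xml.dom.minidom import parseString

# Hypothetical XML document in which two customers have different child tags.
document_text = """<customers>
  <customer id="1"><name>Ada</name><email>ada@example.com</email></customer>
  <customer id="2"><name>Grace</name><fax>555-0100</fax></customer>
</customers>"""

dom = parseString(document_text)

# Walk the tree: for each customer node, list the tags of its immediate
# element descendants in document order, skipping whitespace text nodes.
summary = {}
for customer in dom.getElementsByTagName("customer"):
    child_tags = [
        node.tagName
        for node in customer.childNodes
        if node.nodeType == node.ELEMENT_NODE
    ]
    summary[customer.getAttribute("id")] = child_tags
```

Unlike records in a table, the two customer nodes here need not carry the same set of children.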
With the historical use of relational databases and the increasing use of markup languages (e.g., XML), there is an unmet need in the art for a unified framework for accessing data (e.g., XML document(s) and/or relational database document(s)).