The present invention relates to techniques for using Extensible Markup Language (XML) data in a relational database system.
The World Wide Web (WWW) involves a network of servers on the Internet, each of which is associated with one or more Hypertext Markup Language (HTML) pages. The HTML pages are transferred between the servers and the clients that make requests of them using the Hypertext Transfer Protocol (HTTP). Resources available from servers on the Internet are located using a Uniform Resource Locator (URL). The standards and protocols of the WWW are promulgated by the World Wide Web Consortium (W3C) through its servers at www.w3c.org, and are used on many private networks in addition to their use on the Internet.
The HTML standard is one application of a more general markup language standard called the Standard Generalized Markup Language (SGML). Recently, a subset of SGML that is more powerful and flexible than HTML has been defined and has gained popularity for transferring information over the Internet and other networks. The new standard, developed and promoted by W3C, is called the Extensible Markup Language (XML). XML provides a common syntax for expressing structure in data. Structured data refers to data that is tagged for its content, meaning, or use. XML provides an expansion of the tagging that is done in HTML, which focuses on format or presentation. XML tags identify XML elements and attributes of XML elements. XML elements can be nested to form hierarchies of elements. As used hereinafter, the terms "element" and "attribute" retain their general meaning and are not limited to XML elements and XML attributes, unless otherwise clear from the context.
A set of syntax rules for XML elements shared by multiple XML documents is defined by an XML schema, itself an XML document. For example, the syntax rules indicate what elements can be used in a document, in what order they should appear, which elements can appear inside other elements, which elements have attributes, what those attributes are, and any restrictions on the type of data or number of occurrences of an element. XML allows documents to contain elements from several distinct XML schemas by the use of namespaces. In particular, elements from other, independently created XML schemas can be interleaved in one XML document.
Given the elements defined and used by XML, a document object model (DOM) is a tree structure formed to define how the information in a particular XML document is arranged. The DOM is navigated using an XPath expression that indicates a particular node or content in the hierarchy of elements and attributes in an XML document. XPath is a standard promulgated by W3C.
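As an illustration of navigating the hierarchy of elements and attributes with path expressions, the following minimal sketch uses the ElementTree module of the Python standard library, which supports only a limited subset of XPath and is not a full W3C DOM implementation. The employee document, its element names, and its values are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document, invented for illustration only.
XML_TEXT = """
<employees>
  <employee id="1">
    <name>Ada</name>
    <phone>555-0100</phone>
    <phone>555-0101</phone>
  </employee>
  <employee id="2">
    <name>Grace</name>
    <phone>555-0200</phone>
  </employee>
</employees>
""".strip()

# Parse the text into a tree of elements rooted at <employees>.
root = ET.fromstring(XML_TEXT)

# A path expression that selects the <name> child of every <employee>.
names = [node.text for node in root.findall("./employee/name")]

# A predicate on the id attribute narrows the selection to one employee.
phones = [node.text for node in root.findall("./employee[@id='1']/phone")]
```

Here each path expression identifies a set of nodes in the tree, so a caller can refer to part of the document without handling the document as a whole.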
Relational databases predate, and developed independently of, the World Wide Web. Relational databases store data in various types of data containers that correspond to logical relationships within the data. As a consequence, relational databases support powerful search and update capabilities. Relational databases typically store data in tables of rows and columns where the values in all the columns of one row are related. For example, the values in one row of an employee table describe attributes of the same employee, such as her name, social security number, address, salary, telephone number and other information. Each attribute is stored in a different column. Some attributes, called collections, can have multiple entries. For example, the employee may be allowed to have multiple telephone numbers. Special structures are defined in some relational databases to store collections.
A relational database management system (DBMS) is a system that stores and retrieves data in a relational database. The relational DBMS processes requests to perform database functions such as creating and deleting tables, adding and deleting data in tables, and retrieving data from the tables in the database. A well-known standard language for expressing the database requests is the Structured Query Language (SQL).
Object-relational databases extend the power of relational databases. Object-relational databases allow the value in a column to be an object, which may include multiple other attributes. For example, the value in the address column may be an address object that itself has multiple attributes, such as a street address, a city, a state, a country, and a zip code or equivalent. An object type (also called an abstract data type, ADT) defines the attributes of an object in an object-relational database. SQL has been extended to allow the definition and use of objects and object types in object-relational databases. As used hereinafter, the term "object-relational database" refers to a subset of relational databases that support object-relational constructs; and an object-relational construct is one example of a relational construct. The term "SQL construct" is used hereinafter to refer to relational constructs, such as tables, columns, and rows, and object-relational constructs such as ADT columns and tables and collections.
Because of the popularity of XML as a data exchange format that supports hierarchical relationships among XML elements, and because of the power of relational DBMSs to update and retrieve data, there is a demand for generating XML data output from relational databases and storing XML data into relational databases. In one approach, a database administrator can commission programming efforts to generate code in a procedural language that maps data in particular XML constructs to data in particular relational database constructs and back. Such programming efforts can be expensive.
In another approach, declarative statements, similar to SQL statements, can be employed to simply express the relationship between XML constructs and SQL constructs. General routines that convert the data according to declared relationships are written one time by a DBMS vendor and supplied to a database administrator. This saves the database administrator from developing procedural language programs to convert the data. To support this demand, an industry standard for SQL to operate on XML documents has been developed. This standard is called SQL/XML, and documents relating to SQL/XML are available at the time of this writing at www.sqlx.org. SQL/XML provides declarative statements that can be used to simply express some conversions between hierarchical XML constructs and SQL constructs. For example, XMLAgg is a SQL/XML function that generates one XML construct from a set of XML elements generated from selected rows of a relational table. For convenience, hereinafter data that is used for an XML document or fragment thereof is called "XML data," even if it is stored in SQL constructs.
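To make the aggregation performed by a function such as XMLAgg more concrete, the following sketch imitates its effect in Python. It is not SQL/XML: SQLite, used here through the Python standard library, has no XMLAgg function, so the aggregation is written out procedurally. The emp table, its columns, the element names, and the helper aggregate_names are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical relational table holding one row per employee.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO emp (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)


def aggregate_names(connection):
    """Build one XML construct from elements generated per selected row.

    This imitates, in procedural code, what a declarative SQL/XML
    aggregate would express in a single statement.
    """
    parent = ET.Element("employees")
    for (name,) in connection.execute("SELECT name FROM emp ORDER BY id"):
        child = ET.SubElement(parent, "name")  # one element per row
        child.text = name
    return ET.tostring(parent, encoding="unicode")


result = aggregate_names(conn)
```

The loop is the kind of procedural code that a declarative statement would spare the database administrator from writing.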
While SQL/XML statements provide powerful tools for many circumstances that arise in converting between XML constructs and SQL constructs, they do not simply accommodate all such circumstances. For example, conventional SQL/XML statements do not support modifications to an XML document stored in the SQL DBMS. An XML document is ingested whole or is output whole by the SQL DBMS. A user of the DBMS can make modifications to the contents of the SQL constructs only if the user knows the SQL constructs in sufficient detail. However, a user who is more familiar with the XML constructs (e.g., the XML document, XML elements, XML attributes, and fragments of the XML document) cannot use declarative statements that refer to those constructs to modify the document in the DBMS using conventional SQL/XML commands. Such a user might generate the whole XML document from the database, update the document with an XML editor that works on the whole XML document, and then store the revised whole XML document back into the database managed by the SQL compliant DBMS, utilizing DBMS capability to generate needed SQL constructs for the revised XML document.
Based on the foregoing, there is a clear need for SQL compliant declarative statements that allow a user to express changes to the content of an XML construct managed in an SQL compliant DBMS in terms of the XML constructs.
One approach an SQL compliant DBMS can follow is to allow a user to declaratively specify a change to an XML construct in an XML document, and then to have the DBMS temporarily and internally generate the whole XML document from the database, update the document with an XML editor that works on the whole XML document, and then store the revised whole XML document back into the database, generating SQL constructs as needed to hold the new XML constructs. This approach is useful, for example, when the whole document is stored as a single large object (LOB), which is one SQL construct. However, if different XML constructs are stored in different SQL constructs, this approach involves generating XML data from multiple SQL constructs, editing the XML document, and then forming or filling again every SQL construct used to store XML data for the revised XML document. If the contents of some SQL constructs have not changed, the computational resources consumed in outputting data from those unchanged constructs to the temporary XML document, and then storing the same data back into the same SQL constructs, are wasted.
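The cost difference described above can be sketched with a small experiment. The example assumes a hypothetical layout in which one employee document is stored across two tables, emp_name and emp_phone, and compares how many rows each strategy writes when only the name changes. All table names, function names, and values are invented for illustration; the row counts come from SQLite's change counter as exposed by the Python standard library.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_db():
    """Create a hypothetical store with XML data split over two tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp_name (emp_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE emp_phone (emp_id INTEGER, phone TEXT)")
    conn.execute("INSERT INTO emp_name VALUES (1, 'Ada')")
    conn.executemany(
        "INSERT INTO emp_phone VALUES (?, ?)",
        [(1, "555-0100"), (1, "555-0101")],
    )
    conn.commit()
    return conn


def build_document(conn):
    """Generate the whole XML document from both tables."""
    emp = ET.Element("employee", id="1")
    (name,) = conn.execute("SELECT name FROM emp_name WHERE emp_id = 1").fetchone()
    ET.SubElement(emp, "name").text = name
    query = "SELECT phone FROM emp_phone WHERE emp_id = 1 ORDER BY rowid"
    for (phone,) in conn.execute(query):
        ET.SubElement(emp, "phone").text = phone
    return emp


def round_trip_update(conn, new_name):
    """Regenerate the document, edit it, then refill every table."""
    before = conn.total_changes
    doc = build_document(conn)
    doc.find("name").text = new_name
    conn.execute("DELETE FROM emp_name WHERE emp_id = 1")
    conn.execute("DELETE FROM emp_phone WHERE emp_id = 1")
    conn.execute("INSERT INTO emp_name VALUES (1, ?)", (doc.find("name").text,))
    conn.executemany(
        "INSERT INTO emp_phone VALUES (1, ?)",
        [(p.text,) for p in doc.findall("phone")],
    )
    return conn.total_changes - before  # rows written by this strategy


def targeted_update(conn, new_name):
    """Touch only the table that holds the changed XML construct."""
    before = conn.total_changes
    conn.execute("UPDATE emp_name SET name = ? WHERE emp_id = 1", (new_name,))
    return conn.total_changes - before


conn_a = make_db()
rows_round_trip = round_trip_update(conn_a, "Ada L.")
conn_b = make_db()
rows_targeted = targeted_update(conn_b, "Ada L.")
same_document = ET.tostring(build_document(conn_a)) == ET.tostring(build_document(conn_b))
```

Both strategies leave the stored document in the same state, but the round trip rewrites the phone rows even though their contents did not change.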
Based on the foregoing, there is a clear need for evaluating declarative statements that specify changes to content of an XML document without modifying SQL constructs that are not affected by the changes.
The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not to be considered prior art to the claims in this application merely due to the presence of these approaches in this background section.