The present invention relates to SQL databases, and more particularly to a method and system for describing hierarchical XML content and for translating the XML content into SQL.
Almost all databases in use today are based on a xe2x80x9crelational modelxe2x80x9d in which the database data is perceived by its users as a collection of tables. The tables in a relational database include a row of column names specifying one or more column fields, and zero or more data rows containing one value for each of the column fields. Each column in a database stores data regarding a particular concept or object. A database schema defines the tables and the type of data that each table can contain.
All requests to access the database data are usually handled by database system software referred to as a database management system (DBMS). The DBMS supports database operations for interrogating and processing the data in the database, such as adding tables to the database, inserting data into existing tables, updating tables, deleting tables, and retrieving data from existing tables. Such commands include SELECT, INSERT, UPDATE and DELETE, which are examples of statements from the standard database language known as SQL (Structured Query Language). SQL commands can be used to interactively work with a database by a user, or can be embedded within a programming language to interface to the database.
An application program may be written to access the database through the DBMS. Before the application program may be used, however, the database tables must first be created and populated with data during an initialization process.
One approach to populating the database with data is to have a programmer generate a SQL file containing SQL statements necessary to create the relational database base and to populate the database with the data. This approach, however, is tedious and error prone because should the database schema be changed in any way, the programmer must update the SQL file accordingly. In addition, records in a relational database are provided with unique indexes, which have to be manually assigned. If an index is changed, then the programmer has to make sure that all the other tables referring to the index are also updated. This process is unscalable because as the number of tables in the database grows and/or the number of databases supported by the programmer increases, the process becomes unmanageable.
Another approach that may be used to populate the database with data is to hard code a software program with information necessary to generate a file containing the necessary SQL statements. This approach, however, shares the disadvantages with the prior approach in that whenever the structure of the database needs to be changed, the program that generates the SQL must also be changed.
A further problem is that not all SQL languages are the same. The American National Standards Institute (ANSI) has standardized the SQL language, but it does not cover all the bases. The major database management systems supporting SQL, including DB2, SQL/DS, Oracle, Sybase, SQLbase, INFORMIX and CA-OpenIngres (Ingres), each has its own enhancements, quirks and tricks that, for all intents and purposes, makes SQL non-standard. Therefore, moving the program that creates SQL code from one vendor""s SQL database to another would require rewriting portions of the program to convert the SQL statements. As stated above, this type of maintenance is error-prone and unusable.
Accordingly, what is needed is an improved method for generating SQL statements for generating and populating vendor-independent SQL databases. The present invention addresses such a need.
The present invention provides a method and system for specifying sub-elements and attributes in a sub-tree and for extracting data values. The method and system include defining content elements as recursively hierarchical sub-trees by defining a first XML ELEMENT in a document type definition, where each content element includes a top-level element and zero or more embedded subelements, and wherein leaf nodes in the sub-trees represent data values. The method and system further include extracting data values from the content elements by using the description of the first XML ELEMENT to recursively traverse each top-level element and extract the data values from the sub-tree.
According to the method and system disclosed herein, a generic solution is for automatically generating SQL is provided that may be used with different client data and with different vendor SQL databases.