Various methods have been proposed for structured document management systems that store and retrieve structured documents described in Extensible Markup Language (XML) and the like.
(1) A simple method of storing structured document data intact as a text file. With this method, as the number of data items and the data size increase, storage efficiency degrades, and a retrieval process that exploits the features of structured documents becomes harder to achieve.
(2) A method of managing structured document data in an RDB (Relational Database).
(3) A method of managing structured document data using an OODB (Object Oriented Database) which has been developed to manage the structured document data.
Backbone systems predominantly use the RDB, and, for example, an XML-compatible RDB that extends the RDB is commercially available as a product. Since the RDB stores data in a flat table format, a complicated mapping is required to relate the hierarchical structure of XML data and the like to the tables. If the prior schema design for this mapping is insufficient, performance may degrade.
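The mapping burden described above can be illustrated with a minimal sketch. It assumes one simple mapping among many possible ones, in which every element becomes a row holding a reference to its parent row. The function names `shred_to_rows` and `path_of`, the row layout, and the sample document are hypothetical and are not taken from any particular RDB product.

```python
import xml.etree.ElementTree as ET


def shred_to_rows(xml_text):
    """Flatten an XML tree into (node_id, parent_id, tag, text) rows."""
    root = ET.fromstring(xml_text)
    rows = []
    next_id = 0
    # Explicit stack of (element, parent_id) for a depth-first walk.
    stack = [(root, None)]
    while stack:
        elem, parent_id = stack.pop()
        node_id = next_id
        next_id += 1
        text = (elem.text or "").strip()
        rows.append((node_id, parent_id, elem.tag, text))
        # Push children in reverse so they are visited in document order.
        for child in reversed(list(elem)):
            stack.append((child, node_id))
    return rows


def path_of(rows, node_id):
    """Rebuild the element path of a row by following parent_id links.

    Each step up the hierarchy corresponds to one more lookup in the
    flat table (a self-join in an RDB).
    """
    by_id = {row[0]: row for row in rows}
    parts = []
    while node_id is not None:
        _, parent_id, tag, _ = by_id[node_id]
        parts.append(tag)
        node_id = parent_id
    return "/" + "/".join(reversed(parts))


# Hypothetical sample document.
SAMPLE = "<book><title>XML</title><author><name>Sato</name></author></book>"
rows = shred_to_rows(SAMPLE)
name_path = path_of(rows, 3)
```

In this sketch the hierarchy survives only as `parent_id` references, so recovering the position of a deeply nested element takes one lookup per level, which is where insufficient schema design tends to cost performance.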
In recent years, a new method has been proposed in addition to these methods (1) to (3).
(4) A method of natively managing structured document data. This method stores XML data having various hierarchical structures without any special mapping process, so no special overhead is incurred upon storage or acquisition. It also obviates the need for costly prior schema design, and the XML data structure can be changed freely in response to changes in the business environment.
Even if structured document data are stored efficiently, this is pointless unless a means for extracting the stored data is available. Query languages are used as such a means. XQuery (XML Query Language) was designed for XML much as SQL (Structured Query Language) was designed for the RDB. XQuery is a language used to handle XML data like a database. For this purpose, it provides means for extracting a data set that matches a condition, and means for compiling and parsing data. Also, since XML data have a hierarchical structure formed by combining parent elements, child elements, sibling elements, and the like, XQuery provides means for tracing such a hierarchical structure.
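The kinds of operations described above (extracting the data that match a condition, and stepping between parent, child, and sibling elements) can be sketched as follows. This is not XQuery. It uses the limited XPath subset of the Python standard library module `xml.etree.ElementTree` as a stand-in, and the catalog document and the helper `siblings` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
CATALOG = """
<catalog>
  <book><title>XML Basics</title><year>2001</year></book>
  <book><title>Native Stores</title><year>2004</year></book>
  <book><title>Query Design</title><year>2004</year></book>
</catalog>
"""

root = ET.fromstring(CATALOG)

# Condition match: every <book> with a child <year> whose text is "2004".
books_2004 = root.findall("./book[year='2004']")
titles_2004 = [book.findtext("title") for book in books_2004]

# Parent step: from each <title>, step up to the enclosing element.
parents = root.findall(".//title/..")


def siblings(parent, child):
    """Return the other children of parent.

    ElementTree has no sibling axis, so siblings are reached through
    the parent's child list.
    """
    return [c for c in parent if c is not child]


first_book = parents[0]
title = first_book.find("title")
sibling_tags = [s.tag for s in siblings(first_book, title)]
```

A full XQuery processor expresses the same steps declaratively in a single query; the sketch only shows that each step is a traversal of the element hierarchy.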
A technique for retrieving structured document data that includes a specific element and specific structure designated by a retrieval condition while tracing the hierarchical structure of the stored structured document data has already been proposed (e.g., Jpn. Pat. Appln. KOKAI Publication Nos. 2002-34618 and 2000-57163).
As the structure of the structured document data becomes larger in scale, the number of structured document data stored in a database increases, and the retrieval condition becomes more complicated, a longer time is required to trace the elements which form the hierarchical structure of each structured document data. Also, as the number and size of structured document data increase, it becomes impossible to load all of the stored structured document data into memory, so most of the structured document data are stored in a secondary storage such as a hard disk or the like.
In the method of natively managing structured document data, the hierarchical structure among elements of the structured document data is stored intact. To check whether an element or structure designated as a retrieval condition is included, elements of the structured document data stored on the secondary storage must be accessed frequently. A complicated retrieval condition requires still more accesses.
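The access cost described above can be made concrete with a minimal sketch. It assumes a hypothetical `CountingStore` in which every element fetch is counted as a stand-in for one read from the secondary storage. The class, the function names, and the sample document are illustrative assumptions and do not describe any actual product or the cited publications.

```python
import xml.etree.ElementTree as ET


class CountingStore:
    """Hypothetical store that counts how many elements are fetched."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)
        self.accesses = 0

    def root(self):
        self.accesses += 1
        return self._root

    def children(self, elem):
        kids = list(elem)
        self.accesses += len(kids)
        return kids


def contains_path(store, tags):
    """Return True if a chain of directly nested elements matches tags."""

    def match_from(elem, i):
        if elem.tag != tags[i]:
            return False
        if i == len(tags) - 1:
            return True
        return any(match_from(c, i + 1) for c in store.children(elem))

    def search(elem):
        # Try to match the structure here, then at every descendant.
        if match_from(elem, 0):
            return True
        return any(search(c) for c in store.children(elem))

    return search(store.root())


def count_accesses(xml_text, tags):
    """Run one retrieval and report (found, number of element fetches)."""
    store = CountingStore(xml_text)
    found = contains_path(store, tags)
    return found, store.accesses


# Hypothetical sample document.
DOC = "<doc><sec><p>a</p></sec><sec><note><p>b</p></note></sec></doc>"
shallow = count_accesses(DOC, ["sec", "p"])
deep = count_accesses(DOC, ["note", "p"])
missing = count_accesses(DOC, ["sec", "title"])
```

In this sketch a structure located deeper in the document costs more fetches than a shallow one, and a condition that matches nothing forces the whole hierarchy to be traced before the retrieval can report failure.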
Conventionally, when structured document data having a desired element or structure is retrieved from a database that stores structured document data with a hierarchical structure, a high-speed retrieval process cannot be attained, because the structured document data having the element or structure designated by the retrieval condition is found by tracing the elements which form the hierarchical structure of each structured document data in the database. In particular, a high-speed retrieval process becomes more difficult to attain as the size of the structured document data and the number of structured document data to be retrieved increase.