1. Field of the Invention
The present invention relates to a structured document database having a hierarchical logical structure, and has as its object to provide a method and apparatus for storing structured document data so as to attain a high-speed retrieval process of structured document data.
2. Description of the Related Art
Several methods have been proposed for a structured data management system that stores and retrieves structured documents described in Extensible Markup Language (XML) and the like.
(1) A simple method of storing structured data intact as a text file. With this method, as the number of data items and the data size increase, the storage efficiency drops, and a retrieval process that exploits the features of structured documents becomes harder to achieve.
(2) A method of managing structured document data in an RDB (Relational Database).
(3) A method of managing structured document data using an OODB (Object Oriented Database), which has been developed to manage structured document data.
Backbone systems predominantly use RDBs, and, for example, XML-compatible RDBs that extend the RDB are commercially available as products. Since an RDB stores data in a flat table format, complicated mapping is required to associate a hierarchical structure such as that of XML data with tables. If the prior schema design for this mapping is insufficient, performance may drop.
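The mapping between a hierarchical structure and a flat table can be illustrated with a minimal sketch. It assumes one simple scheme, chosen here only for illustration, in which each element becomes a row of the form (node_id, parent_id, tag, text); the function name `shred` and the sample document are hypothetical, and actual XML-compatible RDB products may use quite different mappings.

```python
import xml.etree.ElementTree as ET

def shred(xml_text):
    """Flatten an XML tree into (node_id, parent_id, tag, text) rows."""
    root = ET.fromstring(xml_text)
    rows = []
    stack = [(root, None)]  # (element, id of its parent row)
    while stack:
        elem, parent_id = stack.pop()
        node_id = len(rows)
        rows.append((node_id, parent_id, elem.tag, (elem.text or "").strip()))
        # Push children in reverse so they are popped in document order.
        for child in reversed(list(elem)):
            stack.append((child, node_id))
    return rows

def path_of(rows, node_id):
    """Rebuild the tag path of a row by following parent_id links upward."""
    tags = []
    while node_id is not None:
        _, parent_id, tag, _ = rows[node_id]
        tags.append(tag)
        node_id = parent_id
    return "/".join(reversed(tags))

# Hypothetical sample document.
SAMPLE = "<book><title>XML</title><author><name>Ann</name></author></book>"
ROWS = shred(SAMPLE)
NAME_PATH = path_of(ROWS, 3)
```

In this sketch, recovering the position of a single element in the hierarchy requires following one parent link per level, which corresponds to repeated self-joins on a table. This is one reason the schema design for such a mapping affects performance.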
In recent years, a new method has been proposed in addition to these methods (1) to (3).
(4) A method of natively managing structured document data. This method stores XML data having various hierarchical structures without any special mapping process. For this reason, no special overhead is produced upon storage or acquisition. Also, costly prior schema design becomes unnecessary, and the XML data structure can be freely changed as needed in response to a change in business environment.
Even if structured document data are stored efficiently, they are useless without a means for extracting the stored data. Query languages serve as such means. XQuery (XML Query Language) has been designed for XML, much as SQL (Structured Query Language) was designed for RDBs. XQuery is a language for handling XML data like a database. For this purpose, it provides means for extracting a data set that matches a condition and means for compiling and parsing data. Also, since XML data have a hierarchical structure formed by combining parent elements, child elements, sibling elements, and the like, it provides means for tracing that hierarchical structure.
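As a rough illustration of extracting a data set that matches a condition while tracing parent and child elements, the following sketch uses the limited XPath subset supported by Python's standard `xml.etree.ElementTree` module. It is not XQuery itself; the sample document and the function name `titles_by` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: a parent element with three child documents.
DOC = ET.fromstring(
    "<library>"
    "<book><title>A</title><author>Ann</author></book>"
    "<book><title>B</title><author>Bob</author></book>"
    "<book><title>C</title><author>Ann</author></book>"
    "</library>"
)

def titles_by(author):
    """Return titles of book elements whose author child has the given text."""
    # The predicate [author='...'] keeps only book elements that have a
    # child element named author with exactly that text content.
    books = DOC.findall("./book[author='%s']" % author)
    return [book.findtext("title") for book in books]

TITLES = titles_by("Ann")  # ['A', 'C']
```

The condition here combines a structural part (a book element with an author child) and a value part (the author text), which is the kind of retrieval condition that requires the hierarchy to be traced.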
A technique for retrieving structured document data that includes a specific element and specific structure designated by a retrieval condition while tracing the hierarchical structure of the stored structured document data has already been proposed (e.g., Jpn. Pat. Appln. KOKAI Publication Nos. 2001-34618 and 2000-57163).
As structured document data grow in scale, as the number of structured documents stored in a database increases, and as retrieval conditions become more complicated, more time is required to trace the elements that form the hierarchical structure of each structured document. Also, as the number and size of structured documents increase, it becomes impossible to expand all stored structured document data in memory, so most of the structured document data are stored in secondary storage such as a hard disk.
In the method of natively managing structured document data, the hierarchical structure among elements of the structured document data is stored intact. In order to check if an element or structure designated as a retrieval condition is included, elements of structured document data stored on the secondary storage must be frequently accessed. Still more accesses are required for a complicated retrieval condition.
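The access cost described above can be made concrete with a small sketch. It assumes, purely for illustration, that each visit to an element corresponds to one access to stored element data, and it checks each document for a designated element by a depth-first trace; the names `has_element` and `search` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

def has_element(root, tag, counter):
    """Depth-first trace for an element named tag, counting each visit."""
    stack = [root]
    while stack:
        elem = stack.pop()
        counter[0] += 1  # one access to a stored element
        if elem.tag == tag:
            return True
        stack.extend(elem)  # iterating an Element yields its children
    return False

def search(docs, tag):
    """Return indices of matching documents and the total access count."""
    counter = [0]
    hits = [i for i, doc in enumerate(docs) if has_element(doc, tag, counter)]
    return hits, counter[0]

# Hypothetical stored documents containing 4, 3, and 2 elements.
DOCS = [ET.fromstring(s) for s in (
    "<a><b/><c><d/></c></a>",
    "<a><b/><c/></a>",
    "<a><d/></a>",
)]
HITS, ACCESSES = search(DOCS, "d")  # ([0, 2], 8)
```

Even for this single-element condition, eight of the nine stored elements are visited, and a document that does not match must be traced in full before it can be rejected. The count grows with the number and size of the documents.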
Conventionally, to retrieve structured document data having a desired element or structure from a database that stores structured document data with a hierarchical structure, the data having the element or structure designated by the retrieval condition are found by tracing the element data that form the hierarchical structure of each structured document in the database. For this reason, a high-speed retrieval process cannot be attained. In particular, high-speed retrieval becomes harder to attain as the size and number of structured documents to be retrieved increase and as retrieval conditions become more complicated.