1. Field of the Invention
The present invention relates to the art of information processing. It finds particular application in high availability database systems employing range tree indexing, and will be described with particular reference thereto. However, the present invention is useful in other information storage environments that employ hot backup systems and user-defined indexing.
2. Description of Related Art
Database environments for businesses and other enterprises should have certain characteristics, including high reliability, robustness in the event of a failure, and fast and efficient search capabilities. High reliability includes ensuring that each transaction is entered into the database system. Robustness includes ensuring that the database is fault-tolerant, that is, resistant to hardware, software, and network failures. High reliability and robustness are important in many business settings where lost transactions or extended server downtime can be a severe hardship, resulting in lost sales, improperly tracked or lost inventories, missed product deliveries, and the like.
To provide high reliability and robustness in the event of a database server failure, high availability data replicators are advantageously employed. These data replicators maintain a “hot backup” server having a duplicate copy of the database that is synchronized with the primary database deployed on a primary server. The primary server is ordinarily accessed by database users for full read/write access. Preferably, the secondary server handles some read-only database requests to help balance the user load between the primary and secondary servers. Database synchronization is maintained by transferring database log entries from the primary server to the secondary server. The transferred database logs are replayed on the secondary server to duplicate the corresponding transactions in the duplicate copy of the database. With such a data replicator, a failure of the primary server does not result in failure of the database system; rather, in the event of a primary server failure the secondary server takes over as an interim primary server until the failure can be diagnosed and resolved. The secondary server can provide users with read-only access or with full read/write access to the database system during the interim.
Advantageously, high availability data replicators provide substantially instantaneous fail-over recovery for substantially any failure mode, including failure of the database storage medium or media, catastrophic failure of the primary server computer, loss of primary server network connectivity, extended network lag times, and the like. The secondary server is optionally located geographically remote from the primary server, for example in another state or another country. Geographical remoteness ensures substantially instantaneous fail-over recovery even in the event that the primary server is destroyed by an earthquake, flood, or other regional catastrophe. As an added advantage, the secondary server can be configured to handle some read-only user requests when both primary and secondary servers are operating normally, thus balancing user load between the primary and secondary servers.
A problem can arise, however, in that high availability data replication is not compatible with certain database features that do not produce database log entries. For example, a range tree index (also known in the art as an R-tree index) includes user-defined data types and user-defined support and strategy functions. Employing an R-tree index or other type of user-defined index system substantially improves the simplicity and speed of database queries for certain types of queries. An R-tree index, for example, classifies multi-dimensional database contents into hierarchical nested multi-dimensional range levels based on user-defined data types and user-defined routines. A database query accessing the R-tree index is readily restricted to one or a few range levels based on dimensional characteristics of parameters of the database query. The reduced scope of data processed by the query improves speed and efficiency. Advantageously, the R-tree index is dynamic, with the user-defined routines re-classifying database contents into updated hierarchical nested multi-dimensional range levels responsive to changes in database contents.
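By way of illustration only, the restriction of a query to a few nested range levels can be sketched as follows. This is a minimal sketch assuming two-dimensional data and a hand-built, static hierarchy; the names `Node`, `overlaps`, and `search` are hypothetical, and the dynamic re-classification performed by the user-defined routines is not modeled.

```python
from dataclasses import dataclass, field

# A range is a 2-D bounding box: (xmin, ymin, xmax, ymax).


@dataclass
class Node:
    """One range level; it encloses its child levels and its own items."""
    box: tuple
    children: list = field(default_factory=list)
    items: list = field(default_factory=list)  # (box, payload) pairs


def overlaps(a: tuple, b: tuple) -> bool:
    """True if boxes a and b intersect (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(node: Node, query: tuple) -> list:
    """Return payloads whose boxes overlap the query box.

    A subtree is skipped entirely when its enclosing range does not
    overlap the query, which is what limits the data examined.
    """
    if not overlaps(node.box, query):
        return []
    hits = [payload for box, payload in node.items if overlaps(box, query)]
    for child in node.children:
        hits.extend(search(child, query))
    return hits


# Usage: two leaf ranges nested under one root range.
west = Node(box=(0, 0, 10, 10),
            items=[((1, 1, 2, 2), "a"), ((8, 8, 9, 9), "b")])
east = Node(box=(20, 0, 30, 10),
            items=[((21, 1, 22, 2), "c")])
root = Node(box=(0, 0, 30, 10), children=[west, east])

print(search(root, (0, 0, 3, 3)))  # ['a']; the east range is never scanned
```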
The operations involved in creating the user-defined routines that define the R-tree index occur outside the database system and typically do not generate corresponding database log entries. As a result, heretofore R-tree indexes and other user-defined indexes have been incompatible with high availability data replication. Because the R-tree index is not transferred to the duplicate database on the secondary server during log-based data replication, subsequent database log entries corresponding to queries which access the R-tree index are not properly replayed on the secondary server.
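By way of illustration only, the incompatibility described above can be sketched as follows. This is a minimal sketch under stated assumptions: the names `Server`, `install_routine`, `execute`, and `replay` are hypothetical, a user-defined routine is modeled as a Python function registered by name, and the improper replay is modeled as a lookup failure, which is only one way such a failure might manifest.

```python
class Server:
    """Toy server whose logged operations refer to user-defined routines."""

    def __init__(self):
        self.routines = {}  # user-defined index routines, installed out of band
        self.rows = []      # (index key, row) pairs
        self.log = []       # logged operations only

    def install_routine(self, name, fn):
        """Install a user-defined routine. No log entry is generated."""
        self.routines[name] = fn

    def apply(self, entry):
        """Apply a logged insert that indexes its row with a named routine."""
        _op, routine_name, row = entry
        key = self.routines[routine_name](row)  # KeyError if never installed
        self.rows.append((key, row))

    def execute(self, entry):
        """Primary path: apply the operation, then record it in the log."""
        self.apply(entry)
        self.log.append(entry)


def replay(primary, secondary):
    """Replay every logged operation of the primary on the secondary."""
    for entry in primary.log:
        secondary.apply(entry)


# Usage: the routine exists only on the primary, so replay fails.
primary, secondary = Server(), Server()
primary.install_routine("area", lambda row: row[0] * row[1])
primary.execute(("insert", "area", (3, 4)))

try:
    replay(primary, secondary)
    replay_ok = True
except KeyError:
    replay_ok = False

print(replay_ok)  # False
```

The log entry itself is transferred intact in this sketch; the replay fails because the routine it depends on was never part of the log.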
One way to address this problem would be to construct the R-tree index entirely using database operations which create corresponding database log entries. However, constructing the user-defined routines within the strictures of logged database operations would substantially restrict flexibility of user-defined routines defining the R-tree index system, and may in fact be unachievable in certain database environments.
In another approach to overcoming this problem, identical copies of the user-defined routines defining the R-tree index are separately installed on the primary and secondary servers prior to initiating database operations. This solution has certain logistical and practical difficulties. The user-defined routines should be installed identically on the primary and secondary servers to ensure reliable and robust backup of database operations which invoke the R-tree index. Because the primary and secondary servers may be located in different cities, in different states, or even in different countries, ensuring identical installation of every user-defined routine of the R-tree on the two servers can be difficult. In the event of a fail-over, it may be necessary to repeat the installation of the user-defined routines on the failed server, further increasing downtime.
The present invention contemplates an improved method and apparatus which overcomes these limitations and others.