1. Field of the Invention
The present invention relates to the field of computer programming, and more particularly to a method, system, and computer program product for using object-level cursors to improve the efficiency of systems which map relational database retrieval results to objects.
2. Description of the Related Art
Relational database operations return results formatted as rows, where these rows correspond to fields from one or more tables of the underlying data model, as is well known in the art. In some software applications, the number of rows that will be returned from a retrieval operation (where these returned rows are referred to equivalently as a xe2x80x9cresult setxe2x80x9d) can be very large. For example, in a batch-oriented tax accounting application the number of rows returned for a query involving taxpayers could be hundreds of thousands or even millions. All processing systems have a memory limit, and thus it may be impossible to retrieve very large result sets into memory at one time for processing.
Prior art relational database systems have provided xe2x80x9ccursorsxe2x80x9d to deal with this problem. The database system logically divides a result set into relatively small subsets, of small enough size to be read into the application""s memory for processing. The system then maintains a database cursor as a logical pointer into the result set, where the cursor is moved from one subset of rows to another to keep track of which rows the application has already processed and which have not yet been retrieved for processing. This processing of subsets and advancing of the cursor is repeated until the entire result set has been processed. When the cursor is advanced, the memory that was used to hold the previous subset of data is released and thus is again available to the system. With this type of approach, a database retrieval result set can be arbitrarily large since there are no memory-related processing limitations.
Today, systems are often developed using an object-oriented language such as Java to build the application logic and a relational database to provide back-end storage for data. (xe2x80x9cJavaxe2x80x9d is a trademark of Sun Microsystems, Inc.) Rather than dealing directly with the relational database, there is an increasing trend for applications to employ a mapping layer to translate between the object orientation of the application and the relational orientation of the database. When performing a retrieval operation, the mapping layer typically translates the relational rows retrieved by the database engine into objects, and returns these objects for processing by the application. Conversely, when the data store is to be updated as a result of the application""s processing, the mapping layer typically translates objects sent by the application into relational row format for storing by the database engine. An example configuration of the components in a system employing this type of mapping function is shown in FIG. 1.
A common approach used by mapping layer software of this type for managing the life cycle of a created object is to bind the object""s life cycle to the underlying database transaction. That is, once the mapping layer creates an object, the object will not be destroyed until the database transaction has completed. When the result set is large, this approach creates a problem. Even if a database cursor is used to stream over the result set, as described above, the objects created from the result set by the mapping layer still live in memory after the database cursor is advanced, since the entire result set is still within the scope of the underlying transaction. The memory used to hold the subset of the result set rows is released when the database cursor is advanced, but the corresponding memory used to hold the objects is not. Thus, the original problem simply moves from one component of the system to another, and the size of the result set is again constrained by the amount of memory available.
There are a number of scenarios in which this problem is compounded. A first such scenario is when a database cursor is used for pre-loaded, or xe2x80x9cread-aheadxe2x80x9d, dataxe2x80x94that is, when the data returned from a database retrieval operation includes data for not only the root of an object graph (i.e. where xe2x80x9cobject graphxe2x80x9d refers to multiple objects, having interrelationships that form a graph structure) but also data for related objects that may be navigated to from the root. A second scenario is when a database cursor is used for retrieving a result set, and the application then retrieves related rows using independent queries. The problems resulting from these example scenarios will now be described in more detail.
As an example of the first scenario, consider an object model that has three classes Department, Employee, and Address, as shown in FIG. 2. Also assume that this object model is mapped to a relational database with a similar structure of tables, such that the database schema has three tables DepartmentTable, EmployeeTable, and AddressTable. A DepartmentTable row is related to one or more EmployeeTable rows (as indicated by the corresponding 1-to-many relationship depicted in FIG. 2) by a foreign key column in the EmployeeTable, where this foreign key specifies a DepartmentTable primary key. Also, an EmployeeTable row is related to one or more AddressTable rows in a similar manner. A cursored query can be constructed to return all DepartmentTable rows that match a certain criteria (such as xe2x80x9cDepartmentTable.Dept=xe2x80x9cD*xe2x80x9d) along with (1) the EmployeeTable rows which reference the selected department rows and (2) the AddressTable rows which reference the selected employee rows. In this case, there may be many rows of data for each root object (where in this example query, the root is a department) that is selected by the query. FIG. 3 illustrates a simple example where two departments xe2x80x9cD1xe2x80x9d and xe2x80x9cD2xe2x80x9d were selected by a cursored query. The corresponding root objects are D1300 and D2310. The result set 330 depicts three returned rows 340, 350, 360 for department D1 and two returned rows 370, 380 for department D2 (where this example result set represents a join of each of the three underlying tables and shows only a primary key value from each). As will be obvious, a result set will often contain many more rows and many more columns than illustrated in FIG. 3.
Using this example result set 330, the database cursor is first positioned 320a at the start of the result set. Suppose the mapping layer software described above structures the result set according to the root objects therein, such that the database cursor is advanced from one root object to the next. In this case, the mapping layer software will create objects representing rows 340, 350, and 360 and store these objects into the application""s memory. When the application traverses from department D1 to department D2, the database cursor moves to position 320b. The mapping layer software will then create objects representing rows 370 and 380. If more than two root objects are represented by the result set, this process of moving the database cursor and creating new objects for each subset of the result set is repeated. As can be seen from this simple example, processing a result set that represents a fairly large number of objects (as will be typical during operation of many applications) will consume large amounts of memory for storing those objects.
As an example of the second scenario described above, suppose a cursored query retrieves only the Department and Employee rows as described with reference to FIG. 3, but does not retrieve Address rows (such that the result set 330 in this example does not contain the information depicted in the third column of FIG. 3). Further suppose that the database schema contains a Project table (and that its corresponding object model contains a Project class) for the project(s) an employee works on. After issuing the cursored query, suppose the application then processes the Department and Employee objects created by the mapping layer from the result set, and may selectively retrieve additional information related to these objects. For example, Address data may be retrieved for Employees, foreign key values may be used to retrieve an employee""s manager row from the Employee table, project information may be retrieved for each Employee from the Project table, etc. Each such retrieval results in additional objects being created by the mapping layer, and each such related object further compounds the problem of managing the memory usage of the system.
Accordingly, what is needed is an improved technique for managing objects that are created for result sets retrieved in relational database operations.
An object of the present invention is to provide an improved technique for managing objects created in response to retrieving data from a relational database.
Another object of the present invention is to provide this technique by using a temporary object cache.
A further object of the present invention is to provide this technique by defining an object cursor which is mapped to the database cursor for a particular transaction.
Another object of the present invention is to provide this technique where the source of the retrieved data is a relational database, and where the destination of the data is an application written using an object-oriented programming language.
Other objects and advantages of the present invention will be set forth in part in the description and in the drawings which follow and, in part, will be obvious from the description or may be learned by practice of the invention.
To achieve the foregoing objects, and in accordance with the purpose of the invention as broadly described herein, the present invention provides a method, system, and computer-readable code for efficiently managing object life cycles in a computing environment. This technique comprises issuing a cursored query and processing one or more objects represented in one or more rows of a result set of the cursored query by selectively storing particular ones of the objects in a reusable object cache.
Processing the objects preferably further comprises: extracting the particular ones of the objects from the result set, wherein each of the extracted objects is related to a particular root object represented in the rows; storing the extracted objects in the object cache; positioning an object cursor to point to the particular root object; operating upon the stored objects, creating one or more changed objects; and advancing the object cursor. Advancing the object cursor further comprises writing the changed objects to a persistent object store and emptying the object cache following the writing operation. The processing may be repeated for each of one or more additional root objects represented in the rows. This repeating enables successively processing each of the root objects using the object cursor.
One or more additional objects related to one or more of the stored objects may be separately retrieved, and these separately retrieved objects may be added to the stored objects in the object cache.
The present invention will now be described with reference to the following drawings, in which like reference numbers denote the same element throughout.