1. Field of the Invention
This invention relates in general to the reorganization of data, and in particular to developing a cost-effective reorganization plan for reorganizing data of a database.
2. Description of the Related Art
A database is generally a collection of information organized such that computer programs can quickly access desired portions of the information. The information is typically referred to as data and the computer program enabling a user to enter, modify, delete, or otherwise organize select data in the database, is often called a database management system (DBMS). Generally, DBMSs govern the internal organization of data. For example, the terms xe2x80x9crelational,xe2x80x9d xe2x80x9cnetwork,xe2x80x9d xe2x80x9cflat,xe2x80x9d xe2x80x9chierarchical,xe2x80x9d and the like, all refer to ways differing DBMSs organize data. Often, these different internal structures affect how quickly and flexibly data can be extracted from a database. The foregoing notwithstanding, most modern, general-purpose moderate to high-volume databases are designed based on what is known in the art as a Relational Model, and are generally referred to as relational DBMSs. Because the instant invention is relevant to, among other things, a wide number of database platforms, including database management systems (DBMS) and relational DBMSs (RDBMS) systems, these platforms and systems will hereinafter simply be referred to as xe2x80x9cDBMS.xe2x80x9d
With the advent of highly sophisticated DBMSs, such as those offered from Oracle Corporation, IBM, or the like, users are demanding virtually non-interrupted access from virtually anywhere to applications and data. Moreover, users are demanding that accesses occur at peak or near peak performance levels. However, as users or applications, hereinafter referred to as xe2x80x9capplications,xe2x80x9d continually modify the data stored in a particular database, the performance of that database is often adversely affected. For example, as objects of a database grow and shrink in size, the database becomes fragmented, resulting in the database having a large footprint with excessive free space. Moreover, the changing objects may generate chained rows, resulting in a need for multiple disk accesses to acquire desired data. Thus, fragmentation and chaining generally increase data access times and decrease database performance. In order to maintain the accessibility and performance of a particular database, administrators often need to reorganize the data contained therein.
One solution for reorganizing data within a database involves use of storage areas reserved for DBMS use rather than operating system storage. The memory areas are generally referred to as a scratchpad area of memory or scratchpad storage. According to the solution, the data of an original database is completely copied to the scratchpad storage, then copied back to the original location of the original database. The process of copying the data allows the DBMS to remove chained rows and fragmentation from objects and to reduce fragmentation of unallocated storage in the DBMS.
The foregoing reorganization solution suffers from a variety of drawbacks. For example, in order to completely copy each object of the original database to the scratchpad storage, the DBMS needs a scratchpad storage at least as large as the original database. As the size of databases grow, the practicality and likelihood that large enough scratchpad storage will be available decreases. In addition, movement of data within a DBMS is generally very costly in terms of time. Because the foregoing reorganization solution copies data to the scratchpad storage, and then back to the original database, the amount of data that will be moved is twice the size of the original database. Similar to the foregoing, as the size of databases increase, the cost in terms of reorganization time increases at double the rate.
In addition to the foregoing drawbacks, the reorganization solution reorganizes entire databases, irrespective of whether certain objects within the database are even in need of reorganization. For example, while some objects within a database may have experienced dramatic changes in the size of the data stored therein, other objects within the database may not have had data added, modified, or deleted since allocation or the last reorganization. Therefore, some objects of the database may be in great need of reorganization, while others are not.
Based on the above, a need exists for a faster, less-costly, reorganization solution. In addition, a need exists for a reorganization solution that accounts for the varying need of reorganization among varying objects within a particular database. Accordingly, the present invention includes a reorganization planner that develops a reorganization plan by employing concepts of benefits and costs to the reorganization of objects of a database. According to another embodiment, the reorganization planner, when feasible, performs in-place reorganization of objects. According to one embodiment, in-place reorganization may advantageously reduce the cost of reorganization by at least half.
In addition, according to one embodiment of the invention, the reorganization planner develops the reorganization plan by scheduling the order of reorganization of the objects within a database according to a benefit-cost ratio. Moreover, according to additional aspects of an embodiment of the invention, the reorganization plan includes a benefit threshold, such that objects below the benefit threshold are not recommended for reorganization. The foregoing reorganization planner advantageously allows a system administrator to properly prioritize objects to be reorganized.
According to one embodiment, maintenance operations, such as, for example, reorganizing objects, often make various parts of the database inaccessible to the general user population for a given period of time. That period of time is typically referred to as the batch processing window, and often occurs at night or on the weekends. Thus, by allowing the system administrator to properly prioritize, for example, the reorganization of objects, the reorganization application advantageously achieves much greater benefits during a given batch processing window than could be accomplished through arbitrarily choosing objects to reorganize. The foregoing greater benefit translates into increased database performance (in terms of greater data throughput and faster query response time) to the general user population subsequent to the batch processing window.
Therefore, one aspect of the invention includes a method of developing a plan to reorganize a tablespace that increases the benefit derived from the reorganization while decreasing the cost associated with the reorganization. The method comprises determining an object benefit for each of multiple objects of a tablespace, where each object benefit is associated with the reorganization of one of the multiple objects. The method also includes determining an object cost for each of the multiple objects, where each object cost is associated with the reorganization of one of the multiple objects. The method also includes, for each object, generating a benefit-cost indicator which accounts for object benefit and the object cost associated with the reorganization of the object, and developing a plan for reorganizing the tablespace based on at least the benefit-cost indicator.
Another aspect of the invention includes a method of generating a plan for reorganizing a database. The method comprises determining the benefits and costs of reorganizing one or more objects within a database, and ordering the reorganization of the one or more objects based on the benefits and costs, thereby reorganizing objects having a higher benefit and a correspondingly lower cost before reorganizing objects having a lower benefit and a correspondingly higher cost.
Another aspect of the invention includes a reorganization planning system. The reorganization planning system includes one or more database files storing one or more tablespaces having one or more objects and a system table having information on the one or more objects, and a database management system communicating with the database files, thereby governing the modification of the one or more objects. The reorganization planning system also includes a reorganization planner communicating with the database management system to access system table information associated with the one or more objects, thereby determining benefits and costs of reorganizing at least one of the one or more objects. Wherein the reorganization planner schedules the reorganization of the at least one of the one or more objects based on the benefits and costs.
Another aspect of the invention includes a method of generating a plan for reorganizing data. The method comprises determining a benefit-cost indicator for each of multiple groups of data, where each benefit-cost indicator is associated with reorganizing of at least one of the groups of data stored in a storage medium. The method also includes generating a plan for reorganizing the groups of data based on the benefit-cost indicator, thereby reorganizing at least one group of data having a higher benefit-cost indicator before another group of data.
Another aspect of the invention includes a method of developing a plan to reorganize a tablespace that increases the benefit derived from the reorganization while decreasing the cost associated with the reorganization. The method includes for multiple objects in a tablespace, calculating a benefit and a cost derived from reorganizing each of the multiple objects, where the calculation includes information related to characteristics of the multiple objects. The method also includes calculating an indicator corresponding to each of the multiple objects, where the indicator is greater for objects having a larger benefit per unit cost and is less for objects having a lower benefit per unit cost. The method also includes selecting the object having the best indicator, and developing a plan for reorganization based on the indicator.
Another aspect of the invention includes a software system designed to plan the reorganization of multiple objects stored in a database. The software system accesses the multiple objects through a database management system that tracks multiple characteristics for each of the multiple objects. The software system comprises software commands which access system tables of a database to read values of characteristics of multiple objects of the database, the values providing an indication of data fragmentation in the database. The software system also comprises software commands which calculate benefits and costs associated with the reorganization of the multiple objects from the values. The software system also comprises software commands which iteratively select ones of the multiple objects having a higher benefit-cost ratio than the other objects, and software commands which include the selected ones of the multiple objects into a plan for reorganizing one or more of the multiple objects.
Another aspect of the invention includes a method of planning the reorganization of multiple objects of a tablespace by lowering the amount of data movement associated with the reorganization. The method includes determining a measure of contiguous free space among multiple objects of a tablespace, the multiple objects storing data. The method also includes for at least one object of the multiple objects, determining whether sufficient contiguous free space exists within the tablespace to recreate the object, and when the sufficient contiguous free space exists within the tablespace, developing a plan for reorganizing the object.
Another aspect of the invention includes a method of planning the reorganization of multiple objects of a tablespace by lowering the amount of data movement associated with the reorganization. The method comprises determining a measure of contiguous free space among multiple objects of a tablespace. The method also includes for at least one object of the multiple objects, determining whether sufficient contiguous free space exists within the tablespace to recreate the object. The method also includes when the sufficient contiguous free space does not exist, developing a cost effective plan for moving one or more other objects from the tablespace in order to create sufficient contiguous free space to recreate the object.
Yet another aspect of the invention includes a method of creating contiguous free space within a tablespace for use during the reorganization of an object of the tablespace by temporarily moving one or more other objects out of the tablespace. The method includes determining the benefits and costs of temporarily moving one or more select objects out of a database, and based on the benefits and costs, temporarily storing at least one of the one or more select objects outside the tablespace, thereby creating contiguous free space within the tablespace. The method also includes recreating at least one other object in the contiguous free space.
Another aspect of the invention includes a method developing a plan for reorganizing select objects within a database while excluding from the plan other objects based on the benefits and costs associated with reorganization. The method includes determining an object benefit for each of multiple objects of a database, each object benefit being associated with the reorganization of one of the multiple objects. The method also includes determining an object cost for each of the multiple objects, each object cost being associated with the reorganization of one of the multiple objects, and for each object, generating a benefit-cost indicator which accounts for object benefit and the object cost associated with the reorganization of the object. The method also includes determining a total benefit associated with reorganizing each of the multiple objects of the database, and developing a plan for reorganizing the database. Wherein the plan includes those of the multiple objects whose aggregate object benefits account for a threshold value of the total benefit.
These and other aspects, advantages, and novel features of the invention will become apparent upon reading the following detailed description and upon reference to the accompanying drawings. In the drawings, same elements have the same reference numerals.