1. Field of the Invention
The present invention generally relates to a method and system for database partitioning. More particularly, this invention is related to a method and system for dynamic database partitioning.
2. Description of Related Art
In current database systems, when a user's application data is loaded, data within a database will become extremely huge as the user's business grows. For example, a table which contains invoice records over ten years records can include a few million of lines of data. Querying such a huge table is very time-consuming.
In order to improve the speed and efficiency of database queries, database partitioning technology emerged. In particular, the database in a non-shared environment is broken down into separate partitions, and each partition has its own resources. Data is distributed to different partitions through Hash algorithms, and each partition is only responsible for dealing with their own data. When a user issues a database query, certain partitions can be used to process the request. The main purpose of the database partition is to improve the data-parallel processing performance, and to reduce the amount of data read/writes in specific database query operations in order to reduce the response time.
Database partitions can be a physical partition or a logical partition. Physical partitions refer to a partition deployed on a single physical machine, while logical partitions are partitions deployed on multi-processing system (Symmetric Multi Processing, referred to as SMP) machines. The database can be implemented on either a physical partition or a logical partition. If a large SMP machine is used, the database will usually adopt a logical partition; if more than one physical machine is used, the database will usually adopt physical partitions through a non-shared architecture; if multiple SMP machines are used, the database can adopt both physical and logical partitions.
Database partitioning technology has two advantages. First is the increased database performance. Continuing the above example, the invoice records which account for several millions of lines of data can be broken into ten different partitions, each containing one year's worth of records. Query time is then reduced to only one-tenth of that for a full table scan.
Second is the simplified data management. Partitioning technology can leverage the extensive experience of database administrators to enhance data management capabilities. Through the use of partitions, the administrator can simplify the implementation of specific data.
However, if a database has already been partitioned (either by a database vendor or by a developer), the administrator can not further modify the partition rules in subsequent database operations. Initialized partition methods normally only support partitions based on a hash table. This partitioning scheme has nothing to do with the application data. For example, the database can not partition by year in above example, which makes the database lack partition flexibility.
To make up for this deficiency, many database users write application code in order to achieve database partition when initializing partitions in the database. However, it is very difficult to achieve flexible data partitioning by writing application code because they normally require help from application developers. However, transaction based applications are becoming more and more popular. For most database applications, the administrator can not design a proper partitioning scheme before the database has been extensively used. As a result, with the continuous growth of business and the increasing amount of data, the database response time will rise over time, which can lead to an unacceptable user experience. In practice, the recent data query rates can be far higher than the historical data query rate. Partition rules set by the database administrator at very beginning need to be changed over time. For example, the dividing year needs to be changed from 2009 to 2010 one year later, which means data before and after 2010 should be recorded in two different partitions.
Therefore, although the prior art provides database partitioning, it can not achieve flexible data partitioning without modifying the current application code.