A table within a database may be partitioned. If it is partitioned, rows are allocated to partitions based on a partition specification. Such a partition specification may for example be a string like “HASH 4 col1” where “HASH” identifies the used algorithm, “4” is the number of partitions and “col1” is the column of which the values are hashed to determining the target partition for a row.
When a table is created, it may directly be created as a partitioned table. It is also possible to split a non-partitioned table into a partitioned table. Partitioned tables may be re-partitioned again into a table with a different partition specification. In addition, the partitions of a table may be merged into a non-partitioned table. All of these transitions/operations can be referred to, unless otherwise specified, as “re-partitioning.”
Re-partitioning means that in most cases all data has to be moved from one location (partition) to another. Such operations are comparable to copy and delete operations. In order to be compliant with backup & recovery concepts, it is also required to write redo log information which doubles the amount of data that has to written to disk.
Overall such operations are very costly from an I/O, main memory and CPU perspective. Moreover, the actual re-partitioning of a column requires that for writing, all required source pieces are available on a local server. For some cases, this can be optimized: For example a table “HASH 2 col1” has two partitions that may be located on two servers. If the table is to be re-partitioned to “HASH 4 col1”, each of the servers can split the local partition locally as all required source data is present. This is different, if for example a table with “HASH 2 col1” is to be re-partitioned to “HASH 3 col1”. All source partitions have to be moved to a common location where the re-partition operation can then take place. The resulting three partitions then have to be moved to respective target servers. Moving, in this regard, requires that there is a server in the landscape which is big enough to hold all data of the table as the resources (I/O, main memory and CPU) will only be utilized on that single server.
In addition to the high resource consumption and the need to move everything to a common location, there is one more aspect which is often the most critical one: While a table is re-partitioned, an exclusive lock is held on the table which prevents write operations. Only read operations are possible during re-partitioning.
Typically only the big tables are subject to re-partitioning. With such data tables, the data volume is high and hence re-partitioning can often require up to one hour or longer to finalize. During normal operation of the database while the table is in use, this prolonged amount time is not acceptable for both OLTP and OLAP processing. Therefore, given such time constraints, it is highly recommended to perform re-partitioning of huge tables only during a downtime.