Data backup involves copying data from a source system to a backup storage medium or system, sometimes referred to as a backup “target”. Data sets have grown increasingly large, and business requirements for high availability have resulted in an ever-increasing need to back up more data more quickly.
To increase throughput, many backup solutions support using multiple streams to send data from a source system to a backup storage node or other target system. Some types of data set, e.g., large database files such as SQL database files, may be stored on systems that support “striping” or other techniques to enable multiple portions of a stored object or other data set to be sent in parallel to a backup target system.
Backup storage systems, e.g., Data Domain® de-duplicating storage systems, support multiple concurrent connections by multiple threads. However, each model and configuration (e.g., processor type, amount of memory, etc.) may have a different number of connections that can be supported. For example, a Data Domain® model DD860 with 36 GB of memory advertises “soft” limits of 90 concurrent backup write streams, 50 concurrent backup read streams, 90 concurrent replication streams, and a “hard” limit of 149 write streams.
If a backup application attempts to perform a backup with a degree of parallelism that exceeds the capacity of the target system, one or more of the backup streams may “hang” or otherwise fail to be performed successfully. For example, if a user were to configure a backup application to back up a SQL instance that has 10 databases using a striping value of 16 stripes per database, the result would be an attempt to use 160 streams concurrently to back up the SQL instance, which would exceed both the 90 stream soft limit and the 149 stream hard limit of the Data Domain® target system mentioned above. The backup of the first nine databases might begin and proceed successfully, since they would require 9×16=144 connections, but the backup of the tenth database would hang or otherwise fail, because the hard limit on the number of write sessions the target system is able to support would be exceeded.
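The arithmetic in this example can be checked with a short sketch. The names below (TargetLimits, required_streams, databases_within_hard_limit) are hypothetical and chosen only for illustration; they are not part of any backup product's interface, and the sketch assumes that every database uses the same number of stripes and that databases are started one after another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetLimits:
    """Hypothetical container for a target system's write stream limits."""
    soft_write: int
    hard_write: int


def required_streams(databases: int, stripes_per_database: int) -> int:
    """Total concurrent streams if every database is striped identically."""
    return databases * stripes_per_database


def databases_within_hard_limit(databases: int, stripes_per_database: int,
                                limits: TargetLimits) -> int:
    """Number of whole databases whose stripes fit under the hard limit."""
    return min(databases, limits.hard_write // stripes_per_database)


# Limits taken from the DD860 example in the text above.
dd860 = TargetLimits(soft_write=90, hard_write=149)

total = required_streams(10, 16)
fit = databases_within_hard_limit(10, 16, dd860)

print(f"streams requested: {total}")
print(f"exceeds soft limit: {total > dd860.soft_write}")
print(f"exceeds hard limit: {total > dd860.hard_write}")
print(f"databases that fit: {fit} ({fit * 16} streams)")
```

Under these assumptions the sketch reports 160 requested streams and nine databases (144 streams) fitting under the hard limit, which matches the figures given in the example.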
The challenge becomes more complicated in settings in which multiple and potentially dissimilar backup applications (sometimes referred to as “data movers”) are configured to use the same set of backup storage nodes (target systems), since demands placed on a target system by one data mover may not be taken into consideration by other data movers.
In the typical approach, a conservative limit may be set by an administrator on the number of concurrent streams that may be used by the backup application on the source side. For example, if the soft limit on a typical target system is understood to be 90 write streams, the administrator may set the source side limit to be 60 concurrent write streams. This approach, however, may leave capacity unused at the target system, resulting in longer backup windows than would otherwise be required, and/or could still result in the limits of the target system being exceeded, e.g., if other data movers use the same target system.
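Both drawbacks of a static source side limit can be illustrated with a few lines of arithmetic. This is a minimal sketch with a hypothetical helper (write_stream_headroom), assuming each data mover drives its full configured limit against the same target system at the same time.

```python
def write_stream_headroom(target_soft_limit: int, source_limits: list[int]) -> int:
    """Soft limit minus combined source side limits.

    A positive result is capacity left idle at the target; a negative
    result means the data movers together oversubscribe the target.
    """
    return target_soft_limit - sum(source_limits)


# One data mover capped at 60 streams against a 90 stream soft limit.
single_mover = write_stream_headroom(90, [60])

# Two data movers, each capped at 60, sharing the same target system.
two_movers = write_stream_headroom(90, [60, 60])

print(f"one data mover: headroom {single_mover}")
print(f"two data movers: headroom {two_movers}")
```

With one data mover, 30 of the 90 write streams go unused; with two data movers configured the same way, the combined demand is 30 streams over the soft limit even though each source side limit looks conservative on its own.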