Cloud computing provides a shared pool of configurable computing resources (e.g., computer networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort. Cloud computing allows users with various capabilities to store and process their data in either a private cloud or a public (third-party owned) cloud in order to make data access easier and more reliable. Large-scale cloud computing infrastructure and services are often provided by cloud providers that maintain data centers that may be located long distances from many of the users.
Cloud networks are widely used for large-scale data backup operations by enterprises that process large amounts of data on a regular basis, such as weekly or daily company-wide backups. Data deduplication is a form of single-instance storage that eliminates redundant copies of data to reduce storage overhead. Data compression methods are used to store only one unique instance of data by replacing redundant data blocks with pointers to the unique data copy. As new data is written to a system, duplicate chunks are replaced with these pointer references to previously stored data. Though storage requirements are greatly reduced, processing overhead is increased by the compression processes of deduplication.
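The pointer-replacement idea described above can be illustrated with a minimal sketch. It assumes fixed-size chunks and SHA-256 fingerprints, neither of which is specified in this description; the names `deduplicate`, `restore`, `store`, and `recipe` are hypothetical and chosen only for illustration.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4):
    """Split data into fixed-size chunks and keep one copy of each unique chunk.

    Assumption: fixed-size chunking and SHA-256 fingerprints are used here
    only for illustration; real deduplication systems may differ.
    """
    store = {}   # fingerprint -> the single stored copy of that chunk
    recipe = []  # ordered fingerprints, acting as pointers to stored chunks
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint not in store:
            # First time this chunk is seen: store the actual bytes.
            store[fingerprint] = chunk
        # Duplicate or not, the write is recorded as a pointer reference.
        recipe.append(fingerprint)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data by following the pointers in order."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


if __name__ == "__main__":
    data = b"ABCDABCDXYZ!ABCD"
    store, recipe = deduplicate(data, chunk_size=4)
    print(len(recipe), "chunks written,", len(store), "chunks stored")
    assert restore(store, recipe) == data
```

In this sketch the storage saving comes from `store` holding fewer chunks than `recipe` references, while the hashing performed on every write is the added processing overhead.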
During heavy network traffic sessions, such as large-scale data migration on the cloud, the network throughput and IOPS (Input/Output Operations Per Second) can fluctuate significantly based on various factors at the cloud provider's end and in the intermediate network. For instance, presently known cloud providers can cause excessive retries during heavy usage. Even in private clouds, an excessive load on the cloud provider's load balancer can lead to slowdowns and eventually to errors that prompt the sender to reduce data traffic. A cloud provider may also rate-limit incoming requests in order to honor the SLAs (service level agreements) for all users. Given these conditions, having a static number of connections would mean either underutilizing the available bandwidth or having to perform excessive retries when communicating with the cloud provider. In present systems, network timeouts or errors from the cloud provider force a retry of all the requests using an exponential back-off interval. Since the number of connections is not reduced in this case, the cloud provider can continue to be overwhelmed by the high request rate. Under such conditions, retrying after backing off typically does not help, and subsequent retry attempts have a higher likelihood of failing.
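The limitation described above can be made concrete with a small simulation. This is a sketch under stated assumptions, not a model of any real provider: the overload rule (an attempt fails whenever the number of connections exceeds a fixed `provider_capacity`) and the names `backoff_schedule` and `simulate_static_retries` are hypothetical, and the base interval, factor, and cap are illustrative values.

```python
def backoff_schedule(base_s: float, factor: float, max_retries: int, cap_s: float):
    """Return the exponential back-off waits, in seconds, for each failed attempt."""
    return [min(cap_s, base_s * factor ** attempt) for attempt in range(max_retries)]


def simulate_static_retries(connections: int, provider_capacity: int,
                            max_retries: int, base_s: float = 1.0,
                            factor: float = 2.0, cap_s: float = 60.0):
    """Retry with exponential back-off while keeping the connection count fixed.

    Assumption (hypothetical provider model): an attempt fails whenever the
    sender's connection count exceeds provider_capacity.
    """
    log = []
    waits = backoff_schedule(base_s, factor, max_retries, cap_s)
    for attempt, wait_s in enumerate(waits):
        succeeded = connections <= provider_capacity
        # Record the wait only if the attempt failed and a retry will follow.
        log.append((attempt, connections, succeeded, None if succeeded else wait_s))
        if succeeded:
            break
        # The wait grows each time, but the connection count is never
        # reduced, so the offered load on the next attempt is unchanged.
    return log


if __name__ == "__main__":
    for entry in simulate_static_retries(connections=64, provider_capacity=32,
                                         max_retries=4):
        print(entry)
```

Under these assumptions every attempt with 64 connections fails regardless of how long the sender waits, because the back-off changes only the timing of the retries and not the load each retry places on the provider.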
What is needed, therefore, is a system and method that maximizes network throughput in cloud networks by dynamically tuning the number of connections used during data transfer with the cloud provider.
The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions. EMC, Data Domain, Data Domain Restorer, and Data Domain Boost are trademarks of Dell EMC Corporation.