1. Field of the Invention
Embodiments of the present invention generally relate to backup techniques and, more particularly, to a method and apparatus for routing a data stream through a plurality of data movers independent of a network interface type to optimize load balancing.
2. Description of the Related Art
In a typical computing environment, organizations of all sizes utilize various technologies, such as a data storage system, to store and protect mission critical data. The data storage system generally includes a plurality of data movers and an array of physical storage devices (e.g., ATA disks, Fibre Channel disks, a magnetic tape library and/or any other data storage device) that facilitate data backup and/or restoration. A data mover, in any type of data storage system, refers to a function (e.g., a process) that is able to push or pull (e.g., send or receive, respectively) data over a plurality of data paths between various computing environments (e.g., various platforms, protocols, systems and the like).
The data movers generally include data transfer systems, devices and/or software that utilize the capabilities of the data storage system (e.g., data backup, duplication and/or restoration processes) to quickly and reliably route the mission critical data from one location (e.g., a client computer, a database and the like) to another location (e.g., a tape library, disk drives and the like) through a network interface. For example, a data mover may read the mission critical data from one data storage device and then transfer the mission critical data to another data storage device.
The mission critical data may be lost and/or corrupted due to various system failures or a virus attack. As such, the mission critical data may be backed up on a regular basis (e.g., continuously) to the one or more storage devices (e.g., a tape drive, a hard disk drive and/or the like). In conventional backup techniques, the mission critical data is routed through a single network interface or data path. In other words, each data block of the mission critical data is transmitted over the same data path regardless of an input/output (I/O) load and/or another better performing data path. Consequently, the single data path is congested and becomes a bottleneck for routing the mission critical data from a computer to the one or more storage devices.
Some technologies leverage two data paths to communicate a data stream between the client and a single data mover for a backup process. Such technologies, however, operate at a network layer (e.g., the network layer of the Open Systems Interconnection (OSI) model or the Internet layer of TCP/IP). If the single data mover fails during transmission, the data stream is lost. Furthermore, the backup process also fails and must be restarted. Additionally, if either of the two data paths fails during transmission, the data stream is also lost if the backup process cannot be failed over to the other data path and/or cannot be retried. For example, the backup process may employ a data transmission protocol that does not permit retries after such a failure.
Unfortunately, error recovery solutions are limited to coarse-grain checkpoint restart mechanisms, which locate the point in time at which the backup process was interrupted and restart the backup process from that point in time. Moreover, such technologies cannot enable fine granularity for the error recovery solutions if the data stream is sent as a completely separate archive (.TAR) file. As a result, the conventional backup techniques are unable to provide a reliable and efficient backup of the data stream over multiple data paths and suffer from network bandwidth and throughput constraints.
Therefore, there is a need in the art for a method and apparatus for routing a data stream through a plurality of data movers over a plurality of data paths independent of a network interface type to optimize load balancing.