Mirroring is the process of maintaining identical copies of data on separate storage devices. FIG. 1 illustrates an exemplary data processing system 100 that employs mirroring. Data processing system 100 includes a node 102 coupled to clients 104-110 via a network 112. Node 102 is also coupled to a primary storage device 114 and a mirror 116 via a network 118. A mirror is a data storage device on which an identical copy of data from a primary storage device is maintained (as such, primary storage device 114 may also be referred to as mirrored storage device 114).
Primary storage device 114 and mirror 116 store identical copies of data. In other words, at least a portion, if not all, of the data stored on primary storage device 114 is identical to at least a portion, if not all, of the data stored on mirror 116. Primary storage device 114 and mirror 116 are separate data storage devices. Each of primary storage device 114 and mirror 116 may include one or more magnetic or optical disk arrays and/or one or more redundant arrays of independent disks (RAID) and RAID controllers.
Node 102 may be a computer, such as a server, that provides services to and/or manages resources of devices coupled to networks 112 and 118. Node 102 includes one or more processors 120 configured to execute instructions that reside on computer readable media (not shown). Node 102 also includes a data storage manager 122. Data storage manager 122 may be in the form of instructions residing on computer readable media which direct processor 120 to perform specific steps.
One example of data storage manager 122 is a volume manager. A volume manager operates to provide storage virtualization. For example, with storage virtualization, data storage manager 122 may present primary storage device 114 and mirror 116 to clients 104-110 as a virtual disk 124. From the viewpoint of clients 104-110, virtual disk 124 is equivalent to one physical data storage device (a virtual disk may also be referred to as a volume). In providing storage virtualization, data storage manager 122, rather than clients 104-110, handles the distribution of data across primary storage device 114 and mirror 116. Although data storage manager 122 may take the form of a volume manager, the functions of data storage manager 122 may be split between a volume manager and a file system (not shown) residing on node 102. The functions of data storage manager 122 may also be integrated into a file system residing on node 102.
Data storage manager 122 also enables mirroring within data processing system 100. For example, to create mirror 116, data storage manager 122 selects all or a portion of data on primary storage device 114 and copies the selected data to mirror 116, making sure to account for read and write requests from clients 104-110 during the copy process (e.g., by queuing the requests, processing the requests concurrently with the copy process, etc.).
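By way of illustration only, the mirror creation described above may be sketched as follows. The sketch assumes the queuing approach, in which client writes received during the copy are held and then applied to both devices once the copy completes. The function name `create_mirror`, the list-of-blocks representation, and the `queued_writes` parameter are hypothetical and are not drawn from any particular volume manager.

```python
def create_mirror(primary_blocks, queued_writes):
    """Create a mirror of primary_blocks (a list of byte strings).

    queued_writes is a hypothetical list of (block_index, data) pairs
    representing client writes that arrived while the copy was running.
    """
    # Step 1: copy the selected data (here, every block) to the new mirror.
    mirror_blocks = list(primary_blocks)

    # Step 2: apply the writes that were queued during the copy to BOTH
    # devices, so the two copies remain identical afterwards.
    for block_index, data in queued_writes:
        primary_blocks[block_index] = data
        mirror_blocks[block_index] = data

    return mirror_blocks


primary = [b"a", b"b", b"c"]
mirror = create_mirror(primary, queued_writes=[(1, b"z")])
```

An implementation that processes requests concurrently with the copy, the other approach mentioned above, would additionally need to track which regions have already been copied.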
In order to keep the copy of data stored on mirror 116 identical to the respective data of primary storage device 114, data storage manager 122 issues simultaneous write operations to primary storage device 114 and mirror 116 for each write request received from a client 104-110. For example, when data storage manager 122 receives a write request from a client 104-110 to write data to virtual disk 124, data storage manager 122 generates two write operations: one write operation to write the data to primary storage device 114, and a second write operation to write the data to mirror 116. Primary storage device 114 and mirror 116 receive their respective write operations and respond by writing the requested data. Because data storage manager 122 issues simultaneous writes in this manner, the data on mirror 116 is kept identical to the respective data on primary storage device 114.
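The write path described above may be sketched, under simplifying assumptions, as follows. The classes `BlockDevice` and `MirroredVolume` are hypothetical in-memory stand-ins for the storage devices and virtual disk 124, and the two write operations are issued one after the other rather than truly simultaneously.

```python
class BlockDevice:
    """Hypothetical in-memory stand-in for a data storage device."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks

    def write(self, index, data):
        self.blocks[index] = data

    def read(self, index):
        return self.blocks[index]


class MirroredVolume:
    """Presents a primary device and its mirror as one virtual disk."""

    def __init__(self, primary, mirror):
        self.primary = primary
        self.mirror = mirror

    def write(self, index, data):
        # One client write request becomes two write operations.
        self.primary.write(index, data)
        self.mirror.write(index, data)

    def read(self, index):
        # Reads are served from the primary in this sketch.
        return self.primary.read(index)


primary_dev = BlockDevice(4)
mirror_dev = BlockDevice(4)
volume = MirroredVolume(primary_dev, mirror_dev)
volume.write(2, b"payload")
```

The client interacts only with `volume`; the duplication of the write is handled entirely behind that interface.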
Data of mirror 116 may be synchronized with (i.e., maintained as identical to) data of primary storage device 114 either synchronously or asynchronously. In synchronous operation, any data modification to primary storage device 114 is immediately propagated to mirror 116. In asynchronous operation, all data modifications to primary storage device 114 are logged and periodically transmitted to mirror 116 for updating the data on mirror 116 at a later time. The data modifications may be logged in a log file stored on node 102 or, alternatively, may be handled by a logging device integrated with, or coupled to, node 102. Asynchronous operation is typically implemented when primary storage device 114 and mirror 116 are a considerable distance apart from each other. The data of primary storage device 114 and mirror 116 may be substantially identical at times, recognizing that there may be a delay between the time data is written to primary storage device 114 and when mirror 116 is updated with the data.
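Asynchronous operation may be illustrated by the following minimal sketch, which assumes an in-memory log in place of a log file or logging device. The class name `AsyncMirroredVolume` and the method `flush_log` are hypothetical; the periodic transmission to the mirror is modeled as an explicit call.

```python
from collections import deque


class AsyncMirroredVolume:
    """Logs each modification to the primary and applies it to the mirror later."""

    def __init__(self, num_blocks):
        self.primary = [b""] * num_blocks
        self.mirror = [b""] * num_blocks
        self.log = deque()  # pending (block_index, data) modifications

    def write(self, index, data):
        # The primary is updated immediately; the mirror is not.
        self.primary[index] = data
        self.log.append((index, data))

    def flush_log(self):
        """Transmit logged modifications to the mirror in their original order."""
        applied = 0
        while self.log:
            index, data = self.log.popleft()
            self.mirror[index] = data
            applied += 1
        return applied


vol = AsyncMirroredVolume(4)
vol.write(0, b"first")
vol.write(0, b"second")
in_sync_before_flush = vol.primary == vol.mirror
applied_count = vol.flush_log()
in_sync_after_flush = vol.primary == vol.mirror
```

Between a write and the next flush the two copies differ, which corresponds to the delay noted above. Replaying the log in order ensures that the later of two writes to the same block prevails on the mirror.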
Mirroring proves useful for employing redundant data storage devices within data processing system 100. With the use of redundant data storage devices, organizations such as financial institutions, data storage providers, insurance companies, etc., can minimize the downtime associated with a failure of one of the data storage devices on which their business data is stored. For example, should primary storage device 114 fail, a replacement primary data storage device may be hot-swapped in the failed device's place and data from mirror 116 may be copied over to the new primary data storage device. All the while, node 102 may satisfy read and write requests from clients 104-110 via data on mirror 116.
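The failure scenario described above may be sketched as follows. The `Device` class, its `failed` flag, and the functions `read_with_failover` and `rebuild_primary` are hypothetical names introduced for illustration; actual failure detection and hot-swap handling are considerably more involved.

```python
class Device:
    """Hypothetical in-memory storage device that can be marked as failed."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.failed = False

    def read(self, index):
        if self.failed:
            raise OSError("device has failed")
        return self.blocks[index]

    def write(self, index, data):
        if self.failed:
            raise OSError("device has failed")
        self.blocks[index] = data


def read_with_failover(primary, mirror, index):
    """Serve a read from the primary, falling back to the mirror on failure."""
    try:
        return primary.read(index)
    except OSError:
        return mirror.read(index)


def rebuild_primary(mirror, replacement):
    """Copy every block from the mirror onto a replacement primary device."""
    for index in range(len(mirror.blocks)):
        replacement.write(index, mirror.read(index))


old_primary = Device(2)
mirror = Device(2)
for device in (old_primary, mirror):
    device.write(0, b"ledger")

old_primary.failed = True  # simulate a failure of the primary
served = read_with_failover(old_primary, mirror, 0)

new_primary = Device(2)  # the hot-swapped replacement
rebuild_primary(mirror, new_primary)
```

While the replacement is being populated, client reads continue to be satisfied from the mirror, so the failure need not interrupt service.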
In addition to providing redundancy, mirroring also allows for load balancing of data across multiple data storage devices, off-line analysis of production data, off-line back-ups, and disaster recovery. Yet in spite of its many uses, traditional implementations of mirroring provide little, if any, benefit in the way of I/O performance.