The invention relates to data storage systems and methods. More particularly, the invention relates to a system and method of managing data within a computer accessible storage array.
The use of an array of disks for computer-based data storage is known. One category of disk arrays is referred to as Redundant Array of Inexpensive Disks (RAID). Within a RAID system, varying levels of data storage redundancy are utilized to enable reconstruction of stored data in the event of data corruption or disk failure. These various types of redundant storage strategies are referred to as RAID levels.
For example, RAID level 1, also referred to as "the mirror method", defines data which is stored with complete redundancy, typically permitting independent, simultaneous access to all copies of the data set. RAID level 6, also referred to as "the parity method", defines data storage which utilizes bit-parity information, generated by way of an Exclusive-OR operation, to create a parity data set. The parity data set may be used to reconstruct original data that has become corrupt by way of another Exclusive-OR operation. In comparison, RAID level 1 provides relatively fast, simultaneous access to multiple copies of the same data set, while RAID level 6 provides greater storage media efficiency than RAID level 1. Accordingly, RAID level 1 is considered a "high performance" or "high" RAID level as compared to level 6.
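The parity reconstruction described above can be illustrated with a minimal sketch. It assumes a single XOR parity block computed over equal-length data blocks, as the text describes; the function name `xor_blocks` and the sample byte values are hypothetical and chosen only for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise Exclusive-OR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks striped across three disks.
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]

# The parity data set is the XOR of all data blocks.
parity = xor_blocks(data)

# Suppose the second block becomes corrupt: XOR the surviving blocks
# with the parity block to reconstruct it.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

The same Exclusive-OR routine serves both to generate the parity and to rebuild a lost block, which is why parity storage needs less media than a full mirror.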
Due to the higher performance of RAID level 1 storage, it is desirable to keep the most frequently accessed data within RAID level 1, to the extent that physical storage resources permit. Toward this goal, some RAID management systems employ techniques in which more frequently accessed data is moved to RAID level 1 storage, while less frequently accessed data is shifted to RAID level 6 (or other RAID levels). RAID data storage is discussed in detail in U.S. Pat. Nos. 5,392,244 and 5,664,187, which are incorporated herein by reference.
The vast amounts of data that are stored in many computer systems, as well as the ever-growing demand to access that data, have pushed system developers to seek ways of providing fast, multiple-user access to the data. One technique utilizes a system of pointers (i.e., mapping) to point at the data within the storage media. Software applications make use of the pointers to access relevant data within the storage system. A system of pointers may be used in combination with various RAID levels of storage.
The RAID data storage can keep track of the most recently written data blocks. This set of most recently written blocks is called the "Write Working Set". The pointers to these data blocks are kept in memory, while the data blocks themselves are kept in the physical storage. The Write Working Set thus maps the most recently written blocks to the physical storage. The data blocks that comprise this set can be in both RAID levels; that is, some of the blocks that are part of the Write Working Set can be in RAID level 6 and the others can be in RAID level 1. Of course, it could be the case that all the data blocks are in RAID level 6 or that all are in RAID level 1. Whatever the case, the purpose of keeping the Write Working Set is to maintain a list of pointers to the most recently written data blocks, so as to facilitate future migrations of those data blocks to the RAID level 1 storage. This gives the user better response time when accessing these data blocks again. The Write Working Set is already in use and is not the innovation described in this document.
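A Write Working Set of the kind described above might be sketched as follows. This is an illustrative sketch only: the class name `WriteWorkingSet`, the fixed `capacity`, and the oldest-first eviction policy are assumptions that the text does not specify.

```python
from collections import OrderedDict

class WriteWorkingSet:
    """In-memory pointers to the most recently written data blocks."""

    def __init__(self, capacity):
        # capacity is an assumed bound; the text does not state one.
        self.capacity = capacity
        # block_id -> (raid_level, physical_address), oldest write first
        self._entries = OrderedDict()

    def record_write(self, block_id, raid_level, physical_address):
        # Re-insert so the block becomes the most recently written entry.
        self._entries.pop(block_id, None)
        self._entries[block_id] = (raid_level, physical_address)
        # Drop the least recently written pointers beyond the bound.
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)

    def __contains__(self, block_id):
        return block_id in self._entries

    def migration_candidates(self):
        # Recently written blocks not yet held in RAID level 1.
        return [b for b, (level, _) in self._entries.items() if level != 1]

wws = WriteWorkingSet(capacity=2)
wws.record_write(10, 6, 0x100)
wws.record_write(11, 1, 0x200)
wws.record_write(12, 6, 0x300)  # block 10 falls out of the set
```

Only the pointers live in memory; the blocks themselves remain in physical storage at whichever RAID level they currently occupy.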
One of the features of the RAID data storage is the ability to preserve data as it was at some point in time. This feature is called a "snapshot", and that term is used to denote the feature throughout this document. The data to be preserved is said to be "snapped". After the data is snapped, it can be considered protected against future changes. If the data is updated (written to) in the future, the snapped data is preserved and a new copy with the updated data is also stored in the system. In this way, the user can have both the snapped (i.e., original) data and the updated data in the RAID data storage. The snapping of the data is independent of the RAID level the data is in: when data is snapped, it is preserved regardless of its RAID level. The system of pointers to the snapped data blocks is known as the "Snapshot Maps". Like the Write Working Set described above, the snapshot is not new and is not the idea described in this patent disclosure.
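The relationship between the snapshot maps and the live data can be illustrated with a small sketch. The dictionaries `physical`, `live_map`, and `snapshot_map` are hypothetical stand-ins for the physical storage and the pointer tables; the text does not prescribe any particular data structure.

```python
# Hypothetical physical storage: physical address -> block contents.
physical = {0: b"alpha", 1: b"beta"}

# Live pointers: logical block -> physical address.
live_map = {"A": 0, "B": 1}

# Taking a snapshot copies the pointers, not the data blocks.
snapshot_map = dict(live_map)

# A later update to block "A" is written to a new location, and only
# the live pointer is changed; the snapshot pointer is left alone.
physical[2] = b"ALPHA"
live_map["A"] = 2
```

After the update, `snapshot_map` still resolves block "A" to the original contents while `live_map` resolves it to the updated contents, and the untouched block "B" is shared by both maps.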
However, when system user operations require that the snapped data be updated (written to), it is necessary to follow a sequence of steps to ensure that the snapped data is preserved and the updated data is written. The procedure used to preserve the original data and store the updated data separately in the physical storage is called a "divergence". Both the snapped and the updated data are kept in the RAID data storage. The steps performed in a divergence are: 1) The new data to be written is stored in memory. 2) The snapped data is read, usually as a data block larger than the new data. 3) The data block with the snapped data is merged with the new data in memory, and the combined data is then considered the updated data. 4) A new pointer to the updated data is created in the general tables that hold the maps for all the RAID level 1 and RAID level 6 data. The pointer to the snapped (i.e., original) data is kept in the snapshot maps. 5) The updated data is written to the physical storage. At the end of these five steps, both the snapped data and the new, updated data are in the physical RAID data storage. This procedure (the divergence) is performed in the foreground when the user writes to the data, and results in undesirable delays from the perspective of the system users.
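The five divergence steps listed above can be sketched as a single function. This is a minimal illustration under stated assumptions: the function name `diverge`, its parameters, and the dictionary-based storage and maps are all hypothetical, and a real controller would operate on disk blocks and allocate the new address itself.

```python
def diverge(physical, live_map, snapshot_map, block_id, offset, new_bytes, new_address):
    """Preserve the snapped block and store an updated copy separately."""
    # Step 1: the new data is already held in memory as new_bytes.
    # Step 2: read the snapped block, which is usually larger than the new data.
    snapped = physical[snapshot_map[block_id]]
    if offset < 0 or offset + len(new_bytes) > len(snapped):
        raise ValueError("new data does not fit inside the snapped block")
    # Step 3: merge the new data into a copy of the snapped block.
    merged = snapped[:offset] + new_bytes + snapped[offset + len(new_bytes):]
    # Step 4: point the general (live) map at the updated data; the
    # snapshot map keeps its pointer to the original block.
    live_map[block_id] = new_address
    # Step 5: write the updated block to physical storage.
    physical[new_address] = merged
    return merged

physical = {0: b"abcdefgh"}
live_map = {"A": 0}
snapshot_map = {"A": 0}
updated = diverge(physical, live_map, snapshot_map, "A", 2, b"XY", 1)
```

The read in step 2 and the write in step 5 are the costly parts, which is why performing the whole sequence in the foreground delays the user's write.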
Therefore, it is desired to provide a data management system in which the update of snapped data is performed with a reduced impact on user access to the data stored within the system.
The invention provides an improved system and method for managing the divergences of snapped data within a RAID storage system, in coordination with the data access requests of system users. The preferred embodiment of the invention performs, in the background, the divergence of those data blocks that are both snapped and in a write working set (i.e., a map to those data spaces most recently written to). The preferred embodiment of the invention therefore predicts the divergences of the snapped data blocks by using the write working set.
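The selection of blocks for background divergence can be sketched as the intersection of the snapped blocks and the write working set. The function name `blocks_to_prediverge` and the test for "not yet diverged" (the live pointer still equals the snapshot pointer) are assumptions made for illustration; the text states only that the blocks are snapped and in the write working set.

```python
def blocks_to_prediverge(snapshot_map, live_map, write_working_set):
    """Return snapped, recently written blocks that have not yet diverged."""
    return sorted(
        block_id
        for block_id in write_working_set
        # The block must be snapped...
        if block_id in snapshot_map
        # ...and still share its physical location with the snapshot
        # (assumed here to mean that no divergence has happened yet).
        and live_map.get(block_id) == snapshot_map[block_id]
    )

snapshot_map = {"A": 0, "B": 1, "C": 2}
live_map = {"A": 0, "B": 5, "C": 2, "D": 3}
write_working_set = {"A", "B", "D"}

candidates = blocks_to_prediverge(snapshot_map, live_map, write_working_set)
```

In this example only block "A" qualifies: "B" has already diverged, "C" was not recently written, and "D" is not snapped. A background task could then diverge each candidate before the user's next write arrives.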
One embodiment of the invention provides a system for managing a data storage array, comprising a plurality of data storage disks configured as an array, and a RAID controller coupled to each of the data storage disks within the array and configured to access data within the array and to measure the rate at which data within the array is being accessed by a user application, the RAID controller further including a memory and a user interface coupled to the RAID controller, the RAID controller being further configured to store a set of pointers within the memory, the pointers respectively corresponding to blocks of data within the array, each pointer being accessible by the user application, the RAID controller being further configured to anticipate a data modification operation to a particular block of data in the array responsive to the measuring and to selectively copy the particular block of data to a different block of data within a different RAID level storage location within the array responsive to the anticipating.
Another embodiment of the invention provides a method of managing a data storage system, comprising providing an array of data disks configured to store data, providing a RAID controller coupled to the array, providing a user interface computer coupled to the RAID controller, running an application program using the user interface computer, reading data stored within the array using the application program and the RAID controller, assembling and storing a pointer corresponding to the read data using the RAID controller, accessing the data within the array by way of the pointer using the application program and RAID controller, measuring the rate of the accessing using the RAID controller, anticipating a data modification operation to particular data in one RAID level within the array in response to the measuring using the RAID controller, selectively copying the particular data to another RAID level within the array in response to the anticipating using the RAID controller, and performing the data modification operation to the copied data within the array using the RAID controller, the copying being performed as a background operation using the RAID controller.
Another embodiment of the invention provides a RAID controller comprising a memory configured to store data, monitoring electronics configured to measure a rate at which an array of data storage disks coupled to the controller are accessed using the controller, and circuitry coupled to the memory and the monitoring electronics and configured to access data at different RAID levels within the array in response to corresponding instructions from a computer, the circuitry being further configured to selectively configure a pointer related to the data in the array in response to a corresponding instruction from the computer, the circuitry being further configured to anticipate a data modification operation in response to the measuring, the circuitry being further configured to selectively copy a portion of the data in one RAID level within the array to a different RAID level within the array as a background operation in response to the anticipating using the memory.
Another embodiment of the invention provides a computer readable medium comprising computer readable code, the computer readable code configured to cause a RAID controller to read data within an array of data storage disks coupled to the RAID controller in response to corresponding instructions from a computer, selectively configure a pointer related to the read data, access data within the array using the pointer in response to corresponding instructions from the computer, anticipate a data modification operation in response to measuring a rate at which the data within the array is accessed, and selectively copy a portion of the data in one RAID level within the array to a different RAID level within the array as a background operation in response to the anticipating.
Still another embodiment of the invention provides a RAID controller, comprising a memory configured to store data, monitoring electronics configured to measure a rate at which an array of data storage disks coupled to the RAID controller are accessed using the RAID controller, and firmware bearing computer readable code, the computer readable code being configured to cause the RAID controller to read data within the array in response to corresponding instructions from a computer, selectively configure a pointer related to the read data, access data within the array using the pointer in response to a corresponding instruction from the computer, anticipate a data modification operation in response to measuring the rate using the monitoring electronics, and selectively copy a portion of the data in one RAID level within the array to a different RAID level within the array as a background operation in response to the anticipating using the memory.
Another embodiment of the invention provides a computer readable medium comprising computer readable code, the computer readable code configured to cause a computer to read data within different RAID levels within an array of data storage disks coupled to the computer in response to corresponding instructions sent from the computer to a RAID controller, selectively configure a plurality of pointers related to the read data, the pointers being stored within a memory of the computer, access data at different RAID levels within the array using the pointers in response to corresponding instructions sent from the computer to the RAID controller, measure a rate at which the computer is accessing the data within the array, anticipate a data modification operation in response to the measuring, and selectively instruct the RAID controller to copy a portion of the data in one RAID level within the array to a different RAID level within the array as a background operation in response to the anticipating such that the measuring, anticipating, and copying steps occur from time to time.
Another embodiment of the invention provides a method of managing a data storage system, comprising providing an array of data disks configured to store data, providing a RAID controller coupled to the array, providing a computer coupled to the RAID controller, providing a computer readable medium having computer readable code in cooperative relation to the computer, running the computer readable code using the computer, reading data stored within the array using the computer readable code and the computer and the RAID controller, assembling and storing a pointer corresponding to the read data within a memory of the computer using the computer readable code and the computer, accessing the data within the array by way of the pointer using the computer readable code and the computer and the RAID controller, measuring the rate of the accessing the data using the computer readable code and the computer, anticipating a data modification operation to particular data in one RAID level within the array in response to the measuring using the computer readable code and the computer, selectively copying the particular data to another RAID level within the array in response to the anticipating using the computer readable code and the computer and the RAID controller, and performing the data modification operation to the copied data within the array using the computer readable code and the computer and the RAID controller, the copying and the data modification being performed as a background operation.