An effective storage subsystem is a critical concern in the computer system industry. One of the most favored storage subsystems to achieve fault tolerance and enhance data availability may be redundant arrays of independent disks or redundant arrays of inexpensive disks (RAID) systems. Fault tolerance in a storage subsystem is generally achieved either by disk mirroring or by parity encoding, which various levels of RAID may provide.
It is well known in the art that “write journaling” has been utilized by RAID systems. Generally, “write journaling” refers to the concept of having a file system write a ‘diary’ of information to the disk in such a way as to allow the file system to be quickly restored to a consistent state after a power failure or other unanticipated hardware/software failure. A journaled RAID system can be brought back on-line quickly after a system reboot and, as such, is a vital element of building a reliable, available storage solution. Further, file volumes in a distributed system (e.g., a RAID system or the like) are often modified in different manners by various host computers. Ensuring synchronization is important to provide data integrity over the distributed file system. Thus, some RAID systems may log (store) write journal information in non-volatile random access memory (NVRAM) to ensure that the synchronization of a file volume is maintained and to restore the file volume after unexpected system interruptions or failures.
However, the conventional “journaling” method utilized by RAID systems has inherent problems. As described above, in order to ensure that write journal information is not lost due to unexpected system interruptions or failures, the write journal information is stored in some form of non-volatile memory (e.g., NVRAM or the like). However, storage on most non-volatile memories is limited, and thus it is undesirable to store large amounts of write journal information on non-volatile memories. Further, access to certain types of non-volatile memory may be slow. Thus, it is preferable to reduce the number of read/write operations to such non-volatile memories. In addition, the RAID system prevents the host from reading data that is not yet synchronized, since erroneous information may otherwise be returned to the host computer. Thus, when a RAID system with write journaling is initialized, write journal entries may need to be processed to restore the file volume's synchronization. In such a case, host read I/O commands may be blocked while the write journal is processed.
Conventionally, there are numerous approaches to resolving the above-mentioned problems. One example is an approach that stores the write I/O's virtual logical block address (LBA) and the number of sectors to write prior to each write operation. This data (write journal data) is then cleared upon completion of the write operation. This approach may not be desirable for a system where non-volatile memory (e.g., NVRAM or the like) is very limited. For example, write journal data may require 8 bytes of data for each outstanding write I/O for a 10-byte SCSI command. Thus, if 64 write operations are permitted to be outstanding, this would require 512 bytes of non-volatile random access memory (NVRAM). The more I/Os that are required to be outstanding, the more NVRAM is required. In the above example, each write I/O would require 12 bytes of write accesses to NVRAM: 8 bytes upon receipt of the host write operation and a 4-byte write to mark the write journal entry as complete. On a system where processing time is critical, 12 bytes of write accesses for each write I/O may incur an unacceptable time penalty.
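The arithmetic in the above example can be worked through with a minimal sketch. The 8-byte entry size, the 4-byte completion mark, and the limit of 64 outstanding writes are taken from the example; the function and constant names are hypothetical and chosen only for illustration, and the sketch does not model any particular controller's NVRAM layout.

```python
# Illustrative only: names are hypothetical; sizes come from the example above.
ENTRY_BYTES = 8          # journal data stored per outstanding write I/O
COMPLETE_MARK_BYTES = 4  # bytes written to mark a journal entry complete


def nvram_bytes_required(outstanding_writes: int) -> int:
    """NVRAM capacity needed to hold one journal entry per outstanding write."""
    return outstanding_writes * ENTRY_BYTES


def nvram_bytes_written_per_io() -> int:
    """Bytes written to NVRAM over the life of a single write I/O."""
    return ENTRY_BYTES + COMPLETE_MARK_BYTES


if __name__ == "__main__":
    print(nvram_bytes_required(64))      # 64 * 8 = 512
    print(nvram_bytes_written_per_io())  # 8 + 4 = 12
```

Both quantities grow linearly: the capacity with the number of outstanding writes permitted, and the total NVRAM write traffic with the number of write I/Os serviced.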
Moreover, current methods for restoring synchronization at initialization require every write journal entry to be processed before any host read I/O is allowed to be issued. If the number or size of write journal entries is large, this processing can take an unacceptably long time to complete. Other methods may use multiple structures to identify regions that need to be corrected by a write journal entry. These methods may also be unacceptable due to the amount of additional memory required.
Therefore, it would be desirable to provide a method and system that overcome the above-mentioned drawbacks of current write journaling methods.