Many schemes have been developed to protect data from loss or damage. One such scheme is hardware redundancy, such as redundant arrays of independent disks (RAID). Unfortunately, hardware redundancy schemes are ineffective in dealing with logical data loss or corruption. For example, an accidental file deletion or virus infection is automatically replicated to all of the redundant hardware components, so such technologies can neither prevent the loss nor recover from it.
To overcome this problem, backup technologies have been developed to retain multiple versions of a production system over time. This allows administrators to restore previous versions of data and to recover from data corruption.
One type of data protection system involves making point-in-time (PIT) copies of data. A first type of PIT copy is a hardware-based PIT copy, which is a mirror of a primary volume onto a secondary volume. The main drawbacks of the hardware-based PIT copy are that the data ages quickly and that each copy takes up as much disk space as the primary volume. A second type is a software-based PIT copy, or so-called “snapshot,” which is a “picture” of a volume at the block level or of a file system at the operating system level.
It is desirable to generate a snapshot when an application or a file system is in a consistent state because doing so alleviates the need to replay a log of write streams and allows applications to be restarted rapidly. In order to achieve this, prior art systems suspend the application that updates the source data and flush the source data to primary storage before generating a snapshot. However, this method is not efficient because the system has to remain suspended for some time in order to generate the snapshot. PIT systems also inefficiently require that the entire snapshot be restored in order to recover specific data. However, it is sometimes desirable to recover only a specific file, email data, or the like. This may require recovering a parsed version of a snapshot. For email data, the user may also have to manually set up an email application on top of the recovered snapshot in order to read the recovered email data.
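The conventional suspend-and-flush sequence described above can be illustrated with a minimal sketch. It assumes a toy in-memory volume; the class `ToyVolume` and its methods are hypothetical names introduced only for illustration and do not correspond to any particular prior art product. The sketch shows why writers are blocked for the whole duration of the flush and copy.

```python
import threading


class ToyVolume:
    """Hypothetical in-memory stand-in for primary storage plus an application write buffer."""

    def __init__(self):
        self.blocks = {}               # data already on "primary storage"
        self.pending = {}              # buffered application writes not yet flushed
        self._lock = threading.Lock()  # held while the application is "suspended"

    def write(self, block_id, data):
        # Application writes wait here while a snapshot holds the lock.
        with self._lock:
            self.pending[block_id] = data

    def _flush(self):
        # Move buffered writes to primary storage (caller must hold the lock).
        self.blocks.update(self.pending)
        self.pending.clear()

    def consistent_snapshot(self):
        # Suspend writers, flush buffered data, then copy the volume.
        # Writers stay blocked until all three steps have finished.
        with self._lock:
            self._flush()
            return dict(self.blocks)


vol = ToyVolume()
vol.write(0, b"header")
vol.write(1, b"mail-1")
snap = vol.consistent_snapshot()
vol.write(1, b"mail-2")  # a later write does not alter the snapshot
```

In this sketch the lock stands in for suspending the application: the longer the flush and copy take, the longer every call to `write` is stalled, which is the inefficiency noted above.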
Therefore, there is a need for a method and system for generating a snapshot in a consistent state without suspending an application or a system and for restoring email data from a snapshot in a consistent state.