Microsoft® Exchange is a messaging and collaboration software system that provides a variety of applications for group interaction using networked computer systems. Specifically, Microsoft Exchange (available from Microsoft Corporation of Redmond, Wash.) provides support for electronic mail (e-mail) over various networks. To that end, the Exchange software provides an e-mail server to support remotely connected e-mail clients such as Microsoft Outlook®. The Exchange software acts as a server for providing various functionalities to clients. An Exchange server can run on a variety of operating systems including, for example, the Microsoft Windows® 2000 operating system.
In a typical configuration, Microsoft Exchange stores data associated with e-mail services in databases, each of which is organized as two files. In the particular example of Microsoft Exchange 2000, these two files are an .edb file and a .stm file. In each Microsoft Exchange 2000 database, the .edb file is a properties file and the .stm file holds streaming data. The streaming data file contains raw content that has been received via, for example, the Internet, and is stored in its native format. Pointers are created by the Exchange server within the .edb file to reference the various messages or data stored within the .stm file. The default storage locations for these databases are on a disk locally connected to the computer on which the Exchange software is running.
FIG. 1 is a flow chart illustrating a path of an exemplary e-mail passing through an Exchange server. In step 105, the electronic mail is received via conventional e-mail processes. These processes can include the use of such protocols as the Simple Mail Transfer Protocol (SMTP). Next, in step 110, the message is stored in the memory of a database server. The storage of the message in memory is often transient in nature until the message is committed to some form of nonvolatile storage. The e-mail message is then written to a log file in step 115. The log file typically has a preallocated size, for example, 5 megabytes (MB). When the current log file reaches the preallocated size, the database server creates a new log file. Thus, an Exchange server may have a variable number of log files at any given point-in-time, depending on how many log files have been incorporated into the database files. Next, the log files are written to and incorporated into the database files, in step 120. The writing of the log file to the database occurs in a lazy write fashion. A “lazy write” is a writing process or procedure of the Exchange software that performs a write operation when central processing unit cycles are available. Thus, this lazy write typically proceeds during off-peak times when the server is not being heavily utilized.
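The log-then-incorporate flow of steps 115 and 120 can be illustrated with a small model. The following is a minimal sketch only, not the Exchange implementation: the class `MessageStore`, its methods, and the `server_is_idle` flag are hypothetical names introduced for illustration, and the "database" is simply an in-memory list. It shows a message being appended to the current log file, a new log file being created when the current one would exceed its preallocated size, and the oldest log being incorporated into the database only when the server is idle.

```python
from dataclasses import dataclass, field

LOG_FILE_CAPACITY = 5 * 1024 * 1024  # 5 MB, the example size given in the text


@dataclass
class MessageStore:
    """Toy model (hypothetical): fixed-size log files feeding a database lazily."""
    capacity: int = LOG_FILE_CAPACITY
    logs: list = field(default_factory=lambda: [[]])  # each log file is a list of messages
    database: list = field(default_factory=list)

    @staticmethod
    def _log_size(log) -> int:
        return sum(len(message) for message in log)

    def receive(self, message: bytes) -> None:
        """Step 115: append to the current log, starting a new log when it would overflow."""
        current = self.logs[-1]
        if current and self._log_size(current) + len(message) > self.capacity:
            current = []
            self.logs.append(current)
        current.append(message)

    def lazy_write(self, server_is_idle: bool) -> int:
        """Step 120: incorporate the oldest log into the database, but only when idle.

        Returns the number of messages incorporated by this call.
        """
        if not server_is_idle or not self.logs[0]:
            return 0
        oldest = self.logs.pop(0)
        self.database.extend(oldest)
        if not self.logs:          # always keep one (possibly empty) current log
            self.logs.append([])
        return len(oldest)
```

In this model the number of outstanding log files grows while the server is busy and shrinks as idle-time calls to `lazy_write` drain them, which mirrors the variable number of log files described above.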
FIG. 2 is a schematic block diagram of an exemplary Exchange server environment 200. An exemplary server 205, executing, e.g., the Microsoft Windows 2000 operating system and containing a local disk 210, is shown connected to a backup tape drive 220 and an external disk 215. The backup tape drive 220 is connected via either a small computer system interface (SCSI) connection or a switching network, such as a storage area network (SAN). Similarly, the external disk 215 may be connected via a SAN or other suitable networking architecture. The Exchange server 205 may be incorporated into a Microsoft Clustering System (MSCS) environment 225 that provides redundant data program access to clients. Additionally, the Exchange server 205 is operatively interconnected with a network 230. The network 230 may be a local area network (LAN), a wide area network (WAN), a virtual private network (VPN) or any other suitable networking scheme. Connected to the network 230 are a number of clients 235, each of which utilizes the services of the Exchange server 205 by passing Exchange commands and data to the server 205 over the network 230.
In a known example of an Exchange server, the Exchange software provides an application program interface (API) that is accessible by other programs executing on the server for performing backup and restore operations on the various databases. Other applications or processes executing on the server can access these APIs to perform various backup/restore operations. These APIs are targeted toward the use of a tape drive as a backup storage device. Such backup operations are normally performed while the Exchange server is operating. As tape drives typically have a slower read/write time than disk drives, the backup of databases with a tape device can consume a significant amount of time. Although the Exchange server is operational during a backup operation, performance is degraded during the course of the backup operation. Due to the extended degradation caused by the use of tape devices as backup storage media, backups are typically performed at night (or other off-peak time), when few users are utilizing the system. Similarly, a restore operation using a tape device consumes a substantial amount of time to restore the databases. When performing a backup or restore operation, the database files and any unincorporated logs need to be saved and/or restored. Thus, as the sizes of the various database files increase, the time required to perform a backup/restore operation to a tape device also increases.
In a further known example, the Exchange server is adapted to have the database and log files preferably written to a local disk. However, by utilizing other software products such as SnapDrive®, available from Network Appliance, Inc. of Sunnyvale, Calif., the log files and databases may be written to a virtual logical disk (VLD) stored on disks connected to a file server. In this example, the other software product replaces a block protocol data access driver executing on the Exchange server with one that is adapted to support VLD operations. The VLD and the modified driver, described further below, are also described in U.S. patent application Ser. No. 10/188,250, entitled SYSTEM AND METHOD FOR MAPPING BLOCK-BASED FILE OPERATIONS TO FILE LEVEL PROTOCOLS, by Dennis E. Chapman, the contents of which are hereby incorporated by reference.
A file server is a computer that provides file service relating to the organization of information on storage devices, such as disks. The file server or filer includes a storage operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on the disks. By “file system” it is meant generally a structuring of data and metadata on storage devices, such as disks, which permits reading/writing of data on those disks. A file system also includes mechanisms for performing these operations. Each “on-disk” file may be implemented as a set of disk blocks configured to store information, such as text, whereas the directory may be implemented as a specially-formatted file in which information about other files and directories is stored. A filer may be configured to operate according to a client/server model of information delivery to thereby allow many clients to access files stored on a server, e.g., the filer. In this model, the client may comprise an application, such as a file system protocol, executing on a computer that “connects” to the filer over a computer network, such as a point-to-point link, shared LAN, WAN, or VPN implemented over a public network such as the Internet. Each client may request the services of the filer by issuing file system protocol messages (in the form of packets) to the filer over the network.
A common type of file system is a “write in-place” file system, an example of which is the conventional Berkeley fast file system. In a write in-place file system, the locations of the data structures, such as inodes and data blocks, on disk are typically fixed. An inode is a data structure used to store information, such as metadata, about a file, whereas the data blocks are structures used to store the actual data for the file. The information contained in an inode may include, e.g., ownership of the file, access permission for the file, size of the file, file type and references to locations on disk of the data blocks for the file. The references to the locations of the file data are provided by pointers, which may further reference indirect blocks that, in turn, reference the data blocks, depending upon the quantity of data in the file. Changes to the inodes and data blocks are made “in-place” in accordance with the write in-place file system. If an update to a file extends the quantity of data for the file, an additional data block is allocated and the appropriate inode is updated to reference that data block.
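The relationship between an inode, its direct pointers, an indirect block and in-place updates can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the Berkeley fast file system: the class `InPlaceFileSystem`, the dictionary-based inode, the 4-byte block size and the limit of two direct pointers are all hypothetical choices made to keep the example small.

```python
BLOCK_SIZE = 4   # bytes per data block (tiny, for illustration only)
NUM_DIRECT = 2   # direct pointers kept in the inode itself


class InPlaceFileSystem:
    """Toy write in-place file system (hypothetical model)."""

    def __init__(self):
        self.blocks = {}      # disk location -> bytearray (data) or list (indirect block)
        self.next_free = 0

    def _allocate(self, content):
        location = self.next_free
        self.next_free += 1
        self.blocks[location] = content
        return location

    def create(self, data, owner="hypothetical-user"):
        """Build an inode holding metadata plus pointers to the data blocks."""
        inode = {"owner": owner, "size": 0, "direct": [], "indirect": None}
        for offset in range(0, len(data), BLOCK_SIZE):
            self.append_block(inode, data[offset:offset + BLOCK_SIZE])
        return inode

    def append_block(self, inode, chunk):
        """Extend the file: allocate a data block and update the inode to reference it."""
        location = self._allocate(bytearray(chunk))
        if len(inode["direct"]) < NUM_DIRECT:
            inode["direct"].append(location)
        else:
            if inode["indirect"] is None:
                inode["indirect"] = self._allocate([])
            self.blocks[inode["indirect"]].append(location)
        inode["size"] += len(chunk)

    def block_locations(self, inode):
        """Follow direct pointers, then the indirect block, to list data block locations."""
        locations = list(inode["direct"])
        if inode["indirect"] is not None:
            locations += self.blocks[inode["indirect"]]
        return locations

    def overwrite(self, inode, index, chunk):
        """Change data in place: the block keeps the same disk location."""
        location = self.block_locations(inode)[index]
        if len(chunk) > len(self.blocks[location]):
            raise ValueError("chunk does not fit in the existing block")
        self.blocks[location][:len(chunk)] = chunk

    def read(self, inode):
        return b"".join(bytes(self.blocks[loc]) for loc in self.block_locations(inode))
```

After an `overwrite`, `block_locations` returns exactly the same list as before, which is the defining property of a write in-place layout.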
Another type of file system is a write-anywhere file system that does not over-write data on disks. If a data block on disk is retrieved (read) from disk into memory and “dirtied” with new data, the data block is stored (written) to a new location on disk to thereby optimize write performance. A write-anywhere file system may initially assume an optimal layout such that the data is substantially contiguously arranged on disks. The optimal disk layout results in efficient access operations, particularly for sequential read operations, directed to the disks. A particular example of a write-anywhere file system that is configured to operate on a filer is the Write Anywhere File Layout (WAFL™) file system also available from Network Appliance, Inc. of Sunnyvale, Calif. The WAFL™ file system is implemented within a microkernel as part of the overall protocol stack of the filer and associated disk storage. This microkernel is supplied as part of Network Appliance's Data ONTAP™ storage operating system, residing on the filer, that processes file-service requests from network-attached clients.
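By contrast with a write in-place layout, a write-anywhere layout never overwrites a block where it sits. The sketch below is a simplified illustration and does not reflect the actual on-disk structures of the WAFL file system: the class `WriteAnywhereStore`, the append-only `disk` list and the `block_map` dictionary are hypothetical constructs chosen to show only that a dirtied block is written to a new location while the old contents stay on disk.

```python
class WriteAnywhereStore:
    """Toy write-anywhere store (hypothetical model)."""

    def __init__(self):
        self.disk = []        # append-only; the list index stands for a disk location
        self.block_map = {}   # logical block number -> current disk location

    def write(self, logical_block, data):
        """Store data at a fresh location and repoint the logical block to it."""
        self.disk.append(data)
        location = len(self.disk) - 1
        self.block_map[logical_block] = location
        return location

    def read(self, logical_block):
        return self.disk[self.block_map[logical_block]]


store = WriteAnywhereStore()
first_location = store.write(0, b"old contents")
second_location = store.write(0, b"new contents")   # same logical block, new location
```

Here `second_location` differs from `first_location`, and the data at `first_location` is left untouched.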
As used herein, the term “storage operating system” generally refers to the computer-executable code operable on a storage system that implements file system semantics and manages data access. In this sense, Data ONTAP™ software is an example of such a storage operating system implemented as a microkernel. The storage operating system can also be implemented as an application program operating over a general-purpose operating system, such as UNIX® or Windows NT®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.
Disk storage is typically implemented as one or more storage “volumes” that comprise physical storage disks, defining an overall logical arrangement of storage space. Currently available filer implementations can serve a large number of discrete volumes (150 or more, for example). Each volume is associated with its own file system and, for purposes hereof, volume and file system shall generally be used synonymously. The disks within a volume are typically organized as one or more groups of Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID implementations enhance the reliability/integrity of data storage through the redundant writing of data “stripes” across a given number of physical disks in the RAID group, and the appropriate caching of parity information with respect to the striped data. In the example of a WAFL-based file system, a RAID 4 implementation is advantageously employed. This implementation specifically entails the striping of data across a group of disks, and separate parity caching within a selected disk of the RAID group. As described herein, a volume typically comprises at least one data disk and one associated parity disk (or possibly data/parity partitions in a single disk) arranged according to a RAID 4, or equivalent high-reliability, implementation.
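The protection offered by a dedicated parity disk can be shown with a short worked example. This is a minimal sketch assuming byte-wise XOR parity over equal-sized blocks, which is the usual way RAID 4 parity is computed; the function names `xor_blocks`, `write_stripe` and `reconstruct` are hypothetical and do not come from any storage operating system.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_blocks):
    """Return the data blocks (one per data disk) and the block for the parity disk."""
    return list(data_blocks), xor_blocks(data_blocks)


def reconstruct(data_blocks, parity, failed_index):
    """Rebuild the block of a single failed data disk from the survivors and parity."""
    survivors = [block for i, block in enumerate(data_blocks) if i != failed_index]
    return xor_blocks(survivors + [parity])


data, parity = write_stripe([b"\x01\x02", b"\x04\x08", b"\x10\x20"])
rebuilt = reconstruct(data, parity, failed_index=1)
```

With these inputs the parity block is `b"\x15\x2a"` (for example, 0x01 XOR 0x04 XOR 0x10 = 0x15), and `rebuilt` equals the lost block `b"\x04\x08"`.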
As described in the above-incorporated United States patent application, a client of a file server may utilize a data access protocol driver that implements VLDs on a file server. The data access protocol driver supplements the traditional protocol layer stack of the client's operating system. Illustratively, the VLD stores data according to the file system semantics of the client. Thus, in the example noted above, the VLD stores data using the conventional NT File System (NTFS). Notably, the file embodying a VLD is sized to the storage size of the virtual logical disk, for example tens of gigabytes. Each VLD stored on the file server illustratively utilizes a set naming convention. For example, the file is named “XXXX.VLD” where “XXXX” is a unique identifier associated with the client which created the virtual logical disk. It is expressly contemplated that other naming conventions can be utilized with the present invention and as such the naming convention described herein is exemplary only.
Broadly stated, when the file system of a client issues a block access request to access data, the data access protocol driver executing on the client determines whether the request is directed to a physical disk or to a VLD. If the request is directed to a disk, then the data access protocol driver forwards the requested block access operation on to that disk. In these instances, the data access protocol driver functions similarly to a traditional block-based protocol driver, e.g., a SCSI driver. Otherwise, the data access protocol driver maps the block access request to a file access request and forwards that request to the file server using a file access protocol, such as the conventional Network File System (NFS). In response, the file server performs the requested operation on the file and returns the results to the client using the file access protocol. The data access protocol driver then maps the file access response to a block access response and returns that response to the file system.
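The dispatch and mapping just described can be sketched as follows. This is an illustrative model only, not the driver of the incorporated application: `PhysicalDisk`, `FileServerStub` and `DataAccessDriver` are hypothetical classes, the file server is simulated in memory rather than reached over NFS, and the mapping assumes the simplest possible scheme in which block number N of the VLD corresponds to byte offset N × 512 of the file that embodies it.

```python
import io

BLOCK_SIZE = 512  # assumed block size in bytes


class PhysicalDisk:
    """Locally attached disk addressed by block number."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

    def read_block(self, lba):
        return self.blocks[lba]

    def write_block(self, lba, data):
        self.blocks[lba] = data


class FileServerStub:
    """In-memory stand-in for a file server reached via a file access protocol."""

    def __init__(self):
        self.files = {}

    def create(self, name, size):
        self.files[name] = io.BytesIO(bytes(size))

    def read(self, name, offset, length):
        handle = self.files[name]
        handle.seek(offset)
        return handle.read(length)

    def write(self, name, offset, data):
        handle = self.files[name]
        handle.seek(offset)
        handle.write(data)


class DataAccessDriver:
    """Routes block requests to a physical disk or maps them onto a VLD file."""

    def __init__(self, server):
        self.server = server
        self.targets = {}   # device id -> PhysicalDisk, or the name of a VLD file

    def attach(self, device, target):
        self.targets[device] = target

    def read_block(self, device, lba):
        target = self.targets[device]
        if isinstance(target, PhysicalDisk):
            return target.read_block(lba)                 # forward unchanged
        # Map the block request to a file request: block number -> byte offset.
        return self.server.read(target, lba * BLOCK_SIZE, BLOCK_SIZE)

    def write_block(self, device, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("block writes must be exactly BLOCK_SIZE bytes")
        target = self.targets[device]
        if isinstance(target, PhysicalDisk):
            target.write_block(lba, data)
        else:
            self.server.write(target, lba * BLOCK_SIZE, data)
```

The client file system calls only `read_block` and `write_block` and cannot tell which kind of device served the request.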
A file server, as described above, may be interconnected by a network to an Exchange or other database server to provide file service operations. In the example of an Exchange database server, various database files may be stored on VLDs managed by a file server. As noted, the file server typically utilizes a tape device for backup/restore operations and a substantial amount of time is required to perform a backup operation to a tape device. Consequently, many system administrators do not frequently perform backup operations, thus preventing system performance degradation due to the ongoing backup operation. Yet, to restore a database to a particular point-in-time, the administrator typically requires a backup of the file system or database files generated at the desired point-in-time. As backups are typically written to tape devices with lengthy intervals between successive backups, the possible selection of discrete points-in-time to restore to is generally limited.
Another noted disadvantage of the prior art is that by taking a snapshot of a VLD, the contents of the VLD are not guaranteed to be consistent. The snapshotting process is described in further detail in U.S. patent application Ser. No. 09/932,578 entitled INSTANT SNAPSHOT by Lewis et al. By “snapshot” it is meant generally a rapid generation of an image of the data at a certain point-in-time. Snapshot is a trademark of Network Appliance, Inc. It is used for purposes of this patent to designate a persistent consistency point (CP) image. A persistent consistency point image (PCPI) is a point-in-time representation of the storage system, and more particularly, of the active file system, stored on a storage device (e.g., on disk) or another persistent memory and having a name or other identifier that distinguishes it from other PCPIs taken at other points-in-time. A PCPI can also include other information (metadata) about the active file system at the particular point-in-time for which the image is taken. The terms “PCPI” and “snapshot” shall be used interchangeably throughout this patent without derogation of Network Appliance's trademark rights. As an example of the consistency problem, various buffers in the file system, protocol driver, or application of the server that is writing data to the VLD may still contain data that has not been written to the VLD. This is due to the fact that the client file system, for example NTFS, is unaware of the snapshot capabilities of the underlying VLD. Thus, simply generating a snapshot of a VLD at a given point-in-time does not guarantee that all data currently associated with a database is captured by the snapshot.
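The consistency problem can be demonstrated with a toy model. The sketch below is an assumption-laden illustration and does not represent how any actual snapshot or PCPI mechanism works: the class `BufferedVLD` and its methods are hypothetical, a single dictionary stands in for all client-side buffers, and the snapshot is simply a copy of what has reached the VLD.

```python
class BufferedVLD:
    """Toy VLD with a client-side write buffer (hypothetical model)."""

    def __init__(self):
        self.on_disk = {}   # block number -> data that has reached the VLD
        self.buffer = {}    # block number -> data still held in client buffers

    def write(self, block, data):
        self.buffer[block] = data          # buffered, not yet on the VLD

    def flush(self):
        self.on_disk.update(self.buffer)   # push buffered data down to the VLD
        self.buffer.clear()

    def snapshot(self):
        return dict(self.on_disk)          # the image sees only what is on the VLD


vld = BufferedVLD()
vld.write(0, b"database page")
early_image = vld.snapshot()       # taken while the write is still buffered
vld.flush()
consistent_image = vld.snapshot()  # taken after the buffers have been flushed
```

In this model `early_image` is empty even though the application has already issued its write, whereas `consistent_image` contains the page.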