1. Field of the Invention
The present invention is related to online backup and restoration of a Microsoft Exchange server, and more particularly, to working with backed-up data without restoring the data to the data storage device of a computer system.
2. Description of the Related Art
Electronic mail has become an invaluable application for most enterprises over the past decade. However, initially it was difficult to implement reserve copying (i.e., backup) and restoration of email-related information. If the system failed, some data was not recoverable, thus causing significant damage to an enterprise. Therefore, it is very important for enterprises to implement effective means for reserve copying and recovery of data without incurring significant operational overhead.
An MS Exchange Server is a distributed database and an information exchange environment having a functionality of a mail client intended for internal use (i.e., information exchange among employees of an enterprise). Most of the critical information of the enterprise is stored in the database of the MS Exchange Server. In order to protect this information in case of system failure or user error, a regular backup of the MS Exchange mail data is needed.
Microsoft Clustering helps prevent system crashes by employing additional system resources, called nodes, that are used with a central cluster manager. The central cluster manager coordinates load balancing and data activity. Typically, the nodes use common storage devices and can take over the load of a failed node. Two types of cluster environments exist: an active/active environment and an active/passive environment. In the active/active environment, each node of the environment is active and able to process requests. If one of the active nodes fails, the other active nodes begin to process more requests (i.e., the load is redistributed among the remaining nodes).
In the active/passive cluster environment, only one active node exists. This active node processes all incoming requests. If the active node fails, the passive node is activated by the cluster manager and the incoming requests are processed by this node. In this case the system uses additional hardware and produces significant operational overhead.
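By way of illustration only, the two failover behaviors described above can be sketched as follows. This is a minimal model and not Microsoft Clustering itself: the names `Node`, `Cluster`, `dispatch` and `fail` are hypothetical, and round-robin dispatch is an assumed load-balancing policy.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    active: bool = True      # False models a passive (standby) node
    healthy: bool = True
    handled: list = field(default_factory=list)


class Cluster:
    """Hypothetical cluster manager; round-robin is an assumed policy."""

    def __init__(self, nodes):
        self.nodes = nodes
        self._next = 0

    def dispatch(self, request):
        # Only nodes that are both active and healthy may process requests.
        candidates = [n for n in self.nodes if n.active and n.healthy]
        if not candidates:
            raise RuntimeError("no active node available")
        node = candidates[self._next % len(candidates)]
        self._next += 1
        node.handled.append(request)
        return node.name

    def fail(self, name):
        for n in self.nodes:
            if n.name == name:
                n.healthy = False
        # Active/passive case: if no active node survives, promote a standby.
        if not any(n.active and n.healthy for n in self.nodes):
            for n in self.nodes:
                if n.healthy and not n.active:
                    n.active = True
                    break


# Active/active: the surviving node absorbs the load of the failed node.
aa = Cluster([Node("A"), Node("B")])
aa_before = [aa.dispatch("r1"), aa.dispatch("r2")]
aa.fail("A")
aa_after = aa.dispatch("r3")

# Active/passive: the standby node is activated only after the failure.
ap = Cluster([Node("A"), Node("B", active=False)])
ap_before = ap.dispatch("r1")
ap.fail("A")
ap_after = ap.dispatch("r2")
```

In the active/passive case the standby node handles no requests until the failure occurs, which reflects the additional hardware cost noted above.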
A Microsoft Exchange Server supports both an active/active cluster environment and an active/passive cluster environment. Clusterization of the MS Exchange Server provides very high stability that is generally not affected by failure of individual nodes. However, it does not provide any protection against failure or corruption of data storage. Typically, for a particular size of a cluster environment, a plurality of hard disks is used to create storage of a desired size.
FIG. 1 illustrates a flow chart of a standard backup process of an MS Exchange Server. The process of online reserve copying (i.e., run-time backup) of the MS Exchange Server is started by launching a reserve copying application. This application calls a Web Storage System Service (WSS), indicating the desired type of reserve copying. Then, the backup process is started at step 102 by marking a synchronization point of the backup.
At this point the WSS informs the Extensible Storage Engine (ESE) that it is in a reserve copying mode, and an empty correction file (“.PAT”) is generated in step 104 for each database being backed up (in case of a full backup). Note that during the process of full online backup, the database is open for access and transactions can still be recorded in the database. If a transaction modifies a part of the database file (“.edb”) that has already been copied, i.e., a page located before the point of the reserve copying (a flag in the “.edb” file indicating what has already been copied and what has not yet been copied), the page before the border is written into the correction file “.PAT”.
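The role of the correction file can be illustrated with a simplified sketch. It assumes a database modeled as a list of pages and a backup marker that advances one page at a time; the names `online_full_backup` and `apply_patch` are hypothetical, and the actual ESE logic for deciding which pages enter the “.PAT” file is more involved than shown here.

```python
def online_full_backup(db, writes_during_backup):
    """Copy pages in order while the database stays open for writes.

    db: list of page contents (bytes), modified in place by the writes.
    writes_during_backup: dict mapping a marker position to a
        (page_no, data) write that arrives just before the page at that
        position is copied (an assumed, simplified timing model).
    Returns (backup, patch), where patch stands in for the ".PAT" file.
    """
    backup = []
    patch = {}
    for marker in range(len(db)):
        if marker in writes_during_backup:
            page_no, data = writes_during_backup[marker]
            db[page_no] = data
            # The page lies before the marker, so the backup already holds
            # a stale copy of it; record the new content in the patch.
            if page_no < marker:
                patch[page_no] = data
        backup.append(db[marker])
    return backup, patch


def apply_patch(backup, patch):
    """Overlay the corrected pages on the backed-up pages at recovery."""
    restored = list(backup)
    for page_no, data in patch.items():
        restored[page_no] = data
    return restored


db = [b"a", b"b", b"c", b"d"]
# Page 0 is rewritten after it was copied; page 3 is rewritten before
# the marker reaches it, so it needs no correction entry.
writes = {2: (0, b"A"), 3: (3, b"D")}
backup, patch = online_full_backup(db, writes)
restored = apply_patch(backup, patch)
```

Only the write that lands behind the marker produces a correction entry; the write ahead of the marker is picked up by the ordinary sequential copy.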
A separate “.PAT” file is used for each backed up database. These files are used only for a full backup and recovery. The correction files are not generated for incremental or additional backups. When the ESE is placed into a reserve copying mode, a new log file is opened. For example, if a log file “edb.log” is currently open, it is closed and renamed at step 106 in order to correspond to the latest set of transactions. Then, a new file “edb.log” is created at step 110. From this point the ESE can truncate the logs after completion of reserve copying.
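The log rollover of steps 106 and 110 can be sketched as follows. The function `roll_log`, the `generation` parameter and the archived file name pattern are illustrative assumptions; the text above states only that “edb.log” is closed, renamed, and replaced by a new empty “edb.log”.

```python
import tempfile
from pathlib import Path


def roll_log(log_dir: Path, generation: int) -> Path:
    """Rename the current edb.log and start a fresh, empty edb.log.

    The archived name (edb + 5 hex digits) is an assumed naming scheme.
    """
    current = log_dir / "edb.log"
    archived = log_dir / f"edb{generation:05x}.log"
    current.rename(archived)   # step 106: close out the latest transactions
    current.touch()            # step 110: create a new, empty edb.log
    return archived


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp)
    (log_dir / "edb.log").write_text("txn-1\ntxn-2\n")
    archived = roll_log(log_dir, generation=1)
    archived_name = archived.name
    archived_text = archived.read_text()
    new_log_size = (log_dir / "edb.log").stat().st_size
```

After the rollover, the renamed file holds every transaction recorded before the backup began, so it can later be copied to reserve storage and truncated.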
The reserve copying is performed at step 112, and the ESE is requested to read the database and re-order pages. After the pages are ordered at step 114 by creating a correction file (.PAT) for each database and writing the database header into the correction file, they are grouped into fragments of 64 KB each (i.e., 16 pages) and loaded into the operating memory. The operations corresponding to each copied database (“.edb” file) are also written into the “.PAT” file at step 116. Then, the ESE checks a checksum for each page in order to confirm the integrity of data at step 118.
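The grouping of pages into 64 KB fragments can be sketched as follows, assuming 4 KB pages (64 KB divided by 16 pages). The helper `read_fragments` is a hypothetical name, and an in-memory stream stands in for the “.edb” file.

```python
import io

PAGE_SIZE = 4 * 1024                 # 4 KB per page (64 KB / 16 pages)
PAGES_PER_FRAGMENT = 16
FRAGMENT_SIZE = PAGE_SIZE * PAGES_PER_FRAGMENT   # 64 KB


def read_fragments(stream):
    """Yield lists of pages, reading up to one 64 KB fragment at a time."""
    while True:
        chunk = stream.read(FRAGMENT_SIZE)
        if not chunk:
            break
        # Split the fragment back into its individual pages.
        yield [chunk[i:i + PAGE_SIZE] for i in range(0, len(chunk), PAGE_SIZE)]


# A 40-page database yields two full fragments and one partial fragment.
database = io.BytesIO(bytes(PAGE_SIZE * 40))
fragment_sizes = [len(pages) for pages in read_fragments(database)]
```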
A checksum can be in the form of a hash string consisting of 4-byte segments. Such a hash string is generated and added to each page of the database for controlling the integrity of the page. The original checksum of the page is compared to a checksum of the page read into RAM. Thus, it is verified that the data read out of the database and the data written into the database are identical.
A typical database page structure has the first 82 bytes allocated for page headers and flags indicating page type, as well as information about the data types contained in the page. When the page is loaded into memory, the checksum is calculated and the page number is verified. If the checksum does not match the original checksum, then the page is corrupted. The ESE, in this case, will return an error, the database will be suspended and information about the corrupted page will be recorded in the transaction log.
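The per-page verification can be illustrated with the following sketch. It is not the ESE checksum algorithm: CRC-32 from the standard library is used as a stand-in, and the page layout (a 4-byte checksum followed by a 4-byte page number) is a hypothetical simplification of the page header described above.

```python
import struct
import zlib

PAGE_SIZE = 4096


def build_page(page_no: int, payload: bytes) -> bytes:
    """Build a page: 4-byte checksum, 4-byte page number, padded payload."""
    if len(payload) > PAGE_SIZE - 8:
        raise ValueError("payload does not fit in one page")
    body = struct.pack("<I", page_no) + payload.ljust(PAGE_SIZE - 8, b"\0")
    # The checksum covers everything after the checksum field itself.
    return struct.pack("<I", zlib.crc32(body)) + body


def verify_page(page: bytes, expected_page_no: int) -> bool:
    """Recompute the checksum and confirm the stored page number."""
    stored_checksum, page_no = struct.unpack_from("<II", page)
    if zlib.crc32(page[4:]) != stored_checksum:
        return False   # contents changed after the checksum was written
    return page_no == expected_page_no


page = build_page(7, b"mailbox data")
corrupted = bytearray(page)
corrupted[100] ^= 0xFF   # flip bits in one byte to simulate disk corruption

ok_result = verify_page(page, 7)
wrong_number_result = verify_page(page, 8)
corrupted_result = verify_page(bytes(corrupted), 7)
```

Checking the page number in addition to the checksum catches an intact page that was written to, or read from, the wrong location.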
Note that the ESE does not cause page corruption, but only reports corrupted pages. The page corruption takes place when data is written onto a disk. In most cases corruption of database pages is caused by equipment or device driver failure. It is important to make sure that the drivers are updated and the latest versions are used. The Microsoft Product Support Services (PSS) can work with equipment manufacturers to resolve problems existing between their equipment and the Exchange Server database.
Comparing the checksums prevents corrupted data from being stored. Thus, a successfully created reserve copy of the MS Exchange Server database is definitely not corrupted, since each individual page is checked prior to copying it into a reserve database.
After the reserve copying is completed and all pages are read by a reserve copy utility, the utility copies the logs and correction files onto reserve storage at step 120. Then, the logs are truncated or deleted at step 122, since a new generation of files was created at the beginning of reserve copying. Also, the old correction files are erased from the disk at step 124. Then, the reserve files are closed, the ESE goes into a normal mode of operation and the reserve copying is completed at step 126. Note that in case of incremental or additional backups, only log files are used. No operations using correction files, checksums or sequential reading of pages are performed.
In a network or distributed system environment employing a large number of devices, a malfunction is highly likely. If a hard disk gets corrupted, the applications do not execute correctly. However, all the nodes in the cluster can be using this hard disk as a common storage containing all files (including the files of the MS Exchange Server database). Thus, an enterprise remains vulnerable to MS Exchange Server crashes.
In light of the increasing dependency of enterprises on MS Exchange Servers, users need some means for overcoming an MS Exchange Server crash or malfunction in case of both remote and local servers. Such means can protect an enterprise from the consequences of a massive MS Exchange Server crash.
A process of restoring an MS Exchange Server database takes a long time. During this time the MS Exchange Server is suspended until the database is fully restored and the logs are applied to the database. Therefore, a method for rapid backup of the MS Exchange Server and data restoration without the MS Exchange Server being suspended is desired. It is also desired to be able to launch the MS Exchange Server if the database is not restored or is restored only partially.