This invention relates to the recovery of data from corrupt files. More particularly, this invention relates to a method and system for repairing corrupt files and recovering data, while the files are loaded into a spreadsheet application program.
Files that users may attempt to load into spreadsheet application programs (SAP) can become corrupt for several reasons. These reasons include bugs in the SAP, bugs in other applications that can be used to edit files generated with the SAP, network connectivity problems, viruses, and anti-virus software. The corruption of a file may range from minor to severe data corruption. Furthermore, the file corruption may not be noticeable to the user, but the file corruption may cause specific features within the SAP to work improperly. The corruption may also cause loss of data or make it impossible for the user to open a SAP workbook.
In the past, users and product support personnel have used a number of methods for repairing files and recovering data. However, these methods have many limitations. For example, a user can use a hex editor to open the corrupt file and look for common problems. This method can be used to find problems like a missing end-of-file marker, but it cannot be used to recover data from a file with a corrupt OLE storage structure.
Furthermore, this method requires additional software (the hex editor), intimate knowledge of the SAP binary file format and knowledge of the common types of file corruption. Because of these technical hurdles, it is unlikely that users will successfully use this method.
Another method that is used to repair files and recover data involves saving the file in a different format and then re-opening the file in the SAP. The drawbacks to this method are that it is only useful if the workbook can be opened, and that it causes the loss of any data that is unsupported by the different format.
Yet another method that is used to repair files and recover data from corrupt files involves copying the content of the file to a new workbook. The drawbacks to this method are that it is only useful if the workbook can be opened and that it is a tedious and time-consuming process to copy all the information from the corrupt workbook into a new workbook.
Still another method that is used to repair files and recover data from corrupt files involves importing the file into another application. Again, one of the drawbacks to this method is that it is only useful if the workbook can be opened. Another drawback to this method is that formulas, formatting, and other features within the file are normally lost.
In yet another method for repairing files and recovering corrupt data, the user utilizes a third-party utility that generally extracts only data and does not save formatting, embedded objects, code, etc. These third-party utilities will recover some data, but they do not repair files or recover an extensive amount of data.
Thus, there is a need in the art for a spreadsheet application program that can repair corrupt files and is sufficiently robust to recover an extensive amount of data, including formulas, formatting, autofilters, charts, Visual Basic program modules, embedded objects, PivotTable reports, Query tables, and data validation.
There is a further need for a spreadsheet application program that can extract extensive amounts of data from corrupt files that cannot be opened.
There is yet a further need for a spreadsheet application program that incorporates file repair and data extraction, thereby eliminating the need for third party utilities.
The present invention satisfies the needs described above by providing a method and system for repairing and recovering data from corrupt files that are being loaded into a spreadsheet application program (SAP), such as Microsoft Excel. More specifically, the present invention is a SAP that recovers and repairs corrupt files through the automatic escalation of three loading modes, whereby the SAP attempts to load the file in one mode, and if that mode fails it attempts to load the file in another mode, and if that mode fails it attempts to load the file in yet another mode.
In the first mode (normal load mode), the SAP opens files in a manner known to those skilled in the art. In this mode, the SAP conducts only a few checks while opening a file, namely those that do not noticeably affect performance during file loading. Because so few checks are performed, undetected file corruption can cause the SAP to fail to open the file or to crash. In other cases, normal load can open the file, but the user will see an error message or will be unable to use one or more features.
In the second mode (safe load mode), the SAP attempts to repair the corrupt file. While opening the file, the SAP performs numerous checks on the file. When corruption is detected that the SAP is able to repair, the file is altered. For example, the SAP may remove charts or PivotTable reports, rename sheets, or reset internal variables. In general, the SAP removes the parts of the file that are corrupt and then opens the remaining parts of the file that are intact.
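By way of illustration only, the safe load behavior described above might be sketched as follows. The sketch assumes a simplified, hypothetical model in which a file is a list of parts; the names Part, is_intact, and safe_load, the "no payload means corrupt" check, and the Recovered_Sheet naming scheme are all assumptions made for the example and are not taken from the SAP.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """Hypothetical stand-in for one part of a workbook file."""
    name: str
    kind: str        # e.g. "sheet", "chart", "pivot_table"
    payload: object  # None models a part whose contents cannot be read


def is_intact(part):
    # Assumed check for this sketch: a part with no payload is corrupt.
    return part.payload is not None


def safe_load(parts):
    """Keep intact parts, drop corrupt ones, and rename unnamed sheets.

    Returns the surviving parts and a log of the repairs performed.
    """
    kept, repairs = [], []
    for index, part in enumerate(parts):
        if not is_intact(part):
            # Corrupt parts are removed rather than aborting the load.
            repairs.append(f"removed {part.kind} '{part.name}'")
            continue
        if part.kind == "sheet" and not part.name.strip():
            # A sheet with a blank name is repaired by renaming it.
            new_name = f"Recovered_Sheet{index + 1}"
            repairs.append(f"renamed sheet to '{new_name}'")
            part = Part(new_name, part.kind, part.payload)
        kept.append(part)
    return kept, repairs
```

Returning a repair log alongside the surviving parts is one possible design choice; it would allow the user to be told which parts of the file were altered or removed.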
In the third mode (recovery mode), the SAP attempts to read the table of cell values and formulas, but does not attempt to keep any other parts of the file. For example, the SAP may not recover formatting, charts, VBA code, or embedded objects. The data recovery mode can succeed, and is of great benefit to users, in cases where the file is too corrupt to be repaired or opened normally.
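By way of illustration only, the extraction performed in the recovery mode might be sketched as follows. The sketch assumes a hypothetical model in which the file has already been split into records, each a dictionary with "type", "row", "col", and "content" keys; this record layout and the name recover_cells are assumptions made for the example and do not describe the SAP file format.

```python
def recover_cells(records):
    """Collect cell values and formulas, ignoring everything else.

    Records of any other type (formatting, charts, and so on) are
    discarded, and records that cannot be read are skipped so that
    one damaged record does not prevent recovery of the rest.
    """
    cells = {}
    for record in records:
        try:
            if record["type"] not in ("value", "formula"):
                continue  # not part of the table of values and formulas
            cells[(record["row"], record["col"])] = record["content"]
        except (KeyError, TypeError):
            continue  # damaged or incomplete record: skip it
    return cells
```

For example, given a value record, a formatting record, a formula record, and two damaged records, only the value and the formula are returned, keyed by their (row, column) position.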
The data recovery mode is useful because SAP files are often used to store large amounts of raw data. In some cases, data may be difficult or impossible to recreate, whereas the charts and formatting are generally easier to recreate if the underlying data is recovered. Therefore, extracting data and formulas from a corrupt file is a benefit to the user and can prevent a complete loss of the contents of a badly damaged file.
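By way of illustration only, the automatic escalation from one loading mode to the next might be sketched as follows. The sketch assumes that each mode is supplied as a callable that either returns the loaded content or raises an error; the names LoadError and load_with_escalation are hypothetical and are not taken from the SAP.

```python
class LoadError(Exception):
    """Hypothetical error raised by a loader that cannot open the file."""


def load_with_escalation(path, loaders):
    """Try each (mode_name, loader) pair in order until one succeeds.

    Returns the name of the mode that succeeded, the loaded content,
    and a list of (mode_name, reason) entries for the modes that failed.
    Raises LoadError if every mode fails.
    """
    failures = []
    for mode, loader in loaders:
        try:
            return mode, loader(path), failures
        except LoadError as exc:
            # Record the failure and escalate to the next mode.
            failures.append((mode, str(exc)))
    raise LoadError(f"all load modes failed for {path}: {failures}")
```

In such a sketch, the loaders would be passed in the order normal load, safe load, recovery, so that the least invasive mode is always attempted first and data-only extraction is the last resort.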
These and other features, advantages, and aspects of the present invention may be more clearly understood and appreciated from a review of the following detailed description of the disclosed embodiments and by reference to the appended drawings and claims.